Involving Security at Every Stage of Development


As large language models (LLMs) become increasingly prevalent in businesses and applications, the need for robust security measures has never been greater. An LLM, if not properly secured, can pose significant risks in terms of data breaches, model manipulation, and even regulatory compliance issues. This is where engaging an external security company becomes crucial.

In this blog, we will explore the key considerations for companies looking to hire a security team to assess and secure their LLM-powered systems, as well as the specific tasks that should be undertaken at different stages of the LLM development lifecycle.

Stage 0: Hosting Model (Physical vs Cloud)

The choice of hosting model, whether physical or cloud-based, can have significant implications for the security of a large language model (LLM). Each approach comes with its own set of security considerations that must be carefully evaluated.

Physical Security Concerns:

  • Physical & Data Center Security – Hosting an LLM model on physical infrastructure introduces risks such as unauthorized facility access, theft, or disruption of critical systems. These threats can stem not only from technical vulnerabilities but also from human factors, like social engineering or tailgating. To mitigate these risks, a physical penetration test should be conducted. This includes simulating real-world intrusion attempts—such as impersonation or bypassing staff—as well as evaluating the resilience of physical barriers, surveillance systems, access controls, and environmental safeguards (e.g., power and fire protection).
  • Network Security – On-prem LLM deployments are particularly vulnerable to external cyberattacks and insider threats if the underlying network architecture is not well-secured. Misconfigured firewalls, exposed services, or overly permissive access controls can open the door to exploitation. To proactively address these risks, a thorough network architecture review and security assessment should be conducted. This includes evaluating segmentation, firewall rules, VPN access, intrusion detection capabilities, and access controls to ensure the LLM environment is protected from both external and internal compromise.
  • Supply Chain Risks – LLMs hosted on-prem rely heavily on hardware components, which can introduce risk if sourced from untrusted or unverified vendors. Compromised chips, malicious firmware, or tampered devices could undermine the integrity of the entire system. To reduce this risk, a security team should perform a comprehensive supply chain review—assessing procurement practices, verifying vendor trustworthiness, physically inspecting hardware components, and validating firmware/software integrity—to ensure the infrastructure is free from embedded threats and adheres to secure sourcing principles.

Cloud-Based Hosting Security Risks:

  • Cloud Misconfigurations – When an LLM is hosted in the cloud, the security posture depends heavily on how cloud services are configured. Unlike physical environments where organizations have full control over hardware and network layers, cloud environments introduce abstraction—and with that, a different set of risks. Misconfigured resources such as open S3 buckets, overly permissive IAM roles, or unprotected API endpoints can expose sensitive training data, model artifacts, or credentials. To address this, security teams should conduct a comprehensive cloud configuration review. This involves auditing storage permissions, identity and access management policies, network security settings, and exposed endpoints to identify and remediate misconfigurations that could lead to unauthorized access or data leakage. A minimal configuration-audit sketch follows this list.
  • Shared Responsibility Misalignment – One of the most important, and often overlooked, aspects of cloud security is understanding the shared responsibility model. In contrast to physical hosting, where the organization owns the full security stack, cloud providers secure the infrastructure while customers are responsible for protecting their data, applications, and configurations. Misunderstanding this division can result in unprotected assets or blind spots in security coverage. Security teams can help by clearly mapping out responsibilities, developing the right set of security controls, and implementing processes that ensure all components of the LLM environment, especially customer-managed ones, are properly secured.
  • Insider Threats – Although rare, there is a risk that cloud provider employees could misuse privileged access. To address this, organizations should enforce strong encryption and role-based access controls and implement detailed monitoring and audit logging. Security teams can assist by validating that sensitive data is encrypted at rest and in transit, and that any access from the provider’s side is transparent and logged.
  • Regulatory Compliance – Hosting LLMs in the cloud may introduce additional regulatory requirements based on industry and geography (e.g., GDPR or HIPAA). Security teams should assess regulatory obligations and design a compliance framework that includes data residency controls, encryption policies, access auditing, and breach response procedures. This ensures the LLM environment aligns with legal standards and industry best practices.
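
To make the kind of cloud configuration review described above more concrete, below is a minimal sketch in Python (using boto3) that flags S3 buckets that may be publicly readable. It assumes AWS credentials are already configured, the "finding" criteria are deliberately simplified, and a real review would also cover IAM policies, network settings, and exposed endpoints.

```python
"""Minimal sketch of one cloud configuration check: flag S3 buckets that may be
publicly readable. Assumes AWS credentials are configured; a full review would
also cover IAM policies, network rules, and exposed endpoints."""

import boto3
from botocore.exceptions import ClientError

PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}

def bucket_looks_public(s3, name: str) -> bool:
    # If the public access block is fully enabled, the bucket cannot be public.
    try:
        cfg = s3.get_public_access_block(Bucket=name)["PublicAccessBlockConfiguration"]
        if all(cfg.values()):
            return False
    except ClientError:
        pass  # no block configured (or not readable); fall through to the ACL check

    # Flag any ACL grant to the public or authenticated-users groups.
    acl = s3.get_bucket_acl(Bucket=name)
    return any(
        grant.get("Grantee", {}).get("URI") in PUBLIC_GROUPS
        for grant in acl["Grants"]
    )

if __name__ == "__main__":
    s3 = boto3.client("s3")
    for bucket in s3.list_buckets()["Buckets"]:
        if bucket_looks_public(s3, bucket["Name"]):
            print(f"[finding] bucket may be publicly accessible: {bucket['Name']}")
```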

When evaluating the hosting model for an LLM, organizations should carefully assess the security risks and controls associated with both physical and cloud-based approaches. A thorough understanding of the security implications, as well as the organization’s specific requirements and constraints, is crucial in determining the most appropriate hosting solution. The following stages apply whether a physical or cloud hosting model is used.

Stage 1: Data Curation

The quality of a large language model is directly tied to the quality of the data on which it is trained. While data curation is typically a data science and engineering effort, security risks can still emerge. From a security standpoint, this is a critical point at which to review the trustworthiness of data sources and implement controls that safeguard the model against poisoning, privacy violations, and biased outputs.

  • Data Poisoning Risks – Training on unverified or adversarial data can introduce hidden vulnerabilities, biases, or even malicious behavior into an LLM. This is especially concerning when using public internet sources or community-contributed datasets. Security teams should conduct an architecture review and/or threat model to assess data ingestion workflows. These reviews help identify weak points where malicious actors could insert poisoned data and ensure the organization is applying controls like data validation, domain whitelisting, and monitoring for anomalous content.
  • Privacy and Compliance Violations – Public datasets often include user-generated content that may contain personally identifiable information (PII) or other sensitive data, which could lead to regulatory non-compliance or privacy breaches. Security professionals should work closely with data teams to review the pipeline’s privacy redaction and filtering mechanisms. This involves verifying that rule-based and classifier-based tools are in place to detect and remove PII and that these processes are regularly audited for effectiveness. A minimal redaction sketch follows this list.
  • Embedding-Level Poisoning – Even after standard cleaning, adversarial inputs can persist through embedding layers—introducing subtle shifts in how the model interprets certain content. While hard to detect, this emerging risk can be mitigated through research-driven adversarial evaluation or coordination with red teams focused on LLM embedding behavior. Security should stay engaged in discussions about tokenizer design and embedding robustness for high-risk models.
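
As a concrete illustration of the rule-based PII filtering mentioned above, here is a minimal redaction sketch. The patterns and placeholder labels are illustrative only; production pipelines typically pair rules like these with classifier-based detectors and regularly audit what slips through.

```python
"""Minimal sketch of a rule-based PII redaction pass over training text.
The patterns below are illustrative, not exhaustive."""

import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"),
}

def redact(text: str) -> tuple[str, dict[str, int]]:
    """Replace matches with a typed placeholder and count what was removed."""
    counts = {}
    for label, pattern in PII_PATTERNS.items():
        text, n = pattern.subn(f"[REDACTED_{label}]", text)
        counts[label] = n
    return text, counts

if __name__ == "__main__":
    sample = "Contact Jane at jane.doe@example.com or 555-123-4567."
    cleaned, counts = redact(sample)
    print(cleaned)  # Contact Jane at [REDACTED_EMAIL] or [REDACTED_PHONE].
    print(counts)   # {'EMAIL': 1, 'SSN': 0, 'PHONE': 1}
```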

Stage 2: Model Architecture

At this stage, the organization is selecting and/or designing the neural architecture that will define how the LLM learns and generates output. Most modern LLMs are built on transformer architectures. While these architectural decisions are primarily technical, they carry significant security implications. If left unchecked, vulnerabilities introduced at the model layer, especially in open-source models, can become permanent, persistent threats. Although this stage may not traditionally involve security professionals, incorporating them early on enables preemptive defense against deeply embedded risks.

  • Backdoored Model Components – If an attacker gains access during training or modifies an open-source model, they could introduce malicious behavior directly into the model’s architecture. For example, a modified decoder layer in a transformer-based model might be designed to output specific backdoored content when triggered by certain prompts. To mitigate this, security teams should conduct an architecture review focused on model integrity. This includes validating training provenance, reviewing critical model layers for unauthorized changes, and checking for suspicious behaviors in controlled test prompts.
  • Model Integrity & Trust in Open Source – Many organizations leverage open-source models to speed up development. However, these models can come with trust issues if their training data, architecture, or modification history isn’t transparent. Security teams should implement model vetting and auditing workflows to verify the source, training lineage, and any pre-applied modifications before the model is deployed or further fine-tuned.
  • Security-Enhancing Architecture Choices – While most teams focus on accuracy or efficiency, some architectural choices can enhance security. For instance, integrating anomaly detection modules or designing the model to support adversarial training can improve resilience to future attacks. Security professionals should advise on or recommend security-focused architectural components during early design conversations—especially in high-risk or safety-critical deployments.
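
One practical building block for the vetting and auditing workflow described above is artifact integrity checking: hashing every downloaded model file and comparing it against a manifest pinned when the model was first vetted. The sketch below assumes a simple JSON manifest of file names to SHA-256 digests; the paths and manifest format are hypothetical.

```python
"""Minimal sketch of integrity checking for downloaded model artifacts against a
pinned manifest. File names and the manifest format are hypothetical."""

import hashlib
import json
from pathlib import Path

def sha256(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model_dir(model_dir: Path, manifest_path: Path) -> list[str]:
    """Return findings; an empty list means every artifact matched the manifest."""
    manifest = json.loads(manifest_path.read_text())  # {"model.safetensors": "<sha256 hex>", ...}
    findings = []
    for name, expected in manifest.items():
        artifact = model_dir / name
        if not artifact.exists():
            findings.append(f"missing artifact: {name}")
        elif sha256(artifact) != expected:
            findings.append(f"hash mismatch (possible tampering): {name}")
    # Files on disk that are not in the manifest also deserve review.
    for extra in sorted({p.name for p in model_dir.iterdir()} - set(manifest)):
        findings.append(f"unexpected file not in manifest: {extra}")
    return findings

if __name__ == "__main__":
    for finding in verify_model_dir(Path("models/base-llm"), Path("models/base-llm.manifest.json")):
        print("[finding]", finding)
```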

Stage 3: Training at Scale

As the LLM is trained at scale, there is relatively little for the security team to do, provided thorough architecture reviews and threat modeling were completed earlier. At this stage, they should focus on ensuring the training process itself is secure and stable:

  • Reviewing the security of the training infrastructure – Evaluating the physical, network, and access controls in place to protect the training environment from unauthorized access or disruption.

Stage 4: Evaluation

During the evaluation phase, the LLM’s performance is assessed using benchmark datasets to measure its knowledge, reasoning, and accuracy across diverse tasks. Common benchmarks include HellaSwag (commonsense reasoning), MMLU (domain-specific knowledge), and TruthfulQA (truthfulness and misconception resistance). While these tools are useful for gauging general performance, they also introduce a unique surface area for security risks. At this stage, security professionals are primarily focused on ensuring that evaluation practices are reputable, contamination-free, and not subject to manipulation.

  • Evaluation Contamination – Attackers may attempt to game the evaluation process by inserting benchmark questions (e.g., from HellaSwag or MMLU) into the model’s pre-training corpus. If undetected, this can significantly inflate benchmark scores and give a false sense of model quality. Security teams should work with ML engineers to implement benchmark contamination detection—verifying that no evaluation data was leaked or memorized during training. A rough n-gram overlap sketch follows this list.
  • Trust in Evaluators and Benchmarks – Not all benchmarks are created equal, and some may lack rigor or transparency. Relying on poorly constructed or overly narrow benchmarks can result in misleading performance metrics. Security professionals should help vet the evaluation tools and datasets, ensuring they come from reputable sources, are well-maintained, and align with the organization’s goals and risk tolerance.
  • Assessment Process Integrity – Evaluation scripts, datasets, and scoring logic can be tampered with if not properly secured. Security teams should review the evaluation infrastructure, ensuring that access is restricted, version control is in place, and results are reproducible and auditable to maintain trust in the outcome.
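
As a rough illustration of the contamination check described above, the sketch below flags training documents that share long word n-grams with benchmark items. The 13-gram window and toy inputs are placeholders; serious contamination studies run over the full corpus with more robust normalization and matching.

```python
"""Minimal sketch of benchmark contamination detection via word n-gram overlap.
The window size and inputs are illustrative."""

import re

def ngrams(text: str, n: int = 13) -> set[tuple[str, ...]]:
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def contaminated_docs(train_docs: list[str], benchmark_items: list[str], n: int = 13) -> list[int]:
    """Return indices of training documents that overlap any benchmark item."""
    bench_grams: set[tuple[str, ...]] = set()
    for item in benchmark_items:
        bench_grams |= ngrams(item, n)
    return [i for i, doc in enumerate(train_docs) if ngrams(doc, n) & bench_grams]

# Usage: contaminated_docs(corpus_chunks, benchmark_questions) -> indices to investigate.
```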

Stage 5: Post-Improvements

As the LLM matures beyond its base training, teams often enhance it with fine-tuning, Retrieval-Augmented Generation (RAG), and prompt engineering to improve performance, introduce domain expertise, or integrate real-time knowledge. While these improvements unlock powerful capabilities, they also create new security concerns—particularly when sensitive data or policy enforcement is involved. At this stage, the security team’s role is to evaluate how these techniques are implemented and ensure that access control, response handling, and ethical safeguards are not compromised in the process.

  • Fine-Tuning Risks – Fine-tuning a model on sensitive or proprietary data without proper guardrails can unintentionally expose confidential information to end users. For example, fine-tuning a model on internal government reports or customer records without chat-level access control may result in unauthorized disclosures. Security teams should review fine-tuning datasets and ensure access-controlled boundaries are enforced on model outputs, especially when models are deployed in multi-user environments.
  • RAG Integration Misconfigurations – RAG extends an LLM’s knowledge by retrieving context from a custom knowledge base, but if access controls are not implemented at the retrieval layer, sensitive data can be leaked. For example, one user might be able to access another user’s financial data if the RAG system retrieves documents indiscriminately. Security teams should assess the RAG implementation for access control enforcement, document scoping, and auditability of retrieved content. A retrieval-layer access control sketch follows this list.
  • RAI (Responsible AI) Findings Post-Fine-Tuning – While fine-tuning can strengthen model safety, it can also make it harder to detect unwanted behaviors or policy violations, especially if the model has learned to subtly avoid detection. Security professionals should collaborate with RAI teams to test for adversarial prompt bypasses and edge-case safety violations, even on models that have been fine-tuned for alignment.
  • Prompt Engineering Surface Area – Poorly designed prompts or system instructions may lead to unintended model behavior or create prompt injection vulnerabilities, especially when models are integrated into applications. While prompt engineering is generally seen as a low-risk technique, security teams should review system prompts and application-level prompt construction to ensure they are resilient against adversarial inputs and do not override critical safety behaviors.
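
The sketch below illustrates access control at the RAG retrieval layer: every stored chunk carries a label of the tenants allowed to see it, and retrieval filters on the requesting user before anything reaches the prompt. The in-memory store and field names are hypothetical stand-ins for a real vector database.

```python
"""Minimal sketch of access control enforced at the RAG retrieval layer.
The in-memory store and field names stand in for a real vector database."""

from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    allowed_tenants: set[str]  # who may ever see this content
    score: float = 0.0         # similarity score assigned by the retriever

@dataclass
class SecureRetriever:
    chunks: list[Chunk] = field(default_factory=list)

    def retrieve(self, query: str, tenant_id: str, k: int = 3) -> list[Chunk]:
        # Authorization happens at retrieval time, never after generation.
        visible = [c for c in self.chunks if tenant_id in c.allowed_tenants]
        # Placeholder ranking; a real system would rank by similarity to `query`.
        return sorted(visible, key=lambda c: c.score, reverse=True)[:k]

if __name__ == "__main__":
    store = SecureRetriever([
        Chunk("Tenant A quarterly financials...", {"tenant-a"}, score=0.92),
        Chunk("Public product FAQ...", {"tenant-a", "tenant-b"}, score=0.81),
    ])
    # tenant-b never sees tenant-a's financials, however relevant they score.
    for chunk in store.retrieve("quarterly results", tenant_id="tenant-b"):
        print(chunk.text)
```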

Stage 6: API Integration

At this point in the lifecycle, the LLM transitions from a passive chatbot to an active system component that can interact with real-world applications and services via APIs. Whether it’s resetting passwords, retrieving internal data, or processing user requests, API integration introduces significant new power—and corresponding risk. From a security perspective, this is the stage where traditional API security principles converge with LLM-specific concerns like prompt injection, indirect execution, and intent manipulation. It becomes one of the most critical stages for offensive testing, especially as LLM-driven applications begin making decisions and acting on behalf of users.

  • Authentication and Authorization Issues – As LLMs begin triggering API calls, it’s critical to verify that only authorized users can perform sensitive actions. Security teams should ensure strong authentication (e.g., OAuth, API keys) and scoped permissions are in place so the model cannot execute privileged actions on behalf of unauthorized users.
  • Prompt Injection and API Misuse – LLMs interpret natural language and decide which API to call based on user input. Attackers may try to manipulate prompts to trigger unintended or harmful behavior. Security teams should test for prompt injection vulnerabilities, ensuring the model cannot be tricked into calling the wrong API or bypassing safety checks.
  • Lack of Input Validation – If user inputs passed to APIs are not properly validated, they can lead to errors, data leaks, or even injection attacks. Security teams should confirm that middleware and API layers validate and sanitize all input, regardless of whether it came from a user or the LLM. A sketch combining input validation with endpoint allowlisting follows this list.
  • Over-Exposed Functionality – APIs may expose more functionality than the LLM needs, increasing the risk of misuse. Security should enforce the principle of least privilege, giving the model access only to the endpoints it truly needs.
  • Relevant OWASP LLM Risks – Many risks in this phase align with the OWASP Top 10 for LLM Applications. Security experts should refer to that list as a guide for testing.
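
To tie the input-validation and least-privilege points above together, the sketch below gates every model-proposed API call through an endpoint allowlist, a per-user scope check, and a per-endpoint argument validator before anything executes. The endpoint names and schemas are hypothetical.

```python
"""Minimal sketch of a gate between the LLM and downstream APIs: calls proposed
by the model run only if they pass an allowlist, a scope check, and argument
validation. Endpoint names and schemas are hypothetical."""

import re

# Only the endpoints the model genuinely needs, each with its own argument validators.
ALLOWED_ENDPOINTS = {
    "reset_password": {
        "user_email": lambda v: isinstance(v, str) and re.fullmatch(r"[\w.+-]+@[\w.-]+", v),
    },
    "get_order_status": {
        "order_id": lambda v: isinstance(v, str) and re.fullmatch(r"ORD-\d{6}", v),
    },
}

def authorize_call(endpoint: str, args: dict, user_scopes: set[str]) -> dict:
    """Raise if a model-proposed call is not allowed; otherwise return the checked args."""
    if endpoint not in ALLOWED_ENDPOINTS:
        raise PermissionError(f"endpoint not allowlisted for the model: {endpoint}")
    if endpoint not in user_scopes:
        raise PermissionError(f"requesting user lacks scope for: {endpoint}")
    validators = ALLOWED_ENDPOINTS[endpoint]
    unknown = set(args) - set(validators)
    if unknown:
        raise ValueError(f"unexpected arguments: {unknown}")
    for name, check in validators.items():
        if name not in args or not check(args[name]):
            raise ValueError(f"invalid or missing argument: {name}")
    return args

# Example: only runs if the gate passes.
# authorize_call("reset_password", {"user_email": "user@example.com"}, user_scopes={"reset_password"})
```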

Conclusion

Securing an LLM is a complex and multifaceted challenge that requires a comprehensive, stage-by-stage approach. By partnering with an experienced security team, companies can navigate the unique security considerations of LLM development and deployment, ensuring their models are robust, reliable, and compliant with relevant regulations and industry standards. By prioritizing security from the outset, organizations can unlock the full potential of LLMs while safeguarding their data, their systems, and their reputation.


