OWASP LLM Top 10 (2025)
The OWASP 2025 Top 10 Risk & Mitigations for LLMs and Gen AI Apps covers the most pressing security risks inherent in LLM apps. The guide below shows how insecure application code gives rise to these vulnerabilities. For each it provides examples, mitigations, and scans that can detect potential issues.
Prompt injection occurs when the LLM input causes behavior not intended by the application designer. For example an application specific chatbot being made to act out of character, or to disclose hidden information. LLMs do not distinguish from data and instructions, and have a shared channel (text input) for both. Either user prompting or instructions concealed in data such as documents passed to the LLM may inject instructions that fool the LLM into deviating from its intended behavior. Prompt injection can be a vector for other kinds of attacks, such as sensitive information disclosure or command execution.
Models may be induced into disclosing sensitive information including confidential data, personal information, or architectural information about the model itself. This could also include information about the training data, or copyright violating outputs.
This vulnerability covers security issues with third party tools including models and code. It can encompass models that have compromised performance as well as malicious.
Either training data or input data is compromised to inject incorrect or harmful information into the system's behavior.
Per OWASP: "Improper Output Handling refers specifically to insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems." Downstream components can be things like output formatting, tools such as sql queries or shell commands, code execution, etc.
OWASP distinguishes excessive agency from improper output handling, stating that agency is concerned with the range of capabilities an LLM application is granted, while output handling is about how the output is passed to those capabilities. There is definitely redundancy between these two categories.
The system prompt is the over-arching, typically hidden set of instructions passed to the LLM. Accessing this information could help an attacker bypass controls by specifically countering it. Or if the prompt contains sensitive information, revealing this would be in itself compromising.
This category covers any security weaknesses introduced by the handling of embeddings that are often used in LLM apps that include as semantics search component, such as RAG. Weaknesses can include access permissions, data leakage, including embedding inversion, and the use of embeddings as a vector for other attacks such as prompt injection.
Incorrect LLM outputs can be a security vulnerability. OWASP includes factual inaccuracies, baseless claims, and misrepresentation of expertise as potential issues. These could be unintentional or malicious. Another issue is insecure code generation, including "package hallucination" where LLMs recommend installing made-up dependencies that can then be "squatted" by a bad actor resulting in the installation of malicious code.
This category includes susceptibility to denial of service attacks, for example by overloading an application with repeated requests. It also covers model exfiltration attacks that use a large number of systematic requests to recover information about a model.
Ready to Secure Your AI Applications?
Start with your first scan!