janiussyafiq opened a new issue, #13352: URL: https://github.com/apache/apisix/issues/13352
### Context `apisix/plugins/ai-aliyun-content-moderation.lua`, `apisix/plugins/ai-aws-content-moderation.lua`, and the upcoming `ai-lakera-guard` plugin ([api7/rfcs#32](https://github.com/api7/rfcs/pull/32)) all use `protocols.get(ctx.ai_client_protocol).extract_request_content(body)` to assemble the text passed to their vendor moderation API. Each protocol's `extract_request_content` (`openai-chat`, `openai-responses`, `anthropic-messages`, `bedrock-converse`) currently walks **only `body.messages[].content`**. Two coverage gaps are inherited by every moderation plugin that uses it. ### Gap 1 — Anthropic top-level `system:` Anthropic places system prompts outside `messages[]` (top-level `body.system` string). `anthropic-messages.lua`'s `extract_request_content` (lines 177-198) does not surface it. The same module's `get_messages` (lines 203-228) **does** include it, so there's a clear precedent for treating `system` as scannable content — `extract_request_content` just doesn't. The textbook prompt-injection attack — *"ignore your previous instructions and..."* — most commonly targets the system role. Anthropic-format requests bypass scanning entirely for this attack class. OpenAI Chat / Responses and Bedrock Converse put system content *inside* `messages[]` and are unaffected. ### Gap 2 — `body.tools[]` definitions Function-call schemas — `tools[].function.description`, `tools[].function.parameters`, the equivalents in Anthropic and Bedrock — are not scanned by any current `extract_request_content` implementation. A maliciously-crafted tool description (instructions to exfiltrate, jailbreak via tool name, etc.) bypasses scanning across all four protocols. ### Proposed fix Extend `extract_request_content` in each protocol module: - `anthropic-messages.lua`: include `body.system` string content; walk `body.tools[]` (`name`, `description`, `input_schema` text content). - `openai-chat.lua` and `openai-responses.lua`: walk `body.tools[]` (`function.name`, `function.description`, `function.parameters` schema text content). - `bedrock-converse.lua`: include `body.system` block content; walk `body.toolConfig.tools[]` (`toolSpec.name`, `toolSpec.description`, `toolSpec.inputSchema` text content). ### Side effects This is a behavior change. Existing `ai-aliyun-content-moderation` and `ai-aws-content-moderation` users may see traffic that previously passed now flag, because more content is being scanned. Test plans for both plugins should add coverage for the new fields. Worth flagging in release notes. ### Discovered While drafting [api7/rfcs#32](https://github.com/api7/rfcs/pull/32) §3.2 (`ai-lakera-guard` known limitations). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
