shreemaan-abhishek opened a new pull request, #13139:
URL: https://github.com/apache/apisix/pull/13139

   ## Problem
   
   `apisix_llm_active_connections` is a Prometheus gauge that tracks in-flight 
LLM requests. The gauge leaks (it is never decremented) whenever a plugin calls 
`ngx.exit()` during request processing. This happens not only in SSE streaming 
but also in **non-streaming responses**.
   
   **Root cause**: When `ai-aliyun-content-moderation` (or any other plugin) 
calls `ngx.exit()` inside a phase handler (e.g. `body_filter`, 
`header_filter`), OpenResty terminates the current coroutine immediately. This 
exit is **not** caught by the `pcall` wrapping the upstream request in 
`ai-proxy/base.lua`. As a result:
   
   1. `exporter.inc_llm_active_connections(ctx)` is called before 
`pcall(do_request)` ✓
   2. A plugin calls `ngx.exit()` — either mid-stream (SSE) or after receiving 
a complete non-streaming response
   3. `exporter.dec_llm_active_connections(ctx)` placed after `pcall` is 
**never reached** ✗
   4. The gauge leaks: it only goes up, never down
   
   This affects both `ai-proxy` and `ai-proxy-multi` in all request types: 
non-streaming chat, SSE streaming, and any other path where a downstream plugin 
exits early.
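   
   The sequence above can be reproduced in a toy model outside OpenResty. 
   This is only a sketch: `gauge` and `fake_exit` are hypothetical stand-ins, 
   and it assumes the early exit behaves like a coroutine that is suspended 
   and never resumed. It needs LuaJIT (as used by OpenResty) or Lua 5.2+, 
   because stock Lua 5.1 cannot yield across `pcall`.
   
   ```lua
   -- Toy model only: `gauge` and `fake_exit` are hypothetical stand-ins,
   -- not APISIX or OpenResty APIs.
   local gauge = 0

   -- Stand-in for an early exit: suspend the handler; nobody resumes it.
   local function fake_exit()
       coroutine.yield("exit")
   end

   local function do_request()
       fake_exit()   -- e.g. a moderation plugin rejecting the response
       return 200
   end

   local function handler()
       gauge = gauge + 1               -- inc before pcall
       local ok = pcall(do_request)    -- a yield is not an error, so pcall
                                       -- does not intercept it
       gauge = gauge - 1               -- dec after pcall: never reached
       return ok
   end

   local co = coroutine.create(handler)
   local resumed, reason = coroutine.resume(co)

   -- The "request" is abandoned here, so the decrement never runs.
   print(coroutine.status(co), gauge)
   ```
   
   In this model the coroutine stays suspended and `gauge` keeps the value 
   it had after the increment, which mirrors steps 1 to 4.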
   
   ## Fix
   
   Remove the `dec` call that follows `pcall` in `ai-proxy/base.lua` and 
rely solely on the **log phase**, which runs even after `ngx.exit()`. 
A new `ctx.llm_active_connections_tracked` flag ensures that only requests 
which were counted are decremented, and that no request is decremented twice:
   
   **`ai-proxy/base.lua`** — increment and set flag, no `dec` after `pcall`:
   ```lua
   exporter.inc_llm_active_connections(ctx)
   ctx.llm_active_connections_tracked = true
   local ok, code_or_err, body = pcall(do_request)
   -- dec is intentionally NOT here — handled in log phase
   ```
   
   **`ai-proxy.lua` and `ai-proxy-multi.lua` log phase**:
   ```lua
   function _M.log(conf, ctx)
       if ctx.llm_active_connections_tracked then
           exporter.dec_llm_active_connections(ctx)
           ctx.llm_active_connections_tracked = false
       end
       -- ...
   end
   ```
   
   The log phase runs regardless of how the request ended 
(normal completion, upstream error, or `ngx.exit()` from any plugin), so every 
increment is matched by exactly one decrement.
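   
   As a rough illustration of why the flag matters, the sketch below models 
   the gauge with a plain Lua number. The `exporter` table, `access`, and 
   `log` functions here are simplified stand-ins written for this example, 
   not the actual APISIX modules.
   
   ```lua
   -- Hypothetical stand-in for the Prometheus gauge and exporter.
   local gauge = 0
   local exporter = {
       inc_llm_active_connections = function(ctx) gauge = gauge + 1 end,
       dec_llm_active_connections = function(ctx) gauge = gauge - 1 end,
   }

   -- Simplified model of the increment site in ai-proxy/base.lua.
   local function access(ctx)
       exporter.inc_llm_active_connections(ctx)
       ctx.llm_active_connections_tracked = true
   end

   -- Simplified model of the log phase handler.
   local function log(ctx)
       if ctx.llm_active_connections_tracked then
           exporter.dec_llm_active_connections(ctx)
           ctx.llm_active_connections_tracked = false
       end
   end

   -- A request that was counted: one inc, one dec.
   local counted = {}
   access(counted)
   log(counted)
   log(counted)   -- a repeated call is a no-op thanks to the flag

   -- A request that never reached the increment: log must not decrement.
   local uncounted = {}
   log(uncounted)

   assert(gauge == 0)
   ```
   
   Without the flag, the second and third `log` calls would each push the 
   modeled gauge below zero.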
   
   ## Tests
   
   Added a regression test in `t/plugin/ai-aliyun-content-moderation.t`:
   
   - Creates a route with `prometheus` + `ai-proxy` + 
`ai-aliyun-content-moderation` (`check_response=true`)
   - Sends a **non-streaming** chat request (LLM mock always returns offensive 
content)
   - Content moderation denies the response via `ngx.exit(400)`
   - Asserts `apisix_llm_active_connections{...} 0` in Prometheus metrics after 
the log phase completes
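   
   The final assertion amounts to finding the gauge line in the Prometheus 
   text output and checking that its value is 0. A minimal sketch of that 
   check in plain Lua follows; the helper name and the `route_id` label in 
   the sample are made up for illustration and do not reflect the real label 
   set or the test file's actual matching logic.
   
   ```lua
   -- Hypothetical helper: collect every value reported for the gauge,
   -- whatever labels the sample line carries.
   local function llm_active_connections(metrics_text)
       local values = {}
       for line in metrics_text:gmatch("[^\n]+") do
           -- %b{} matches the balanced {...} label block.
           local value = line:match(
               "^apisix_llm_active_connections%b{} (%-?%d+)$")
           if value then
               values[#values + 1] = tonumber(value)
           end
       end
       return values
   end

   -- Made-up sample of Prometheus text output.
   local sample = table.concat({
       "# TYPE apisix_llm_active_connections gauge",
       'apisix_llm_active_connections{route_id="1"} 0',
   }, "\n")

   local values = llm_active_connections(sample)
   assert(#values == 1 and values[1] == 0)
   ```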
   
   All existing tests in `t/plugin/prometheus-ai-proxy.t` (40 tests) continue 
to pass.
   
   ### Checklist
   
   - [x] I have explained the need for this PR and the problem it solves
   - [x] I have explained the changes or the new features added to this PR
   - [x] I have added tests corresponding to this change
   - [ ] I have updated the documentation to reflect this change
   - [x] I have verified that this change is backward compatible (If not, 
please discuss on the [APISIX mailing 
list](https://github.com/apache/apisix/tree/master#community) first)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
