nic-6443 opened a new pull request, #13254:
URL: https://github.com/apache/apisix/pull/13254

   ## Summary
   
   When a downstream client disconnects mid-stream (browser tab closed, Ctrl+C, 
request cancelled), the proxy keeps reading all remaining chunks from the 
LLM and keeps performing SSE parsing, token counting, and protocol conversion, 
burning CPU and LLM API quota for no benefit.
   
   ## Root Cause
   
   `lua_response_filter` used `ngx.flush()` (async, no wait), which never 
surfaces client disconnection errors. There was no mechanism to detect a dead 
downstream in the streaming loop.
   
   ## Fix
   
   Add an optional `wait` parameter to `lua_response_filter`. When `wait=true`, 
it calls `ngx.flush(true)` (a synchronous flush), which returns an error if the 
client connection is gone. The streaming path in `parse_streaming_response` now 
passes `wait=true` and, on flush failure, immediately closes the upstream 
connection and exits the read loop.
   
   The `wait` parameter defaults to `false` so all existing callers are 
unaffected.
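
The control flow described above can be modelled in a few lines. This is a language-neutral sketch written in Python, not the Lua code from this PR: the `Downstream` and `Upstream` classes, `response_filter`, and `stream` are hypothetical stand-ins, and the behaviour of the non-waiting flush follows the description in the Root Cause section rather than the nginx source.

```python
from typing import Iterable, Optional, Tuple


class Downstream:
    """Hypothetical client connection that goes away after `alive_for` flushes."""

    def __init__(self, alive_for: int) -> None:
        self.alive_for = alive_for
        self.delivered = 0

    def flush(self, wait: bool = False) -> Tuple[bool, Optional[str]]:
        # Assumption taken from the PR text: a non-waiting flush does not
        # report a disconnected client, so it always looks successful here.
        if not wait:
            return True, None
        # A waiting flush reports the failure once the client is gone.
        if self.delivered >= self.alive_for:
            return False, "client aborted"
        self.delivered += 1
        return True, None


class Upstream:
    """Hypothetical LLM stream that counts how many chunks were read from it."""

    def __init__(self, chunks: Iterable[str]) -> None:
        self._chunks = iter(chunks)
        self.read_count = 0
        self.closed = False

    def read(self) -> Optional[str]:
        if self.closed:
            return None
        chunk = next(self._chunks, None)
        if chunk is not None:
            self.read_count += 1
        return chunk

    def close(self) -> None:
        self.closed = True


def response_filter(client: Downstream, body: str, wait: bool = False):
    """Stand-in for the filter: returns (ok, err); `wait` defaults to False."""
    return client.flush(wait)


def stream(upstream: Upstream, client: Downstream, wait: bool) -> str:
    """Forward chunks until EOF, or until a waiting flush reports a dead client."""
    while True:
        chunk = upstream.read()
        if chunk is None:
            return "eof"
        ok, err = response_filter(client, chunk, wait)
        if not ok:
            upstream.close()  # stop consuming upstream output immediately
            return err


if __name__ == "__main__":
    for wait in (False, True):
        up = Upstream(f"data: {i}\n\n" for i in range(2000))
        result = stream(up, Downstream(alive_for=3), wait)
        print(wait, result, up.read_count)
```

In this model, a client that disappears after three flushes causes all 2000 chunks to be read when `wait` is false, and only four when `wait` is true: the fourth flush fails and the loop closes the upstream.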
   
   ## Changes
   
   - `apisix/plugin.lua`: add optional `wait` param to `lua_response_filter`; 
return `(ok, err)` for disconnect detection
   - `apisix/plugins/ai-providers/base.lua`: use `wait=true` in 
`parse_streaming_response` output loop; on flush failure close upstream and 
return
   - `t/plugin/ai-proxy-client-disconnect.t`: integration test verifying 
upstream is aborted after client disconnect
   
   ## Test
   
   The new test sets up a slow SSE mock (30ms/chunk, up to 2000 chunks) and a 
client that reads 3 chunks and then closes the connection. It verifies via a 
shared dict that the upstream served well under 50 chunks total (it stopped 
shortly after the disconnect).

