nic-6443 opened a new pull request, #13254: URL: https://github.com/apache/apisix/pull/13254

## Summary

When a downstream client disconnects mid-stream (browser tab closed, Ctrl+C, request cancelled), the proxy continues reading all remaining chunks from the LLM and performing SSE parsing, token counting, and protocol conversion, burning CPU and LLM API quota for no benefit.

## Root Cause

`lua_response_filter` used `ngx.flush()` (async, no wait), which never surfaces client disconnection errors. There was no mechanism to detect a dead downstream in the streaming loop.

## Fix

Add an optional `wait` parameter to `lua_response_filter`. When `wait=true`, it uses `ngx.flush(true)` (synchronous flush), which returns an error if the client connection is gone. The streaming path in `parse_streaming_response` now passes `wait=true` and, on flush failure, immediately closes the upstream connection and exits the read loop.

The `wait` parameter defaults to `false`, so all existing callers are unaffected.

## Changes

- `apisix/plugin.lua`: add optional `wait` param to `lua_response_filter`; return `(ok, err)` for disconnect detection
- `apisix/plugins/ai-providers/base.lua`: use `wait=true` in `parse_streaming_response` output loop; on flush failure close upstream and return
- `t/plugin/ai-proxy-client-disconnect.t`: integration test verifying upstream is aborted after client disconnect

## Test

The new test sets up a slow SSE mock (30ms/chunk, up to 2000 chunks) and a client that reads 3 chunks then closes. It verifies via shared dict that the upstream served well under 50 chunks total (stopped shortly after the disconnect).
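
To illustrate the control flow described above, here is a minimal, language-neutral model in Python. It is not the code from this PR: `Client`, `Upstream`, and `stream` are hypothetical stand-ins for the downstream connection, the LLM response body, and the output loop in `parse_streaming_response`. The only behavior taken from the description is that a non-waiting flush never reports a dead client, while a waiting flush returns `(ok, err)`.

```python
from typing import Iterator, List, Optional, Tuple


class Client:
    """Hypothetical downstream client that disconnects after N chunks."""

    def __init__(self, reads_before_close: int) -> None:
        self.remaining = reads_before_close
        self.received: List[str] = []

    def flush(self, chunk: str, wait: bool) -> Tuple[bool, Optional[str]]:
        if self.remaining > 0:
            self.remaining -= 1
            self.received.append(chunk)
            return True, None
        # Assumption modelled here: an async flush (wait=False) returns
        # before the write outcome is known, so the failure is invisible.
        if not wait:
            return True, None
        return False, "client aborted"


class Upstream:
    """Hypothetical LLM upstream that counts how many chunks it served."""

    def __init__(self, total: int) -> None:
        self.total = total
        self.served = 0
        self.closed = False

    def chunks(self) -> Iterator[str]:
        while self.served < self.total and not self.closed:
            self.served += 1
            yield f"data: chunk-{self.served}\n\n"

    def close(self) -> None:
        self.closed = True


def stream(upstream: Upstream, client: Client, wait: bool) -> Optional[str]:
    """Forward chunks; on flush failure close the upstream and stop."""
    for chunk in upstream.chunks():
        ok, err = client.flush(chunk, wait)
        if not ok:
            upstream.close()
            return err
    return None


if __name__ == "__main__":
    for wait in (False, True):
        up = Upstream(total=2000)
        err = stream(up, Client(reads_before_close=3), wait=wait)
        print(f"wait={wait}: served={up.served}, err={err}")
```

In this model, the non-waiting variant drains all 2000 chunks even though the client only received 3, while the waiting variant stops on the first failed flush, after the fourth chunk has been pulled from the upstream. The real test tolerates a larger margin (well under 50 chunks), presumably because buffering and timing make the exact cut-off point nondeterministic.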
