hanzhenfang commented on issue #13544:
URL: https://github.com/apache/apisix/issues/13544#issuecomment-4704083321
Hi @Baoyuantop, I was able to reproduce this with a local mock upstream, without calling the real DashScope API.
The trigger is that `ai-proxy` forwards the client's `Accept-Encoding: gzip` header to the OpenAI-compatible upstream. `openai-base.lua` then reads the compressed response body and calls `core.json.decode(raw_res_body)` without decompressing it first.
Simplified flow:
```text
Upstream returns a compressed response
↓
APISIX does not decode it first
↓
APISIX directly parses it as JSON
↓
JSON parsing fails
```
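The flow above can be reproduced outside APISIX. The following is a minimal Python sketch, not APISIX code: the payload is a made-up stand-in for the mock upstream response, and `json.loads` stands in for `core.json.decode`.

```python
import gzip
import json

# Hypothetical stand-in for the mock upstream's JSON body.
payload = {"usage": {"prompt_tokens": 3, "completion_tokens": 6, "total_tokens": 9}}
plain_body = json.dumps(payload).encode("utf-8")

# What the upstream sends back when the request carries Accept-Encoding: gzip.
gzip_body = gzip.compress(plain_body)


def parse_without_decoding(raw_body: bytes):
    """Parse the raw body as JSON without looking at Content-Encoding."""
    try:
        return json.loads(raw_body), None
    except ValueError as err:
        # JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
        return None, str(err)


parsed_plain, err_plain = parse_without_decoding(plain_body)
parsed_gzip, err_gzip = parse_without_decoding(gzip_body)

print("plain:", parsed_plain, err_plain)
print("gzip: ", parsed_gzip, err_gzip)
```

The plain body parses, while the compressed body fails at the first byte, which mirrors the "invalid token at character 1" error in the APISIX log.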
## Environment
```text
Image: apache/apisix:3.16.0-debian
APISIX version: 3.16.0
Mode: standalone / data_plane / yaml
Plugin: ai-proxy
Provider: openai-compatible
```
## Minimal config
`apisix.yaml`:
```yaml
routes:
  - id: 1
    uri: /qwen/v1/chat/completions
    methods:
      - POST
    plugins:
      ai-proxy:
        provider: openai-compatible
        timeout: 3000
        keepalive: false
        ssl_verify: false
        auth:
          header:
            Authorization: Bearer mock-qwen-key
        options:
          model: qwen-plus
        override:
          endpoint: http://qwen-gzip-mock:1980/compatible-mode/v1/chat/completions
#END
```
The mock upstream returns a valid OpenAI-compatible JSON body. With
`Accept-Encoding: gzip`, Nginx compresses that JSON response.
## Steps
```bash
docker compose up -d --force-recreate --remove-orphans
docker compose exec -T apisix apisix version
curl -sS -D /tmp/issue-13544-headers -o /tmp/issue-13544-body \
  -w 'http=%{http_code} bytes=%{size_download}\n' \
  http://127.0.0.1:19544/qwen/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Accept-Encoding: gzip' \
  -d '{"messages":[{"role":"user","content":"ping"}]}'
file /tmp/issue-13544-body
xxd -l 16 /tmp/issue-13544-body
docker compose logs --no-log-prefix apisix \
  | grep 'invalid response body from ai service'
```
## Actual result
The route returns HTTP 200, but the response body is gzip bytes:
```text
http=200 bytes=217
/tmp/issue-13544-body: gzip compressed data, max speed, from Unix, original size modulo 2^32 296
00000000: 1f8b 0800 0000 0000 0403 ...
```
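The `file` and `xxd` output can be cross-checked by hand. Below is a small Python sketch that decodes the fixed 10-byte gzip header defined in RFC 1952, using the ten bytes shown in the hex dump; the function names are illustrative only. The "original size modulo 2^32" value comes from the last four bytes of the stream, which are elided in the dump, so the second helper is shown without being applied to them.

```python
import struct


def describe_gzip_header(body: bytes) -> dict:
    """Decode the fixed 10-byte gzip header (RFC 1952, section 2.3)."""
    magic, method, flags, mtime, xfl, os_id = struct.unpack("<2sBBIBB", body[:10])
    return {
        "is_gzip": magic == b"\x1f\x8b",  # gzip magic number
        "method": method,                 # 8 means deflate
        "flags": flags,
        "mtime": mtime,
        "xfl": xfl,                       # 4 means fastest compression
        "os": os_id,                      # 3 means Unix
    }


def original_size_mod_2_32(body: bytes) -> int:
    """Read ISIZE, the little-endian uncompressed size in the last 4 bytes."""
    return struct.unpack("<I", body[-4:])[0]


# The ten header bytes from the xxd output above.
header = bytes.fromhex("1f8b 0800 0000 0000 0403")
info = describe_gzip_header(header)
print(info)
```

The decoded `xfl` and `os` fields correspond to the "max speed" and "from Unix" labels printed by `file`.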
APISIX logs:
```text
openai-base.lua:362: sending request to LLM server: ...
"accept-encoding":"gzip" ...
openai-base.lua:165: read_response(): invalid response body from ai service: <gzip bytes> err: Expected value but found invalid token at character 1, it will cause token usage not available
```
## Control
The same mock response without gzip is parsed correctly:
```bash
curl -sS -D /tmp/issue-13544-plain-headers -o /tmp/issue-13544-plain-body \
  -w 'http=%{http_code} bytes=%{size_download}\n' \
  http://127.0.0.1:19544/qwen/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"ping"}]}'
```
Observed:
```text
http=200 bytes=296
/tmp/issue-13544-plain-body: JSON data
openai-base.lua:187: read_response(): got token usage from ai service: {"total_tokens":9,"prompt_tokens":3,"completion_tokens":6}
```
So the JSON body itself is valid. The failure is caused specifically by parsing the body as JSON before decoding it according to `Content-Encoding`.
The fix direction proposed in the issue looks correct: decode `raw_res_body` with `apisix.utils.content-decode` before calling `core.json.decode(raw_res_body)` and before any response filter that expects a decoded JSON body.
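
The ordering that the fix needs can be sketched in Python. This is an illustration of the logic only, not the APISIX patch: `decode_body` and `read_response` are hypothetical names, `json.loads` stands in for `core.json.decode`, and only `gzip` and `identity` are handled.

```python
import gzip
import json
from typing import Optional


def decode_body(raw_body: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo the Content-Encoding of a response body before it is parsed."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "identity":
        return raw_body
    if encoding == "gzip":
        return gzip.decompress(raw_body)
    # Assumption for this sketch: any other encoding is reported, not guessed.
    raise ValueError(f"unsupported Content-Encoding: {encoding}")


def read_response(raw_body: bytes, headers: dict) -> dict:
    """Decode first, then parse JSON, so token usage can be extracted."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    body = decode_body(raw_body, lowered.get("content-encoding"))
    return json.loads(body)


# Made-up payload carrying the token usage seen in the control run.
payload = {"usage": {"total_tokens": 9, "prompt_tokens": 3, "completion_tokens": 6}}
plain = json.dumps(payload).encode("utf-8")

from_gzip = read_response(gzip.compress(plain), {"Content-Encoding": "gzip"})
from_plain = read_response(plain, {})
print(from_gzip["usage"], from_plain["usage"])
```

With the decode step in front of the parse, the compressed and uncompressed responses yield the same usage data.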