Re: [PR] feat(ai-proxy): include AI observability vars in llm_summary [apisix]

via GitHub Thu, 25 Jun 2026 19:00:02 -0700


Copilot commented on code in PR #13609:
URL: https://github.com/apache/apisix/pull/13609#discussion_r3478734626



##########
t/plugin/ai-proxy-kafka-log.t:
##########
@@ -137,6 +137,10 @@ X-AI-Fixture: openai/chat-basic.json
 send data to kafka:
 llm_request
 llm_summary
+tool_count
+cache_read_input_tokens
+cache_creation_input_tokens
+reasoning_tokens
 You are a mathematician
 gpt-35-turbo-instruct
 llm_response_text

Review Comment:
   The test asserts only a subset of the new `llm_summary` keys. Since the
purpose of this PR is for logger plugins to receive the full summary
automatically, the test should also assert that the remaining new keys
(`stream`, `has_tool_calls`, `end_user_id`, `content_risk_level`) are present
in the emitted Kafka log entry, so that future refactors do not accidentally
drop them.
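
As a rough illustration of the check this comment asks for, the sketch below reports which of the expected keys are missing from a single log entry. It is not the Test::Nginx assertion used in the `.t` file. It assumes the Kafka payload is a JSON object with a nested `llm_summary` object, and that every key is emitted even when its value is empty; both are assumptions. The helper name `missing_summary_keys` is hypothetical.

```python
import json

# Keys named in the diff and in this review comment. The first four are
# already asserted by the test; the last four are the ones the comment
# asks to add.
EXPECTED_SUMMARY_KEYS = {
    "tool_count",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
    "reasoning_tokens",
    "stream",
    "has_tool_calls",
    "end_user_id",
    "content_risk_level",
}


def missing_summary_keys(raw_entry: str) -> set:
    """Return the expected llm_summary keys absent from a JSON log entry.

    Assumes (hypothetically) that the entry nests the summary under an
    "llm_summary" object; a missing object counts as all keys missing.
    """
    entry = json.loads(raw_entry)
    summary = entry.get("llm_summary", {})
    return EXPECTED_SUMMARY_KEYS - set(summary)
```

An empty result means the entry carries the full summary. A non-empty result names the keys a refactor has dropped.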



##########
docs/zh/latest/plugins/ai-proxy.md:
##########
@@ -2082,6 +2082,17 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 * `llm_model`：LLM 模型。
 * `llm_prompt_tokens`：提示中的令牌数量。
 * `llm_completion_tokens`：提示中的聊天完成令牌数量。
+* `llm_total_tokens`：使用的总令牌数（提示加完成）。
+* `llm_cache_read_input_tokens`：从缓存读取的输入令牌数量。

Review Comment:
   `llm_completion_tokens` currently reads as “提示中的聊天完成令牌数量” ("number of
chat completion tokens in the prompt"), which suggests that completion tokens
are part of the prompt. Completion tokens are produced in the
completion/response, not in the prompt.



##########
docs/zh/latest/plugins/ai-proxy-multi.md:
##########
@@ -2716,6 +2716,17 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 * `llm_model`：LLM 模型。
 * `llm_prompt_tokens`：提示中的令牌数量。
 * `llm_completion_tokens`：提示中的聊天完成令牌数量。
+* `llm_total_tokens`：使用的总令牌数（提示加完成）。
+* `llm_cache_read_input_tokens`：从缓存读取的输入令牌数量。

Review Comment:
   `llm_completion_tokens` currently reads as “提示中的聊天完成令牌数量” ("number of
chat completion tokens in the prompt"), which suggests that completion tokens
are part of the prompt. Completion tokens are produced in the
completion/response, not in the prompt.



##########
docs/en/latest/plugins/ai-proxy.md:
##########
@@ -2082,6 +2082,17 @@ The following example demonstrates how you can log LLM 
request related informati
 * `llm_model`: LLM model.
 * `llm_prompt_tokens`: Number of tokens in the prompt.
 * `llm_completion_tokens`: Number of chat completion tokens in the prompt.
+* `llm_total_tokens`: Total number of tokens used (prompt plus completion).
+* `llm_cache_read_input_tokens`: Number of input tokens read from cache.

Review Comment:
   `llm_completion_tokens` is described as tokens "in the prompt", which is 
misleading next to `llm_prompt_tokens`. Completion tokens are generated in the 
completion/response, not in the prompt.
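
To make the distinction concrete, here is a minimal sketch of how the three counters relate, following the doc line that describes `llm_total_tokens` as "prompt plus completion". The `TokenUsage` class and the numbers are illustrative assumptions, not part of APISIX.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenUsage:
    """Hypothetical container mirroring the documented log variables."""

    prompt_tokens: int      # tokens sent in the request (llm_prompt_tokens)
    completion_tokens: int  # tokens generated in the response (llm_completion_tokens)

    @property
    def total_tokens(self) -> int:
        # llm_total_tokens: prompt plus completion, per the doc line above.
        return self.prompt_tokens + self.completion_tokens


# Illustrative values only.
usage = TokenUsage(prompt_tokens=23, completion_tokens=8)
print(usage.total_tokens)
```

The two counts are disjoint, so describing completion tokens as being "in the prompt" would double-count them in the total.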



##########
docs/en/latest/plugins/ai-proxy-multi.md:
##########
@@ -2606,6 +2606,17 @@ The following example demonstrates how you can log LLM 
request related informati
 * `llm_model`: LLM model.
 * `llm_prompt_tokens`: Number of tokens in the prompt.
 * `llm_completion_tokens`: Number of chat completion tokens in the prompt.
+* `llm_total_tokens`: Total number of tokens used (prompt plus completion).
+* `llm_cache_read_input_tokens`: Number of input tokens read from cache.

Review Comment:
   `llm_completion_tokens` is described as tokens "in the prompt", which is 
misleading next to `llm_prompt_tokens`. Completion tokens are generated in the 
completion/response, not in the prompt.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
