Copilot commented on code in PR #13609:
URL: https://github.com/apache/apisix/pull/13609#discussion_r3478734626

##########
t/plugin/ai-proxy-kafka-log.t:
##########
@@ -137,6 +137,10 @@ X-AI-Fixture: openai/chat-basic.json
 send data to kafka:
 llm_request
 llm_summary
+tool_count
+cache_read_input_tokens
+cache_creation_input_tokens
+reasoning_tokens
 You are a mathematician
 gpt-35-turbo-instruct
 llm_response_text

Review Comment:
   The test currently asserts only a subset of the new `llm_summary` keys. Since this PR’s purpose is to ensure logger plugins receive the full summary automatically, the test should also assert that the remaining newly added keys (`stream`, `has_tool_calls`, `end_user_id`, `content_risk_level`) are present in the emitted Kafka log entry, so that future refactors don’t accidentally drop them.

##########
docs/zh/latest/plugins/ai-proxy.md:
##########
@@ -2082,6 +2082,17 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 * `llm_model`:LLM 模型。
 * `llm_prompt_tokens`:提示中的令牌数量。
 * `llm_completion_tokens`:提示中的聊天完成令牌数量。
+* `llm_total_tokens`:使用的总令牌数(提示加完成)。
+* `llm_cache_read_input_tokens`:从缓存读取的输入令牌数量。

Review Comment:
   `llm_completion_tokens` currently reads as “提示中的聊天完成令牌数量” (literally, "number of chat completion tokens in the prompt"), which suggests completion tokens are part of the prompt. Completion tokens are produced in the completion/response, not in the prompt.

##########
docs/zh/latest/plugins/ai-proxy-multi.md:
##########
@@ -2716,6 +2716,17 @@ curl "http://127.0.0.1:9080/anything" -X POST \
 * `llm_model`:LLM 模型。
 * `llm_prompt_tokens`:提示中的令牌数量。
 * `llm_completion_tokens`:提示中的聊天完成令牌数量。
+* `llm_total_tokens`:使用的总令牌数(提示加完成)。
+* `llm_cache_read_input_tokens`:从缓存读取的输入令牌数量。

Review Comment:
   `llm_completion_tokens` currently reads as “提示中的聊天完成令牌数量” (literally, "number of chat completion tokens in the prompt"), which suggests completion tokens are part of the prompt. Completion tokens are produced in the completion/response, not in the prompt.

##########
docs/en/latest/plugins/ai-proxy.md:
##########
@@ -2082,6 +2082,17 @@ The following example demonstrates how you can log LLM request related informati
 * `llm_model`: LLM model.
 * `llm_prompt_tokens`: Number of tokens in the prompt.
 * `llm_completion_tokens`: Number of chat completion tokens in the prompt.
+* `llm_total_tokens`: Total number of tokens used (prompt plus completion).
+* `llm_cache_read_input_tokens`: Number of input tokens read from cache.

Review Comment:
   `llm_completion_tokens` is described as tokens "in the prompt", which is misleading next to `llm_prompt_tokens`. Completion tokens are generated in the completion/response, not in the prompt.

##########
docs/en/latest/plugins/ai-proxy-multi.md:
##########
@@ -2606,6 +2606,17 @@ The following example demonstrates how you can log LLM request related informati
 * `llm_model`: LLM model.
 * `llm_prompt_tokens`: Number of tokens in the prompt.
 * `llm_completion_tokens`: Number of chat completion tokens in the prompt.
+* `llm_total_tokens`: Total number of tokens used (prompt plus completion).
+* `llm_cache_read_input_tokens`: Number of input tokens read from cache.

Review Comment:
   `llm_completion_tokens` is described as tokens "in the prompt", which is misleading next to `llm_prompt_tokens`. Completion tokens are generated in the completion/response, not in the prompt.
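
For illustration only, the completeness check suggested in the comment on `t/plugin/ai-proxy-kafka-log.t` might look like the sketch below. It is written in Python rather than in the test file's own format, and it assumes that a Kafka log entry is a JSON object with the summary fields nested under an `llm_summary` key. That layout, the helper name `missing_summary_keys`, and the sample entry are hypothetical and not taken from the PR; only the key names come from this review thread.

```python
import json

# Key names taken from this review thread. Treating them as direct members of
# an "llm_summary" object is an assumption about the log entry layout.
EXPECTED_SUMMARY_KEYS = {
    "tool_count",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
    "reasoning_tokens",
    "stream",
    "has_tool_calls",
    "end_user_id",
    "content_risk_level",
}


def missing_summary_keys(raw_entry):
    """Return the expected llm_summary keys that are absent from one JSON log entry."""
    entry = json.loads(raw_entry)
    summary = entry.get("llm_summary") or {}
    return EXPECTED_SUMMARY_KEYS - set(summary)


# Hypothetical entry that carries only the four keys the test already checks.
sample = json.dumps({
    "llm_summary": {
        "tool_count": 0,
        "cache_read_input_tokens": 0,
        "cache_creation_input_tokens": 0,
        "reasoning_tokens": 0,
    }
})

# Lists the keys the test would still need to cover.
print(sorted(missing_summary_keys(sample)))
```

Checking against a single set of expected keys means that a key dropped by a later refactor shows up as a named gap rather than a silent omission.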
