flearc commented on PR #12841:
URL: https://github.com/apache/apisix/pull/12841#issuecomment-3692459038

   > Another possible solution is to check if `request_type` is `ai_stream` or `ai_chat`, and report accordingly. I tried changing the check to report if `ctx.var.llm_time_to_first_token != 0`, but this might affect abnormal requests. If possible, I can continue to refine this issue based on the discussion.
   
   
   ```
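        -- report each LLM metric only when its corresponding nginx variable is non-zero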
        if vars.llm_time_to_first_token ~= "0" then
            metrics.llm_latency:observe(tonumber(vars.llm_time_to_first_token),
               gen_arr(route_id, service_id, consumer_name, balancer_ip,
               vars.request_type, vars.request_llm_model, vars.llm_model,
               unpack(extra_labels("llm_latency", ctx))))
       end
       if vars.llm_prompt_tokens ~= "0" then
           metrics.llm_prompt_tokens:inc(tonumber(vars.llm_prompt_tokens),
               gen_arr(route_id, service_id, consumer_name, balancer_ip,
               vars.request_type, vars.request_llm_model, vars.llm_model,
               unpack(extra_labels("llm_prompt_tokens", ctx))))
       end
        if vars.llm_completion_tokens ~= "0" then
            metrics.llm_completion_tokens:inc(tonumber(vars.llm_completion_tokens),
               gen_arr(route_id, service_id, consumer_name, balancer_ip,
               vars.request_type, vars.request_llm_model, vars.llm_model,
               unpack(extra_labels("llm_completion_tokens", ctx))))
       end
   ```
   Adjusting the `if` conditions as shown above might help.
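   
   For comparison, a minimal sketch of the `request_type`-based check mentioned in the quote could look like the following. This is only an illustration, assuming `vars.request_type` is set to `ai_chat` or `ai_stream` for LLM traffic; the `or 0` fallback is likewise an assumption for when the variable is unset:
   
   ```
        if vars.request_type == "ai_chat" or vars.request_type == "ai_stream" then
            -- report for all AI requests, regardless of the token/latency values
            metrics.llm_latency:observe(tonumber(vars.llm_time_to_first_token) or 0,
                gen_arr(route_id, service_id, consumer_name, balancer_ip,
                vars.request_type, vars.request_llm_model, vars.llm_model,
                unpack(extra_labels("llm_latency", ctx))))
        end
   ```
   This ties reporting to the request type rather than to the values themselves, so requests that legitimately produce a zero value would still be counted; whether that is the desired behaviour is part of the open discussion.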
   
   

