Re: [PR] feat: ai-cache plugin [apisix]

via GitHub Thu, 30 Apr 2026 14:15:21 -0700


janiussyafiq commented on code in PR #13308:
URL: https://github.com/apache/apisix/pull/13308#discussion_r3170892117



##########
apisix/plugins/prometheus/exporter.lua:
##########
@@ -260,6 +268,35 @@ function _M.http_init(prometheus_enabled_in_stream)
             unpack(extra_labels("llm_active_connections"))},
             llm_active_connections_exptime)
 
+    metrics.ai_cache_hits = prometheus:counter("ai_cache_hits_total",
+            "AI cache hit count by layer",
+            {"route_id", "service_id", "consumer", "layer",
+            unpack(extra_labels("ai_cache_hits"))},
+            ai_cache_hits_exptime)
+
+    metrics.ai_cache_misses = prometheus:counter("ai_cache_misses_total",
+            "AI cache miss count",
+            {"route_id", "service_id", "consumer",
+            unpack(extra_labels("ai_cache_misses"))},
+            ai_cache_misses_exptime)
+
+    local ai_cache_embedding_latency_buckets = DEFAULT_BUCKETS
+    if attr and attr.ai_cache_embedding_latency_buckets then
+        ai_cache_embedding_latency_buckets = attr.ai_cache_embedding_latency_buckets
+    end
+    metrics.ai_cache_embedding_latency = prometheus:histogram("ai_cache_embedding_latency",
+            "AI cache embedding API call latency in milliseconds",
+            {"route_id", "service_id", "consumer", "provider",
+            unpack(extra_labels("ai_cache_embedding_latency"))},

Review Comment:
   Declined for consistency: `http_latency` and `llm_latency` don't use ms either.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
