On 18/04/2020 13:18, 'vivapolonium' via Prometheus Users wrote:
Hey everyone,

I'm fairly new to Prometheus and trying to wrap my head around some concepts that are not really clear to me.

I'm running a Scala application with the official Prometheus Java client. I'm trying to measure the performance of HTTP endpoints and use a `Summary` for that. I expose the metrics on an internal endpoint by calling the `TextFormat.write004` method and serving the output myself (not via the included HTTPServlet).

I've set up a Prometheus instance that scrapes that endpoint every 15s, and set the maxAge of the Summary to 15s as well. Now I have a PromQL query like this: `sum by(route)(requests_latency_seconds_sum/requests_latency_seconds_count)*1000`, which should give me the average response time of an endpoint in milliseconds for each scrape interval.

When rendering the data, though, I get weirdly aggregated data points, which is probably a mixture of bad settings and misunderstanding on my part. Take this metric for example:

```
requests_latency_seconds_count{route="library.get",} 83.0
requests_latency_seconds_sum{route="library.get",} 949.2774687769999
```

This summary does not reset after 15s; instead it keeps accumulating all the data, which makes it useless for pinpointing time-based anomalies in my application.

That isn't a summary (that would have quantile labels), or at least the bit you are showing doesn't cover that.

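Normal counters don't reset except when the application restarts, and the `_sum` and `_count` series you show behave the same way: they only ever grow. Within PromQL there is the rate() function, which turns those ever-growing values into per-second rates over a time window and so lets you see spikes in latency over time.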

So try to add rate() as described at https://www.robustperception.io/rate-then-sum-never-sum-then-rate
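
As a rough sketch of what that could look like for your case (using the metric names from your example; the `[1m]` window is just my assumption and should cover at least a few scrape intervals), the average latency in milliseconds per route over the last minute would be something like:

```
# Rate of total observed seconds divided by rate of observations
# gives the mean latency over the window; multiply by 1000 for ms.
# rate() is applied to each series first, and only then summed by route.
1000 * sum by (route) (rate(requests_latency_seconds_sum[1m]))
     / sum by (route) (rate(requests_latency_seconds_count[1m]))
```

Note that a route with no requests in the window divides zero by zero, so it will show up as NaN rather than 0.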

Generally I don't use summaries and instead use histograms. The quantiles of a summary are calculated inside the application, so they aren't aggregatable (for example if you run multiple instances) or adjustable within Prometheus. With histograms you can aggregate and calculate percentiles over any range.
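
For illustration only, assuming you switched the metric to a `Histogram` with the same name (which would expose a `requests_latency_seconds_bucket` series with an `le` label), an estimate of the 95th percentile latency in milliseconds per route could be queried roughly like this. The 0.95 quantile and the `[5m]` window are arbitrary choices on my part:

```
# Sum the per-bucket rates across instances, keeping route and le,
# then estimate the 95th percentile from the bucket distribution.
histogram_quantile(
  0.95,
  sum by (route, le) (rate(requests_latency_seconds_bucket[5m]))
) * 1000
```

The accuracy of the result depends on how well the bucket boundaries you configure match your actual latencies.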

I dug into the source code of the Java library and did not find a way to reset the values to zero or remove them after they are scraped. Is this intentional? Did I miss something in my configuration? Also, as I understood it, the summary is supposed to reset itself?

Hope someone can give me some hints on how to solve this.


--
Stuart Clark
