On 07/08/2022 18:14, Johny wrote:
A gauge contains the most recent values of a metric, sampled every 1 min or so, and exported by a user application, e.g. some latency sampled at 1 minute intervals by a client application. Let's presume this time series (scraped by Prometheus or sent via remote write) is absolute, containing all the information we need for calculating derived statistics. In the most raw form, you can fetch the data points, sort them and calculate a percentile. Incidentally, the legacy backend has efficient mechanisms to calculate percentiles by scanning and reducing data using map-reduce.

I'm presuming there is more than one request/event every minute or so?

If that is the case, you can't build a histogram that shows what you actually want to know. In theory you could take the 60 samples per hour and plot them on a histogram, but the result would be pretty meaningless. Assuming 1 request per second, sampling the latest latency value once a minute means that 59 out of every 60 events are discarded, so that single sampled latency tells you almost nothing about what is actually happening. Your samples could all return "low" values, leading you to believe that everything is working fine, when in fact the other 59 events per minute are "high" and you would never know.
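To make that concrete, here is a minimal sketch with a made-up workload (1 event per second for 10 minutes, where the last event of each minute happens to be fast and the rest are slow). The names and numbers are purely illustrative and not taken from any real application:

```python
# Hypothetical workload: 1 event per second for 10 minutes.
# The last event of every minute happens to be fast; the other 59 are slow.
EVENTS_PER_MINUTE = 60
MINUTES = 10


def latency_of(event_index: int) -> float:
    """Made-up latency in seconds for the n-th event."""
    is_last_in_minute = event_index % EVENTS_PER_MINUTE == EVENTS_PER_MINUTE - 1
    return 0.05 if is_last_in_minute else 2.0


all_latencies = [latency_of(i) for i in range(EVENTS_PER_MINUTE * MINUTES)]

# A gauge holds only the most recent value, so a once-a-minute sample
# sees just the last event of each minute.
gauge_samples = all_latencies[EVENTS_PER_MINUTE - 1 :: EVENTS_PER_MINUTE]

# Fraction of events slower than 1 second: in reality vs. as seen via the gauge.
true_slow_fraction = sum(1 for v in all_latencies if v > 1.0) / len(all_latencies)
sampled_slow_fraction = sum(1 for v in gauge_samples if v > 1.0) / len(gauge_samples)

print(f"events: {len(all_latencies)}, gauge samples: {len(gauge_samples)}")
print(f"slow events in reality:    {true_slow_fraction:.1%}")
print(f"slow events per the gauge: {sampled_slow_fraction:.1%}")
```

This is a deliberately worst-case pattern, but the point holds for any sampling of a gauge: whatever happened between the samples is invisible.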

This is why histograms exist and why, more generally, counters are more useful than gauges. A gauge can only tell you about "now", which may or may not be representative of what has actually been happening since the last scrape. A counter, however, tells you the absolute change since the previous scrape (e.g. the total number of requests, or the sum of the latencies of all events, since the previous scrape), meaning you never lose information. A counter that represents total latency won't tell you whether there was one spike or everything was slow, but together with the request count it gives you an average since the last scrape instead of losing data.
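As a rough sketch of the counter approach, assume the application keeps two ever-increasing counters, a latency sum and an event count (the class and function names here are hypothetical, not a Prometheus client API). The average between two scrapes is then just the ratio of the two deltas:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LatencyCounters:
    """Two monotonically increasing counters (hypothetical names)."""

    total_seconds: float = 0.0  # sum of all observed latencies
    events: int = 0             # number of observed events

    def observe(self, seconds: float) -> None:
        self.total_seconds += seconds
        self.events += 1

    def scrape(self) -> Tuple[float, int]:
        """Return the current counter values, as a scrape would see them."""
        return (self.total_seconds, self.events)


def average_between(prev: Tuple[float, int], curr: Tuple[float, int]) -> Optional[float]:
    """Average latency of the events that happened between two scrapes."""
    delta_sum = curr[0] - prev[0]
    delta_count = curr[1] - prev[1]
    if delta_count == 0:
        return None  # nothing happened in the interval
    return delta_sum / delta_count


counters = LatencyCounters()
first_scrape = counters.scrape()

# One minute of made-up traffic: 59 slow events and 1 fast one.
for _ in range(59):
    counters.observe(2.0)
counters.observe(0.05)

second_scrape = counters.scrape()
avg = average_between(first_scrape, second_scrape)
print(f"average latency since last scrape: {avg:.4f}s")
```

Every event contributes to the result here, regardless of when the scrape happens. In PromQL the same idea is usually written as the rate of the sum divided by the rate of the count.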

It would be worth understanding why you aren't able to produce a histogram in the application (or externally by processing an event feed, such as logs). By design a simple histogram is pretty low impact, being just a set of counters, one per bucket.
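To illustrate how little is involved, below is a minimal sketch of a cumulative-bucket histogram in plain Python. The class name and bucket boundaries are made up for the example; in a real application you would normally use the histogram type from your Prometheus client library instead:

```python
import math
from typing import List, Sequence


class SimpleHistogram:
    """A histogram as a set of counters, one per bucket (illustrative only)."""

    def __init__(self, upper_bounds: Sequence[float]) -> None:
        # Always finish with a +Inf bucket so every observation lands somewhere.
        self.upper_bounds: List[float] = sorted(upper_bounds) + [math.inf]
        self.bucket_counts: List[int] = [0] * len(self.upper_bounds)
        self.total = 0.0  # sum of all observed values
        self.count = 0    # number of observations

    def observe(self, value: float) -> None:
        self.total += value
        self.count += 1
        # Buckets are cumulative: bump every bucket whose bound covers the value.
        for i, bound in enumerate(self.upper_bounds):
            if value <= bound:
                self.bucket_counts[i] += 1


hist = SimpleHistogram([0.1, 0.5, 1.0, 5.0])  # bucket bounds in seconds (assumed)

# Same made-up minute of traffic: 59 slow events and 1 fast one.
for _ in range(59):
    hist.observe(2.0)
hist.observe(0.05)

for bound, n in zip(hist.upper_bounds, hist.bucket_counts):
    print(f"le={bound}: {n}")
```

The cost per event is a handful of integer increments, and because each bucket is a counter, nothing is lost between scrapes.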

--
Stuart Clark
