Thanks Brian. I don't want to go towards summaries, but with histograms, mainly with native histograms, is there a possibility to get max and min values for a period of time?

With OTEL-based metrics instrumentation, it is possible to record max and min values. See https://opentelemetry.io/docs/specs/otel/metrics/data-model/#histogram

*Histograms consist of the following:*
- An *Aggregation Temporality* of delta or cumulative.
- A set of data points, each containing:
  - An independent set of Attribute name-value pairs.
  - A time window of (start, end] time for which the Histogram was bundled.
    - The time interval is inclusive of the end time.
    - Time values are specified as nanoseconds since the UNIX Epoch (00:00:00 UTC on 1 January 1970).
  - A count (count) of the total population of points in the histogram.
  - A sum (sum) of all the values in the histogram.
  - *(optional) The min (min) of all values in the histogram.*
  - *(optional) The max (max) of all values in the histogram.*

Br,
Teja

On Monday, June 23, 2025 at 2:21:03 PM UTC+2 Brian Candler wrote:

> Also relevant:
>
> https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/33645
> https://groups.google.com/g/prometheus-developers/c/dGEaTR7Hyi0
>
> On Monday, 23 June 2025 at 13:17:57 UTC+1 Brian Candler wrote:
>
>> Nice explanation of summaries here:
>>
>> https://grafana.com/blog/2022/03/01/how-summary-metrics-work-in-prometheus/
>>
>> On Monday, 23 June 2025 at 12:42:35 UTC+1 Brian Candler wrote:
>>
>>> Remember that histograms don't store values. All they do is increment a counter by 1; the value is only used to select which bucket to increment. This means that the amount of storage used by a histogram is very small: a fixed number of buckets with one counter each. It doesn't matter whether you are processing 1 sample per second or 10,000 samples per second.
>>>
>>> If you wanted to retrieve the *exact* lowest or highest value, over *any* arbitrary time period that you query, you would have to store every single value in a database. Prometheus is not an event logging system, and it will never work this way.
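To make the point above concrete, here is a small pure-Python sketch (TinyHistogram and its method names are invented for this illustration, not taken from any Prometheus client library): a histogram only increments bucket counters, so after ingestion the best you can recover for the minimum and maximum is the pair of bucket edges each one fell between.

```python
# Hypothetical illustration: a cumulative histogram keeps only per-bucket
# counters, so observed values are discarded at ingestion time.
class TinyHistogram:
    def __init__(self, upper_bounds):
        # Bucket upper bounds ("le" labels), plus an implicit +Inf bucket.
        self.bounds = sorted(upper_bounds) + [float("inf")]
        self.counts = [0] * len(self.bounds)  # cumulative counts per bucket

    def observe(self, value):
        # The value only selects which buckets to increment; it is not stored.
        for i, le in enumerate(self.bounds):
            if value <= le:
                self.counts[i] += 1

    def min_max_bounds(self):
        # Best we can do: the lowest bucket with any count gives an interval
        # for the minimum; the highest bucket with a non-cumulative count > 0
        # gives an interval for the maximum.
        lowest = next(i for i, c in enumerate(self.counts) if c > 0)
        highest = max(i for i, c in enumerate(self.counts)
                      if c > (self.counts[i - 1] if i else 0))
        return ((self.bounds[lowest - 1] if lowest else 0.0, self.bounds[lowest]),
                (self.bounds[highest - 1] if highest else 0.0, self.bounds[highest]))

h = TinyHistogram([0.005, 0.05, 0.5, 5.0])
for v in (0.012, 0.3, 0.07, 1.2):
    h.observe(v)
mn, mx = h.min_max_bounds()
print(mn)  # the true minimum 0.012 lies somewhere in (0.005, 0.05]
print(mx)  # the true maximum 1.2 lies somewhere in (0.5, 5.0]
```

The exact values 0.012 and 1.2 are unrecoverable from the counters alone; only the containing buckets survive.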
>>> A columnar datastore like Clickhouse can do that quite well, but if the number of samples is large, you will still have a very large storage issue.
>>>
>>> More realistically, you could find the minimum or maximum value seen over a fixed time period (say one minute), and at the end of that minute export the min/max value seen. That's cheap and quick. Indeed, you could do it over a relatively short time period (e.g. 1 second), and use Prometheus's min_over_time/max_over_time functions if you want to query a longer period, i.e. to find the min of the mins, or the max of the maxes. You need to make sure that every distinct min/max value ends up in the database, though: either use remote_write to push them, or scrape your exporter at least twice as fast as the min/max values are changing.
>>>
>>> In my experience, people are often not so interested in the single minimum or maximum value, but in the quantiles, such as the 1st percentile ("the fastest 1% of queries were answered in less than X seconds") or the 99th percentile ("the slowest 1% of queries were answered in more than Y seconds"). Prometheus can help you using a data type called a "summary":
>>> https://prometheus.io/docs/concepts/metric_types/#summary
>>> https://prometheus.io/docs/practices/histograms/#quantiles
>>>
>>> A summary can give you very good estimates of the percentiles over a sliding time window (of a size you have to choose in advance), and uses a relatively small amount of storage, like a histogram. It is better than a histogram in the case where you don't know in advance what the highest and lowest values are likely to be (i.e. you don't need to pre-allocate your bucket boundaries correctly).
>>>
>>> On Monday, 23 June 2025 at 08:15:42 UTC+1 tejaswini vadlamudi wrote:
>>>
>>>> Thanks Brian, for the clear heads-up and explanation!
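The fixed-window min/max export suggested above can be sketched in plain Python. WindowedMinMax and the two gauge names are hypothetical, invented for this sketch rather than taken from prometheus_client; the idea is simply to track min/max over the current window and expose the previous window's values for scraping.

```python
import math
import time

# Hypothetical sketch of the fixed-window approach: track the min/max seen in
# the current window, and expose the *previous* window's values for scraping.
class WindowedMinMax:
    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock                 # injectable clock, for testing
        self.window_start = clock()
        self.cur_min = math.inf
        self.cur_max = -math.inf
        self.last_min = math.nan           # exported values (previous window)
        self.last_max = math.nan

    def observe(self, value):
        self._maybe_rotate()
        self.cur_min = min(self.cur_min, value)
        self.cur_max = max(self.cur_max, value)

    def _maybe_rotate(self):
        # At the end of each window, publish the window's min/max and reset.
        now = self.clock()
        if now - self.window_start >= self.window:
            self.last_min, self.last_max = self.cur_min, self.cur_max
            self.cur_min, self.cur_max = math.inf, -math.inf
            self.window_start = now

    def collect(self):
        # What a /metrics scrape would export as two gauges (names invented).
        self._maybe_rotate()
        return {"request_duration_min_seconds": self.last_min,
                "request_duration_max_seconds": self.last_max}
```

Per Brian's caveat, every published window value must reach the database: scrape at least twice per window (or remote_write the values), otherwise a window's min/max can be lost between scrapes.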
>>>>
>>>> It looks to me that there is no way to obtain the exact maximum and exact minimum values for durations (based on Prometheus histograms) :-(
>>>>
>>>> However, for performing exploratory data analysis on the application software, I need this summary statistics information, such as minimum and maximum values. Legacy monitoring systems have always had this support, which in turn leads one to expect the new technology to fit the use case and ensure backward compatibility.
>>>>
>>>> Please share what can be done in this regard to obtain this info.
>>>>
>>>> I'm thinking out loud; please correct/add wherever possible:
>>>>
>>>> 1. Does changing from Prometheus to OTEL instrumentation provide this feature (exact max and min duration time)?
>>>> 2. Can metrics derived from distributed traces (instrumented with OTEL/Jaeger) be used to obtain minimum and maximum request durations?
>>>> 3. Is it possible to obtain the max and min duration time with Prometheus with any hack?
>>>>    a. For classic histograms?
>>>>    b. For native histograms?
>>>> 4. A new PR/contribution to Prometheus to offer this support?
>>>>
>>>> Thanks,
>>>> Teja
>>>>
>>>> On Thursday, June 19, 2025 at 6:38:59 PM UTC+2 Brian Candler wrote:
>>>>
>>>>> In general, I don't think you can get an accurate answer to that question from a histogram.
>>>>>
>>>>> You can work out which *bucket* the lowest and highest request durations sat in, which means you could give the lower and upper bounds of the minimum, and the lower and upper bounds of the maximum. Just compare the bucket counters at the start and end of the time range, and find the lowest boundary (le) which has changed, and the highest boundary which has changed. But this still doesn't tell you what the *actual* value was.
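The bucket-counter comparison described above can be sketched as a small helper. The function is hypothetical, and the bucket boundaries in the example are shortened versions of the exponential edges quoted later in this thread; snapshots are plain dicts keyed by each bucket's le bound.

```python
# Given cumulative bucket counts at the start and end of a time range
# (dicts keyed by their "le" upper bound), return interval bounds for the
# minimum and maximum observed value, or None if nothing was observed.
def min_max_intervals(start_counts, end_counts):
    bounds = sorted(start_counts)  # "le" upper bounds, ascending (+Inf last)
    cum = [end_counts[le] - start_counts[le] for le in bounds]
    # Per-bucket increments: differences of adjacent *cumulative* deltas.
    # (With cumulative buckets, every bucket above the max also increases,
    # so we must de-accumulate before looking for the highest changed bucket.)
    per_bucket = [cum[0]] + [cum[i] - cum[i - 1] for i in range(1, len(cum))]
    changed = [i for i, d in enumerate(per_bucket) if d > 0]
    if not changed:
        return None
    lo, hi = changed[0], changed[-1]
    min_interval = ((bounds[lo - 1] if lo else 0.0), bounds[lo])
    max_interval = ((bounds[hi - 1] if hi else 0.0), bounds[hi])
    return min_interval, max_interval

# Illustrative data, shaped like the 10-minute delta shown later in the thread.
edges = [6.4e-08, 6.4e-07, 7.168e-06, 8.192e-05,
         9.17504e-04, 1.048576e-02, 1.17440512e-01, float("inf")]
start = {b: 0 for b in edges}
end = dict(zip(edges, [0, 0, 5, 5, 10, 10, 10, 10]))
print(min_max_intervals(start, end))
```

This only narrows each extreme down to a bucket interval; as Brian says, the actual value inside that interval is gone.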
>>>>>
>>>>> I don't think there's any point in trying to make an estimate of the actual value; these values are, by definition, outliers, so even if your data points fitted a nice distribution, these ones would be at the ends of the curve and subject to high error.
>>>>>
>>>>> Your LLM answer is essentially what it says in the documentation
>>>>> <https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_quantile>
>>>>> for histogram_quantile:
>>>>>
>>>>> *You can use histogram_quantile(0, v instant-vector) to get the estimated minimum value stored in a histogram.*
>>>>>
>>>>> *You can use histogram_quantile(1, v instant-vector) to get the estimated maximum value stored in a histogram.*
>>>>>
>>>>> I thought it was worth testing. Here is a metric from my home Prometheus server, running 2.53.4:
>>>>>
>>>>> *go_gc_pauses_seconds_bucket*
>>>>> =>
>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 12193
>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 15369
>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 27038
>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 27085
>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 27086
>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="+Inf"} 27086
>>>>>
>>>>> *go_gc_pauses_seconds_bucket - go_gc_pauses_seconds_bucket offset 10m*
>>>>> =>
>>>>> {instance="localhost:9090", job="prometheus",
>>>>> le="6.399999999999999e-08"} 0
>>>>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
>>>>> {instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 5
>>>>> {instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 5
>>>>> {instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 10
>>>>> {instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 10
>>>>> {instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 10
>>>>> {instance="localhost:9090", job="prometheus", le="+Inf"} 10
>>>>>
>>>>> *rate(go_gc_pauses_seconds_bucket[10m])*
>>>>> =>
>>>>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
>>>>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
>>>>> {instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 0.007407407407407408
>>>>> {instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 0.007407407407407408
>>>>> {instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 0.014814814814814815
>>>>> {instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 0.014814814814814815
>>>>> {instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 0.014814814814814815
>>>>> {instance="localhost:9090", job="prometheus", le="+Inf"} 0.014814814814814815
>>>>>
>>>>> Those exponential bucket boundaries in scientific notation aren't very readable, but you can see that:
>>>>> * the lowest response time must have been somewhere between 6.399999999999999e-07 and 7.167999999999999e-06
>>>>> * the highest response time must have been somewhere between 8.191999999999999e-05 and 0.0009175039999999999
>>>>>
>>>>> Here are the answers from the formula the LLM suggested:
>>>>>
>>>>> *histogram_quantile(0, rate(go_gc_pauses_seconds_bucket[10m]))*
>>>>> =>
>>>>> {instance="localhost:9090", job="prometheus"} *NaN*
>>>>>
>>>>> *histogram_quantile(1, rate(go_gc_pauses_seconds_bucket[10m]))*
>>>>> =>
>>>>> {instance="localhost:9090", job="prometheus"} *0.0009175039999999999*
>>>>>
>>>>> The lower boundary of "NaN" is not useful at all (possibly this is a bug?), but I found I could get a value by specifying a very low, but non-zero, quantile:
>>>>>
>>>>> *histogram_quantile(0.000000001, rate(go_gc_pauses_seconds_bucket[10m]))*
>>>>> =>
>>>>> {instance="localhost:9090", job="prometheus"} *6.40000013056e-07*
>>>>>
>>>>> Those values *do* sit between the boundaries given:
>>>>>
>>>>> >>> 6.399999999999999e-07 < 6.40000013056e-07 <= 7.167999999999999e-06
>>>>> True
>>>>> >>> 8.191999999999999e-05 < 0.0009175039999999999 <= 0.0009175039999999999
>>>>> True
>>>>>
>>>>> In fact, the "minimum" answer is very close to the lower edge of the relevant bucket, and the "maximum" is the upper edge of the relevant bucket.
>>>>>
>>>>> Therefore, these are not the *actual* minimum and maximum request times. In effect, they are saying "the minimum request time was *more than* 6.399999999999999e-07, and the maximum request time was *no more than* 0.0009175039999999999". But that's as good as you can get with a histogram.
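This behaviour follows from histogram_quantile's linear interpolation within a bucket. A simplified re-implementation of that idea (an approximation of the documented algorithm, not Prometheus source code; the function name is invented) reproduces both numbers from the experiment above:

```python
import bisect

# Simplified model of classic-histogram quantile estimation via linear
# interpolation within a bucket. Not Prometheus code; just the documented
# idea. (Real Prometheus 2.53 returned NaN for q=0; here we simply clamp
# a zero rank into the first non-empty bucket.)
def approx_quantile(q, bounds, cum_counts):
    total = cum_counts[-1]
    rank = q * total                         # target observation rank
    i = bisect.bisect_left(cum_counts, rank) # first bucket with enough count
    if rank == 0:
        i = next(j for j, c in enumerate(cum_counts) if c > 0)
    lower = bounds[i - 1] if i > 0 else 0.0
    upper = bounds[i]
    below = cum_counts[i - 1] if i > 0 else 0
    in_bucket = cum_counts[i] - below
    # Linear interpolation: assume observations are spread evenly in-bucket.
    return lower + (upper - lower) * (rank - below) / in_bucket

# Bucket edges and cumulative counts from the 10m delta in the thread
# (empty and +Inf buckets trimmed for brevity).
bounds = [6.4e-07, 7.168e-06, 8.192e-05, 9.17504e-04]
cum = [0, 5, 5, 10]
print(approx_quantile(1, bounds, cum))     # exactly the top bucket's upper edge
print(approx_quantile(1e-9, bounds, cum))  # a hair above the lower edge
```

With q=1 the interpolation fraction is 1, so the answer lands exactly on the highest changed bucket's upper edge (0.000917504); with a tiny non-zero q it lands just above the lowest changed bucket's lower edge (6.40000013056e-07), matching what Brian measured.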
>>>>>
>>>>> On Wednesday, 18 June 2025 at 18:17:15 UTC+1 tejaswini vadlamudi wrote:
>>>>>
>>>>>> Including answer from Gen-AI:
>>>>>>
>>>>>> | Description | PromQL Query | Notes |
>>>>>> |---|---|---|
>>>>>> | Minimum request duration (1m) | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[1m]))) | Fast but may be noisy or return NaN if low traffic. Good for near-real-time. |
>>>>>> | Maximum request duration (1m) | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[1m]))) | Same as above, for longest duration estimate. |
>>>>>> | Minimum request duration (5m) | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) | More stable, smoother estimate over a slightly longer window. |
>>>>>> | Maximum request duration (5m) | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) | Recommended when traffic is bursty or histogram series are sparse. |
>>>>>>
>>>>>> Please confirm if the above answer is reliable or not.
>>>>>>
>>>>>> On Wednesday, June 18, 2025 at 3:23:54 PM UTC+2 tejaswini vadlamudi wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm using Prometheus to monitor request durations via a histogram metric, e.g., http_request_duration_seconds_bucket. I would like to query:
>>>>>>>
>>>>>>> - The minimum time taken by a request
>>>>>>> - The maximum time taken by a request
>>>>>>>
>>>>>>> ...over a given time range (say, the last 1h or 24h).
>>>>>>>
>>>>>>> I understand that histogram buckets give cumulative counts of requests below certain durations, but I'm not sure how to extract the actual min or max values of request durations during a time window.
>>>>>>>
>>>>>>> Is this possible directly via PromQL? Or is there a recommended workaround (e.g., recording rules, external processing, or using histogram_quantile() in a specific way)?
>>>>>>>
>>>>>>> Thanks in advance for any guidance!
>>>>>>>
>>>>>>> Br,
>>>>>>> Teja

