I instrumented a native histogram using the Prometheus client libraries, and I
see the output below for a GET on /metrics:
# HELP test_http_request_duration_seconds HTTP latency distribution for /users (random delay, occasional error)
# TYPE test_http_request_duration_seconds histogram
test_http_request_duration_seconds_bucket{endpoint="/users",method="GET",status_code="200",le="+Inf"} 13
test_http_request_duration_seconds_sum{endpoint="/users",method="GET",status_code="200"} 0.5860522520000001
test_http_request_duration_seconds_count{endpoint="/users",method="GET",status_code="200"} 13
test_http_request_duration_seconds_bucket{endpoint="/users",method="GET",status_code="500",le="+Inf"} 4
test_http_request_duration_seconds_sum{endpoint="/users",method="GET",status_code="500"} 0.20080665900000003
test_http_request_duration_seconds_count{endpoint="/users",method="GET",status_code="500"} 4
When I query Prometheus, it returns:
{
  "status": "success",
  "data": {
    "result": [
      {
        "metric": {
          "__name__": "test_http_request_duration_seconds",
          "endpoint": "/users",
          "method": "GET",
          "status_code": "200"
        },
        "histogram": [
          1750686407.931,
          {
            "count": "7",
            "sum": "0.35453753899999996",
            "buckets": [
              [0, "0.005065779510355506", "0.005524271728019902", "1"],
              [0, "0.014328188175072986", "0.015625", "1"],
              [0, "0.03716272234383503", "0.04052623608284405", "1"],
              [0, "0.0625", "0.0681567332915786", "2"],
              [0, "0.0810524721656881", "0.08838834764831843", "2"]
            ]
          }
        ]
      },
      {
        "metric": {
          "status_code": "500"
        },
        "histogram": [
          1750686407.931,
          {
            "count": "2",
            "sum": "0.127420411",
            "buckets": [
              [0, "0.057312752700291944", "0.0625", "1"],
              [0, "0.0681567332915786", "0.07432544468767006", "1"]
            ]
          }
        ]
      }
    ]
  }
}
Q1. I think histogram_quantile(0, rate(http_request_duration_seconds[1m]))
will give me the minimum value. Is that correct?
Q2. But with OTEL, even the min and max values are encoded. I don't understand
how to get equivalent support in the Prometheus native histogram format.
On Monday, June 23, 2025 at 3:31:34 PM UTC+2 tejaswini vadlamudi wrote:
> Thanks Brian, I don't want to go towards Summaries, but with histograms,
> mainly with Native Histograms, is there a possibility to get Max and Min
> values for a period of time?
>
> With OTEL-based metrics instrumentation, it is possible to record max and
> min values. See
> https://opentelemetry.io/docs/specs/otel/metrics/data-model/#histogram
>
> *Histograms consist of the following:*
>
> - An *Aggregation Temporality* of delta or cumulative.
> - A set of data points, each containing:
> - An independent set of Attribute name-value pairs.
> - A time window (of (start, end]) time for which the Histogram was
> bundled.
> - The time interval is inclusive of the end time.
> - Time values are specified as nanoseconds since the UNIX Epoch
> (00:00:00 UTC on 1 January 1970).
> - A count (count) of the total population of points in the
> histogram.
> - A sum (sum) of all the values in the histogram.
> - *(optional) The min (min) of all values in the histogram.*
> - *(optional) The max (max) of all values in the histogram.*
>
>
> Br,
> Teja
>
> On Monday, June 23, 2025 at 2:21:03 PM UTC+2 Brian Candler wrote:
>
>> Also relevant:
>>
>> https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/33645
>> https://groups.google.com/g/prometheus-developers/c/dGEaTR7Hyi0
>>
>> On Monday, 23 June 2025 at 13:17:57 UTC+1 Brian Candler wrote:
>>
>>> Nice explanation of summaries here:
>>>
>>> https://grafana.com/blog/2022/03/01/how-summary-metrics-work-in-prometheus/
>>>
>>> On Monday, 23 June 2025 at 12:42:35 UTC+1 Brian Candler wrote:
>>>
>>>> Remember that histograms don't store values. All they do is increment a
>>>> counter by 1; the value is only used to select which bucket to increment.
>>>> This means that the amount of storage used by a histogram is very small -
>>>> a fixed number of buckets with one counter each. It doesn't matter if you
>>>> are processing 1 sample per second or 10,000 samples per second.
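
As a rough illustration of the mechanism described above, here is a conceptual
Python sketch (not the actual client-library internals): observing a value only
bumps a fixed set of cumulative bucket counters; the value itself is never stored.

import bisect
import math

class TinyHistogram:
    """Conceptual model of a classic Prometheus histogram: a fixed set of
    cumulative bucket counters plus a running sum and count."""
    def __init__(self, upper_bounds):
        self.bounds = sorted(upper_bounds) + [math.inf]   # 'le' boundaries
        self.counts = [0] * len(self.bounds)              # one counter per bucket
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        # The value is not stored anywhere; it only selects the first bucket
        # whose upper bound covers it, and that bucket plus all larger ones
        # are incremented by 1 (cumulative, like the 'le' buckets on /metrics).
        first = bisect.bisect_left(self.bounds, value)
        for i in range(first, len(self.counts)):
            self.counts[i] += 1
        self.sum += value
        self.count += 1

h = TinyHistogram([0.005, 0.01, 0.05, 0.1, 0.5, 1.0])
h.observe(0.042)   # increments the le="0.05" bucket and everything above it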
>>>>
>>>> If you wanted to retrieve the *exact* lowest or highest value, over
>>>> *any* arbitrary time period that you query, you would have to store every
>>>> single value into a database. Prometheus is not an event logging system,
>>>> and it will never work this way. A columnar datastore like Clickhouse can
>>>> do that quite well, but if the number of samples is large, you will still
>>>> have a very large storage issue.
>>>>
>>>> More realistically, you could find the minimum or maximum value seen
>>>> over a fixed time period (say one minute), and at the end of that minute,
>>>> export the min/max value seen. That's cheap and quick. Indeed, you could
>>>> do it over a relatively short time period (e.g. 1 second), and use
>>>> Prometheus' min_over_time/max_over_time functions if you want to query a
>>>> longer period, i.e. to find the min of the mins, or the max of the maxes.
>>>> You need to make sure that every distinct min/max value ends up in the
>>>> database though; either use remote_write to push them, or scrape your
>>>> exporter at least twice as fast as the min/max values are changing.
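
A minimal sketch of that windowed min/max exporter, assuming the Python
prometheus_client library; the metric names, the 60-second window and the
record() helper are illustrative choices, not something defined in this thread:

import math
import threading
import time
from prometheus_client import Gauge, start_http_server

WINDOW_SECONDS = 60   # fixed window; scrape at least this often (ideally 2x)

request_min = Gauge('http_request_duration_min_seconds',
                    'Minimum request duration seen in the last window')
request_max = Gauge('http_request_duration_max_seconds',
                    'Maximum request duration seen in the last window')

_lock = threading.Lock()
_window_min, _window_max = math.inf, -math.inf

def record(duration_seconds):
    """Call this next to the histogram observe() in the request handler."""
    global _window_min, _window_max
    with _lock:
        _window_min = min(_window_min, duration_seconds)
        _window_max = max(_window_max, duration_seconds)

def _flush_loop():
    """Publish the window's min/max as gauges, then reset for the next window."""
    global _window_min, _window_max
    while True:
        time.sleep(WINDOW_SECONDS)
        with _lock:
            if math.isfinite(_window_min):        # skip windows with no traffic
                request_min.set(_window_min)
                request_max.set(_window_max)
            _window_min, _window_max = math.inf, -math.inf

start_http_server(8000)                           # expose /metrics
threading.Thread(target=_flush_loop, daemon=True).start()

Queries such as min_over_time(http_request_duration_min_seconds[1h]) and
max_over_time(http_request_duration_max_seconds[1h]) then give the min of the
mins and the max of the maxes over a longer range, provided every window's
value actually gets scraped.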
>>>>
>>>> In my experience, people are often not so interested in the single
>>>> minimum or maximum value, but in the quantiles, such as the 1st percentile
>>>> ("the fastest 1% of queries were answered in less than X seconds") or the
>>>> 99th percentile ("the slowest 1% of queries were answered in more than Y
>>>> seconds"). Prometheus can help you using a data type called a "summary":
>>>> https://prometheus.io/docs/concepts/metric_types/#summary
>>>> https://prometheus.io/docs/practices/histograms/#quantiles
>>>>
>>>> A summary can give you very good estimates of the percentiles over a
>>>> sliding time window (of a size you have to choose in advance), and uses a
>>>> relatively small amount of storage like a histogram. It is better than a
>>>> histogram in the case where you don't know in advance what the highest and
>>>> lowest values are likely to be (i.e. you don't need to pre-allocate your
>>>> bucket boundaries correctly).
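
For completeness, a minimal sketch of instrumenting a summary with the Python
prometheus_client library (the metric name is made up). One caveat: quantile
support differs between client libraries; as far as I know the Python client's
Summary only exports _count and _sum, so for client-side quantiles you would
need a client that implements them.

import random
import time
from prometheus_client import Summary, start_http_server

# Hypothetical metric name, for illustration only.
REQUEST_TIME = Summary('myapp_request_processing_seconds',
                       'Time spent processing a request')

@REQUEST_TIME.time()          # observes the duration of every call
def handle_request():
    time.sleep(random.uniform(0.01, 0.2))   # simulated work

if __name__ == '__main__':
    start_http_server(8000)   # exposes myapp_request_processing_seconds_{count,sum}
    while True:
        handle_request()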
>>>>
>>>> On Monday, 23 June 2025 at 08:15:42 UTC+1 tejaswini vadlamudi wrote:
>>>>
>>>>> Thanks Brian, for the clear heads-up and explanation!
>>>>>
>>>>> It looks to me like there is no way to obtain the exact maximum and
>>>>> exact minimum values for durations (based on Prometheus histograms)
>>>>> :-(
>>>>>
>>>>> However, for performing exploratory data analysis on the application
>>>>> software, I need these summary statistics, such as the minimum and
>>>>> maximum values. Legacy monitoring systems have always had this support,
>>>>> so the new technology is expected to fit the use case and ensure
>>>>> backward compatibility.
>>>>>
>>>>> Please share what can be done in this regard to obtain this info.
>>>>>
>>>>> I'm thinking out loud, please correct/add wherever possible:
>>>>>
>>>>> 1. Does changing from Prometheus to OTEL instrumentation provide this
>>>>> feature (exact max and min duration time)?
>>>>> 2. Can metrics derived from distributed traces (instrumented with
>>>>> OTEL/Jaeger) be used to obtain minimum and maximum request durations?
>>>>> 3. Is it possible to obtain the max and min duration times in
>>>>> Prometheus with some hack?
>>>>> a. For Classic Histograms?
>>>>> b. For Native Histograms?
>>>>> 4. Would a new PR/contribution to Prometheus be needed to offer this support?
>>>>>
>>>>> Thanks,
>>>>> Teja
>>>>>
>>>>> On Thursday, June 19, 2025 at 6:38:59 PM UTC+2 Brian Candler wrote:
>>>>>
>>>>>> In general, I don't think you can get an accurate answer to that
>>>>>> question from a histogram.
>>>>>>
>>>>>> You can work out which *bucket* the lowest and highest request
>>>>>> durations sat in, which means you could give the lower and upper bounds
>>>>>> of the minimum, and the lower and upper bounds of the maximum. Just
>>>>>> compare the bucket counters at the start and end of the time range, and
>>>>>> find the lowest boundary (le) which has changed, and the highest
>>>>>> boundary which has changed. But this still doesn't tell you what the
>>>>>> *actual* value was.
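
A rough sketch of that bucket comparison in Python, operating on two snapshots
of the cumulative le counters (the bucket layout below is made up, and it
assumes the counters did not reset in between). Because the counters are
cumulative, the deltas have to be turned into per-bucket increments first;
otherwise the highest changed le would always be +Inf.

def min_max_bucket_bounds(start, end):
    """Given two snapshots of cumulative 'le' counters (dicts mapping the
    le label to a count), return the (lower, upper] bounds of the buckets
    holding the lowest and highest observations made in between."""
    les = sorted(end, key=float)                 # ascending; '+Inf' sorts last
    per_bucket, prev = [], 0
    for le in les:
        cum = end[le] - start.get(le, 0)         # increase of the cumulative counter
        per_bucket.append(cum - prev)            # increase of this bucket alone
        prev = cum
    occupied = [i for i, d in enumerate(per_bucket) if d > 0]
    if not occupied:
        return None                              # nothing observed in the range
    def bounds(i):                               # lowest bucket's lower edge taken as 0
        return (float(les[i - 1]) if i > 0 else 0.0, float(les[i]))
    return bounds(occupied[0]), bounds(occupied[-1])

start = {'0.05': 3, '0.1': 7, '0.5': 7, '+Inf': 7}
end   = {'0.05': 3, '0.1': 12, '0.5': 14, '+Inf': 14}
print(min_max_bucket_bounds(start, end))
# ((0.05, 0.1), (0.1, 0.5)): the minimum was in (0.05, 0.1], the maximum in (0.1, 0.5]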
>>>>>>
>>>>>> I don't think there's any point in trying to make an estimate of the
>>>>>> actual value; these values are, by definition, outliers, so even if your
>>>>>> data points fitted a nice distribution, these ones would be at the ends
>>>>>> of the curve and subject to high error.
>>>>>>
>>>>>> Your LLM answer is essentially what it says in the documentation
>>>>>> <https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_quantile>
>>>>>>
>>>>>> for histogram_quantile:
>>>>>>
>>>>>> *You can use histogram_quantile(0, v instant-vector) to get the
>>>>>> estimated minimum value stored in a histogram.*
>>>>>>
>>>>>> *You can use histogram_quantile(1, v instant-vector) to get the
>>>>>> estimated maximum value stored in a histogram.*
>>>>>> I thought it was worth testing. Here is a metric from my home
>>>>>> prometheus server, running 2.53.4:
>>>>>>
>>>>>> *go_gc_pauses_seconds_bucket*
>>>>>> =>
>>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
>>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
>>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 12193
>>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 15369
>>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 27038
>>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 27085
>>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 27086
>>>>>> go_gc_pauses_seconds_bucket{instance="localhost:9090", job="prometheus", le="+Inf"} 27086
>>>>>>
>>>>>> *go_gc_pauses_seconds_bucket - go_gc_pauses_seconds_bucket offset 10m*
>>>>>> =>
>>>>>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
>>>>>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
>>>>>> {instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 5
>>>>>> {instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 5
>>>>>> {instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 10
>>>>>> {instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 10
>>>>>> {instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 10
>>>>>> {instance="localhost:9090", job="prometheus", le="+Inf"} 10
>>>>>>
>>>>>> *rate(go_gc_pauses_seconds_bucket[10m])*
>>>>>> =>
>>>>>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-08"} 0
>>>>>> {instance="localhost:9090", job="prometheus", le="6.399999999999999e-07"} 0
>>>>>> {instance="localhost:9090", job="prometheus", le="7.167999999999999e-06"} 0.007407407407407408
>>>>>> {instance="localhost:9090", job="prometheus", le="8.191999999999999e-05"} 0.007407407407407408
>>>>>> {instance="localhost:9090", job="prometheus", le="0.0009175039999999999"} 0.014814814814814815
>>>>>> {instance="localhost:9090", job="prometheus", le="0.010485759999999998"} 0.014814814814814815
>>>>>> {instance="localhost:9090", job="prometheus", le="0.11744051199999998"} 0.014814814814814815
>>>>>> {instance="localhost:9090", job="prometheus", le="+Inf"} 0.014814814814814815
>>>>>>
>>>>>> Those exponential bucket boundaries in scientific notation aren't
>>>>>> very readable, but you can see that:
>>>>>> * the lowest response time must have been somewhere
>>>>>> between 6.399999999999999e-07 and 7.167999999999999e-06
>>>>>> * the highest response time must have been somewhere between
>>>>>> 8.191999999999999e-05 and 0.0009175039999999999
>>>>>>
>>>>>> Here are the answers from the formula the LLM suggested:
>>>>>>
>>>>>>
>>>>>> *histogram_quantile(0, rate(go_gc_pauses_seconds_bucket[10m]))*
>>>>>> =>
>>>>>> {instance="localhost:9090", job="prometheus"} *NaN*
>>>>>>
>>>>>> *histogram_quantile(1, rate(go_gc_pauses_seconds_bucket[10m]))*
>>>>>> =>
>>>>>> {instance="localhost:9090", job="prometheus"} *0.0009175039999999999*
>>>>>>
>>>>>> The lower boundary of "NaN" is not useful at all (possibly this is a
>>>>>> bug?), but I found I could get a value by specifying a very low, but
>>>>>> non-zero, quantile:
>>>>>>
>>>>>>
>>>>>> *histogram_quantile(0.000000001, rate(go_gc_pauses_seconds_bucket[10m]))*
>>>>>> =>
>>>>>> {instance="localhost:9090", job="prometheus"} *6.40000013056e-07*
>>>>>>
>>>>>> Those values *do* sit between the boundaries given:
>>>>>>
>>>>>> >>> 6.399999999999999e-07 < 6.40000013056e-07 <= 7.167999999999999e-06
>>>>>> True
>>>>>> >>> 8.191999999999999e-05 < 0.0009175039999999999 <= 0.0009175039999999999
>>>>>> True
>>>>>>
>>>>>> In fact, the "minimum" answer is very close to the lower edge of the
>>>>>> relevant bucket, and the "maximum" is the upper edge of the relevant
>>>>>> bucket.
>>>>>>
>>>>>> Therefore, these are not the *actual* minimum and maximum request
>>>>>> times. In effect, they are saying "the minimum request time was *more
>>>>>> than* 6.399999999999999e-07, and the maximum request time was *no
>>>>>> more than* 0.0009175039999999999". But that's as good as you can
>>>>>> get with a histogram.
>>>>>>
>>>>>> On Wednesday, 18 June 2025 at 18:17:15 UTC+1 tejaswini vadlamudi
>>>>>> wrote:
>>>>>>
>>>>>>> Including answer from Gen-AI:
>>>>>>>
>>>>>>> | Description                   | PromQL Query                                                                         | Notes                                                                        |
>>>>>>> |-------------------------------|--------------------------------------------------------------------------------------|------------------------------------------------------------------------------|
>>>>>>> | Minimum request duration (1m) | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[1m])))   | Fast but may be noisy or return NaN if low traffic. Good for near-real-time. |
>>>>>>> | Maximum request duration (1m) | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[1m])))   | Same as above, for longest duration estimate.                                |
>>>>>>> | Minimum request duration (5m) | histogram_quantile(0, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))   | More stable, smoother estimate over a slightly longer window.                |
>>>>>>> | Maximum request duration (5m) | histogram_quantile(1, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))   | Recommended when traffic is bursty or histogram series are sparse.           |
>>>>>>> Please confirm if the above answer is reliable or not.
>>>>>>> On Wednesday, June 18, 2025 at 3:23:54 PM UTC+2 tejaswini vadlamudi
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I’m using Prometheus to monitor request durations via a histogram
>>>>>>>> metric, e.g., http_request_duration_seconds_bucket. I would like
>>>>>>>> to query:
>>>>>>>>
>>>>>>>> - The minimum time taken by a request
>>>>>>>> - The maximum time taken by a request
>>>>>>>>
>>>>>>>> …over a given time range (say, the last 1h or 24h).
>>>>>>>>
>>>>>>>> I understand that histogram buckets give cumulative counts of
>>>>>>>> requests below certain durations, but I’m not sure how to extract the
>>>>>>>> actual min or max values of request durations during a time window.
>>>>>>>>
>>>>>>>> Is this possible directly via PromQL? Or is there a recommended
>>>>>>>> workaround (e.g., recording rules, external processing, or using
>>>>>>>> histogram_quantile() in a specific way)?
>>>>>>>>
>>>>>>>> Thanks in advance for any guidance!
>>>>>>>>
>>>>>>>> Br,
>>>>>>>> Teja
>>>>>>>>
>>>>>>>