Re: [prometheus-users] synthetic histograms in Prometheus

2022-08-07 Thread Johny
The application currently publishes metrics to a remote-write endpoint in a 
Prometheus shard. In future we plan to migrate to the pull model as much as 
possible, once service discovery for native deployments is built, but for 
backward compatibility we are adopting this approach for now. 




-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to 

Re: [prometheus-users] synthetic histograms in Prometheus

2022-08-07 Thread Ben Kochie
To put it another way: if you can read every raw event from a log line,
e.g. every request logs "took X milliseconds", there are better ways to
reconstruct metrics for your use case.
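
As a rough illustration of reconstructing metrics from raw events: assuming each request is logged with a line containing "took X milliseconds", a small script could derive cumulative bucket counts plus a count and sum directly from the log. The log format, the sample lines and the bucket bounds below are all hypothetical.

```python
import re

# Hypothetical log format: any line containing "took <number> milliseconds".
LATENCY_RE = re.compile(r"took (\d+(?:\.\d+)?) milliseconds")

# Hypothetical bucket upper bounds in milliseconds; infinity catches everything.
BOUNDS = [100, 120, 140, 160, 180, 200, float("inf")]


def parse_latencies(lines):
    """Extract one latency value (ms) from every matching log line."""
    latencies = []
    for line in lines:
        match = LATENCY_RE.search(line)
        if match:
            latencies.append(float(match.group(1)))
    return latencies


def cumulative_buckets(latencies, bounds=BOUNDS):
    """For each upper bound, count the observations less than or equal to it."""
    return {bound: sum(1 for v in latencies if v <= bound) for bound in bounds}


sample_log = [
    "GET /a took 95 milliseconds",
    "GET /b took 120 milliseconds",
    "GET /c took 121 milliseconds",
    "healthcheck ok",
    "GET /d took 450 milliseconds",
]
latencies = parse_latencies(sample_log)
buckets = cumulative_buckets(latencies)
print(buckets)
print("count:", len(latencies), "sum:", sum(latencies))
```

Because every event is read, nothing is lost to sampling; the quadratic counting is only for brevity and would be replaced by per-bucket counters in anything long-running.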




Re: [prometheus-users] synthetic histograms in Prometheus

2022-08-07 Thread Ben Kochie
Right, but more basically, how do you get this information from the
application right now? Are you reading logs? Does it emit statsd data?

You're saying what, but not how.




Re: [prometheus-users] synthetic histograms in Prometheus

2022-08-07 Thread Johny
Thanks. While I understand the limitations of a gauge, the objective here 
is to backport existing reports to the new backend first, then integrate and 
optimize later. There is a period of time during which we need to maintain 
backward compatibility due to the high barrier to change in clients. The time 
windows used to calculate percentiles are biweekly or span months, so taking 
the last/average value within a 1-minute (or, in some cases, few-second) 
window is not too far-fetched, and is accepted by users. In light of this, is 
there a reasonable approach to recreate histograms/summaries from existing 
metrics within Prometheus?






Re: [prometheus-users] synthetic histograms in Prometheus

2022-08-07 Thread Stuart Clark

On 07/08/2022 18:14, Johny wrote:
> The gauge contains the most recent value of a metric, sampled every 1 min
> or so and exported by a user application, e.g. some latency sampled at
> 1-minute intervals by a client application. Let's presume this time
> series (scraped by Prometheus or sent via remote write) is absolute,
> containing all the information we need for calculating derived
> statistics. In the most raw form, you can fetch the data points, sort
> them and calculate a percentile. Incidentally, the legacy backend has
> efficient mechanisms to calculate percentiles by scanning and reducing
> data using map-reduce.


I'm presuming there is more than one request/event every minute or so?

If that is the case it would mean that you can't make a histogram that 
shows what you actually want to know. While in theory you could look at 
the 60 samples per hour and plot those on a histogram it would be pretty 
meaningless. If we assumed 1 request per second, sampling the latest 
latency value every minute would mean that 59/60 events are being 
discarded - so you have no idea what is actually happening from looking 
at that single sampled latency. Your samples could all be returning 
"low" values, which makes you believe that everything is working fine, 
but in actual fact the other 59 events per minute are "high" and you 
would never know.
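
To make the sampling problem concrete, here is a small sketch of the scenario described above: one request per second for an hour, with only the latest latency kept once a minute. The numbers are invented for illustration (the sampled request is fast at 50 ms, the other 59 in each minute are slow at 500 ms), and the nearest-rank percentile helper is just one simple definition.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile: smallest value with at least a fraction q of the data <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# One hour of hypothetical traffic, one request per second.
all_latencies = []
sampled_gauge = []
for minute in range(60):
    events = [500] * 59 + [50]        # 59 slow requests, then one fast one
    all_latencies.extend(events)
    sampled_gauge.append(events[-1])  # the gauge only keeps the latest value

print("p95 from sampled gauge:", percentile(sampled_gauge, 0.95))
print("p95 from all events:   ", percentile(all_latencies, 0.95))
print("events seen:", len(sampled_gauge), "of", len(all_latencies))
```

In this contrived case the sampled series reports a healthy p95 while nearly every real request was ten times slower.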


This is the reason why histograms exist, and why more generally counters 
are more useful than gauges. A gauge can only tell you about "now" which 
may or may not be representative of what has actually been happening 
since the last scrape. A counter however will tell you the absolute 
change since the last scrape (e.g. the total number of requests since 
the previous scrape, or the sum of the latencies of all events since the 
scrape) meaning you never lose information (a counter that represents 
total latency won't let you know if there was one spike or everything 
was slow, but it will give you an average since the last scrape instead 
of losing data).
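
The parenthetical about a total-latency counter comes down to a little arithmetic. Assuming two hypothetical counters, a request count and a latency sum, read at two consecutive scrapes (all numbers invented), the average latency over the interval is the ratio of the two increases:

```python
def average_between_scrapes(prev, curr):
    """Average latency over the scrape interval from two counter snapshots.

    Each snapshot is a (request_count_total, latency_sum_ms_total) pair.
    Counter resets are ignored to keep the sketch short.
    """
    delta_count = curr[0] - prev[0]
    delta_sum = curr[1] - prev[1]
    if delta_count == 0:
        return None  # no requests in the interval, so no average
    return delta_sum / delta_count


# Hypothetical values: 60 requests happened between the scrapes,
# 59 of them at 100 ms and a single 5000 ms spike.
previous_scrape = (1000, 120_000)
current_scrape = (1060, 120_000 + 59 * 100 + 5000)

print(average_between_scrapes(previous_scrape, current_scrape))
```

The spike still raises the average even if no sampled value would have caught it; what is lost is only the shape of the distribution.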


It would be worth understanding why you aren't able to produce a 
histogram in the application (or externally via processing an event 
feed, such as logs)? By design a simple histogram is pretty low impact, 
being a set of counters for each bucket.
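
As a sketch of why this is low impact: an in-application histogram needs little more than one counter per bucket plus a running sum and count. The class below is hypothetical and not taken from any Prometheus client library; the bucket bounds and observed values are made up.

```python
from bisect import bisect_left


class SimpleHistogram:
    """Hypothetical in-application histogram: one counter per bucket plus sum and count."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds) + [float("inf")]
        self.counts = [0] * len(self.bounds)  # non-cumulative, one per bucket
        self.total = 0.0
        self.count = 0

    def observe(self, value):
        """Record one event: a binary search and three increments."""
        self.counts[bisect_left(self.bounds, value)] += 1
        self.total += value
        self.count += 1

    def cumulative(self):
        """Return (upper_bound, cumulative_count) pairs, as exposed with `le` labels."""
        running, result = 0, []
        for bound, count in zip(self.bounds, self.counts):
            running += count
            result.append((bound, running))
        return result


hist = SimpleHistogram([100, 200, 400])
for latency_ms in (80, 150, 150, 390, 1200):
    hist.observe(latency_ms)
print(hist.cumulative())
```

Each observation costs a constant amount of work regardless of how many events have been seen, which is the property that makes per-event recording feasible.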


--
Stuart Clark



Re: [prometheus-users] synthetic histograms in Prometheus

2022-08-07 Thread Johny
The gauge contains the most recent value of a metric, sampled every 1 min or 
so and exported by a user application, e.g. some latency sampled at 1-minute 
intervals by a client application. Let's presume this time series (scraped 
by Prometheus or sent via remote write) is absolute, containing all the 
information we need for calculating derived statistics. In the most raw 
form, you can fetch the data points, sort them and calculate a percentile. 
Incidentally, the legacy backend has efficient mechanisms to calculate 
percentiles by scanning and reducing data using map-reduce. 
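
The "fetch, sort and calculate" approach described here could look roughly like the sketch below. The (timestamp, value) pairs reuse the four samples shown in the first message of this thread; the linear-interpolation percentile is one common definition and only an assumption about what the legacy backend computes.

```python
def percentile(samples, q):
    """Percentile of the sample values, interpolating linearly between ranks."""
    values = sorted(value for _, value in samples)
    if not values:
        raise ValueError("no samples")
    position = q * (len(values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(values) - 1)
    fraction = position - lower
    return values[lower] + (values[upper] - values[lower]) * fraction


# (unix_timestamp, latency_ms) pairs taken from the example gauge in this thread.
samples = [
    (1659752188, 100),
    (1659752068, 120),
    (1659751708, 150),
    (1659751588, 160),
]

print(percentile(samples, 0.5))  # median of the four values
print(percentile(samples, 1.0))  # maximum
```

Within Prometheus itself, the PromQL function `quantile_over_time` performs a comparable calculation over the samples of a range vector, with the same caveat that it only sees the sampled values and not every request.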


 



Re: [prometheus-users] synthetic histograms in Prometheus

2022-08-07 Thread Ben Kochie
So, let's take a step back and find out some more information, because this
question is sounding a lot like an XY Problem.

How are the current applications generating their metrics right now?
How are you getting the data to create these histograms?




Re: [prometheus-users] synthetic histograms in Prometheus

2022-08-07 Thread Stuart Clark

On 07/08/2022 08:23, Johny wrote:
> We are migrating our telemetry backend from a legacy database to
> Prometheus and need to estimate percentiles on gauge metrics published
> by user applications. Estimating percentiles on a gauge metric in
> Prometheus is not feasible and, for a number of reasons, client
> applications will be difficult to modify to start publishing histograms.
>
> I am exploring the feasibility of creating a histogram in a recording
> rule in Prometheus based on the metrics published by users. The partial
> work put in so far seems inefficient and illegible. Is there a
> recommended approach to solve this problem? As stated earlier, it will
> be extremely hard to solve the problem on the client side and I am
> looking for a solution within Prometheus.
>
> *Current metric is a gauge with values representing request latency.*
> http_duration_milliseconds_gauge{instance="instance1:port1"}[1h]
> 1659752188  100
> 1659752068  120
> ..
> 1659751708  150
> 1659751588  160


I'm not really sure what you mean by this metric?

A histogram of request latencies needs access to all the events that 
occur, with details of every single latency value. It can then increment 
the counter for a particular set of range buckets to map the 
distribution over time. I don't really understand what the single gauge 
represents? Is that the latency of the most recent event? Some average 
over the last hour?


Without access to the underlying events I can't see how this can be 
possible. That access exists only in the application or, if you store 
events elsewhere (e.g. in log files), in a tool that connects to your 
event store system.


--
Stuart Clark



[prometheus-users] synthetic histograms in Prometheus

2022-08-07 Thread Johny
We are migrating our telemetry backend from a legacy database to Prometheus 
and need to estimate percentiles on gauge metrics published by user 
applications. Estimating percentiles on a gauge metric in Prometheus is not 
feasible and, for a number of reasons, client applications will be difficult 
to modify to start publishing histograms. 

I am exploring the feasibility of creating a histogram in a recording rule in 
Prometheus based on the metrics published by users. The partial work put in 
so far seems inefficient and illegible. Is there a recommended approach 
to solve this problem? As stated earlier, it will be extremely hard to 
solve the problem on the client side and I am looking for a solution within 
Prometheus.

*Current metric is a gauge with values representing request latency.*
http_duration_milliseconds_gauge{instance="instance1:port1"}[1h]
1659752188  100
1659752068  120
..
1659751708  150
1659751588  160

*Desired histogram after conversion -*
http_duration_milliseconds_hist_bucket{instance="instance1:port1", le="100"}  133
http_duration_milliseconds_hist_bucket{instance="instance1:port1", le="120"}  222
http_duration_milliseconds_hist_bucket{instance="instance1:port1", le="140"}  311
http_duration_milliseconds_hist_bucket{instance="instance1:port1", le="160"}  330
http_duration_milliseconds_hist_bucket{instance="instance1:port1", le="180"}  339
http_duration_milliseconds_hist_bucket{instance="instance1:port1", le="200"}  340
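
For reference, once cumulative buckets like these exist, a percentile is usually estimated by interpolating within the bucket that contains the target rank. The sketch below applies that idea to the bucket counts from the example. Treating the lowest bucket as starting at 0 and the name `estimate_quantile` are assumptions of this sketch; it is not Prometheus's actual implementation.

```python
def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets
    by linear interpolation inside the bucket that contains the target rank."""
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        raise ValueError("empty histogram")
    rank = q * total
    lower_bound, lower_count = 0.0, 0  # assumption: lowest bucket starts at 0
    for upper_bound, count in buckets:
        if count >= rank:
            in_bucket = count - lower_count
            if in_bucket == 0:
                return upper_bound
            width = upper_bound - lower_bound
            return lower_bound + width * (rank - lower_count) / in_bucket
        lower_bound, lower_count = upper_bound, count
    return buckets[-1][0]


# Cumulative bucket counts copied from the desired histogram in this message.
desired = [(100, 133), (120, 222), (140, 311), (160, 330), (180, 339), (200, 340)]

print(round(estimate_quantile(0.5, desired), 2))
print(round(estimate_quantile(0.95, desired), 2))
```

The estimate can only be as good as the data that filled the buckets, so it inherits whatever the gauge sampling missed.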






