Re: [prometheus-users] Re: my counters start at zero

2020-07-31 Thread Brian Candler
If you want to take this to extremes, you could export a metric which is 
the absolute timestamp when your process started.  As long as the clocks 
between your processing node and prometheus server are properly synced, 
then you could use something like

rate(foo[60s]) or (foo / (timestamp(foo) - process_start_time_seconds))

possibly with some label matching ('on' or 'ignoring' etc) if the label set 
of process_start_time_seconds is not the same as the label set of foo.

In the above, the LHS is the rate over 60 seconds, and the RHS is the 
average rate since the process started, which only needs a single data 
point.
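
A minimal Python sketch of the fallback logic in the query above, not of how Prometheus itself evaluates it: with two or more samples in the window it uses the difference between the first and last samples, and with only one sample it falls back to the average since process start. The function name, the sample layout and the numbers are assumptions made for illustration; the real rate() also handles counter resets and extrapolation, which are ignored here.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def rate_or_average_since_start(
    samples: List[Sample], process_start: float
) -> Optional[float]:
    """Hypothetical helper mirroring `rate(...) or (foo / (timestamp(foo) - start))`.

    `samples` are the points inside the query window, oldest first.
    """
    if len(samples) >= 2:
        # LHS: simple rate between the first and last samples in the window.
        (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
        return (v_last - v_first) / (t_last - t_first)
    if len(samples) == 1:
        # RHS: average rate since the process started; one sample is enough.
        t, v = samples[0]
        elapsed = t - process_start
        if elapsed <= 0:
            return None  # clocks not in sync, or start time unknown
        return v / elapsed
    return None  # no data at all


two_samples = rate_or_average_since_start([(100.0, 0.0), (110.0, 20.0)], 90.0)
one_sample = rate_or_average_since_start([(102.0, 10.0)], 100.0)
```

Here the single-sample case assumes the process started 2 seconds before the scrape, so 10 counts are spread over 2 seconds rather than over a guessed scrape interval.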

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/d8431219-c927-4a2f-835b-2c44e0c38f46o%40googlegroups.com.


Re: [prometheus-users] Re: my counters start at zero

2020-07-31 Thread Brian Candler
On Friday, 31 July 2020 07:38:21 UTC+1, Mike Spreitzer wrote:
>
> Is there a PromQL query I 
> can write that will behave similarly to `rate()` but will recognize that 
> an initial non-zero count is due to increments since the previous scrape 
> of the same process (yes, restricted to the situations where the process 
> has been scraped before)? 
>

rate(foo[60s]) or min_over_time(foo[5m]) / 10

There are a couple of fundamental issues:

- Prometheus only looks back 5 minutes to find a previous value of a 
timeseries.  You can't distinguish between "this counter has just appeared" 
and "this counter went away for more than 5 minutes and came back".

- how to assign a timestamp to the zero value.  As I said before: rate() 
calculates the rate between the first and last available samples in the 
given time window.  If there are two values, it takes the difference 
between the values and divides by the difference between the timestamps.  A 
query like rate(foo[60s]) gives no hint that the data points are being 
scraped at nominally 10-second intervals.  That's why I have to hard-code 
"10" in the query above.

But it means the initial rate will almost certainly be wrong.  Consider for 
example that the process starts with value 0 at time (t-2s) and the value 
of the counter is 10 at scrape time (t).  The rate will be calculated as if 
the process started at time (t-10s), so will be 1/5th of the correct value.

More seriously: if the scrape fails for 5 minutes, and then comes back, you 
will get a stupidly high spike.
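
The arithmetic behind both problems can be sketched in Python. The first part uses the numbers from the example above (process started 2 seconds before the scrape, counter at 10, scrape interval 10 seconds). The second part uses made-up numbers (a steady 100 counts/s for 30000 seconds) purely to show the size of the spike; the names are hypothetical and this is not how Prometheus evaluates the query.

```python
SCRAPE_INTERVAL_S = 10.0  # the hard-coded "10" in the query above


def fallback_rate(first_seen_value: float,
                  scrape_interval_s: float = SCRAPE_INTERVAL_S) -> float:
    """Rate implied by `min_over_time(foo[5m]) / 10` when only one sample exists."""
    return first_seen_value / scrape_interval_s


# Example from the text: 10 counts actually accumulated in 2 seconds.
true_rate = 10 / 2.0                        # counts per second
estimated_rate = fallback_rate(10)          # assumes a full 10 s interval
underestimate = estimated_rate / true_rate  # fraction of the correct value

# Hypothetical scrape gap: the counter kept running while scrapes failed.
steady_rate = 100.0                           # assumed real rate, counts/s
uptime_s = 30_000.0                           # assumed time since process start
counter_after_gap = steady_rate * uptime_s    # value seen when scrapes resume
spike = fallback_rate(counter_after_gap)      # whole history squeezed into 10 s
```

With these assumed numbers the fallback reports 3000 times the real rate for the first evaluation after the gap.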



Re: [prometheus-users] Re: my counters start at zero

2020-07-31 Thread Mike Spreitzer
I have a specific scenario.  I have counters that start at zero when the 
scraped process starts; they are counting something that happens in the 
scraped process.  If a counter first appears with a non-zero value, I know 
all those counts happened since the previous scrape.  I am not asserting 
that `rate()` should be changed for everybody.  Is there a PromQL query I 
can write that will behave similarly to `rate()` but will recognize that 
an initial non-zero count is due to increments since the previous scrape 
of the same process (yes, restricted to the situations where the process 
has been scraped before)?

Thanks,
Mike

prometheus-users@googlegroups.com wrote on 07/29/2020 03:27:03 AM:

> From: Brian Candler 
> To: Prometheus Users 
> Date: 07/29/2020 03:27 AM
> Subject: [EXTERNAL] [prometheus-users] Re: my counters start at zero
> Sent by: prometheus-users@googlegroups.com
> 
> rate() calculates the rate between the first and last available 
> samples in the given time window, as long as there are at least two 
> samples.
> 
> irate() calculates the rate between the last two samples in the 
> given time window.
> 
> On Wednesday, 29 July 2020 05:25:04 UTC+1, Mike Spreitzer wrote:
> Now suppose instead that foo first shows up in a scrape at time t0 
> with a value of 10, and in every scrape after that the value of foo 
> is also 10.  What will `rate(foo[60s])` give me?  If I understand 
> correctly, it will give me nothing until time t0+60s, and from then 
> on it will give me zero.  Have I got this right?
> 
> It will show a rate of 0 as soon as two values are available, that 
> is, from t0+10s onwards.
> 
> If a new counter appears with value 10, it tells you nothing about the 
> rate just before the counter appeared.  It may be that scraping was 
> broken, and the counter had value 10 for the last year.  It could be
> that the counter had been going 1-2-3-4-5-6-7-8-9-10 at intervals 
> of 10 seconds.  Or at intervals of 1 week.
> 
> As a real-world example, it is very common to start polling an SNMP 
> device and find its interface byte counters already at huge values, 
> reflecting how much traffic has been carried in total by that 
> interface since the device was powered on.  It would be completely 
> wrong to have an enormous blip which effectively compresses months 
> or years of traffic into one sample interval.


[prometheus-users] Re: my counters start at zero

2020-07-29 Thread Brian Candler
rate() calculates the rate between the first and last available samples in 
the given time window, as long as there are at least two samples.

irate() calculates the rate between the last two samples in the given time 
window.
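
As a simplified Python illustration of the difference between the two definitions above: the function names and sample values are assumptions, and the sketch ignores counter resets and the extrapolation that the real rate() applies.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(window: List[Sample]) -> Optional[float]:
    """First-to-last slope over the window; needs at least two samples."""
    if len(window) < 2:
        return None
    (t_first, v_first), (t_last, v_last) = window[0], window[-1]
    return (v_last - v_first) / (t_last - t_first)


def simple_irate(window: List[Sample]) -> Optional[float]:
    """Slope between the last two samples in the window."""
    if len(window) < 2:
        return None
    (t_prev, v_prev), (t_last, v_last) = window[-2], window[-1]
    return (v_last - v_prev) / (t_last - t_prev)


# Hypothetical counter scraped every 10 s, speeding up over time.
window = [(0.0, 0.0), (10.0, 5.0), (20.0, 15.0), (30.0, 45.0)]
whole_window = simple_rate(window)    # (45 - 0) / 30
last_interval = simple_irate(window)  # (45 - 15) / 10
```

Both functions return None for a window holding a single sample, which is the situation the original question is about.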

On Wednesday, 29 July 2020 05:25:04 UTC+1, Mike Spreitzer wrote:

> Now suppose instead that foo first shows up in a scrape at time t0 with a 
> value of 10, and in every scrape after that the value of foo is also 10.  
> What will `rate(foo[60s])` give me?  If I understand correctly, it will 
> give me nothing until time t0+60s, and from then on it will give me zero.  
> Have I got this right?
>

It will show a rate of 0 as soon as two values are available, that is, from 
t0+10s onwards.
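
The timeline can be checked with a small Python sketch, assuming a 10-second scrape interval, a 60-second window and a simplified first-to-last rate (no extrapolation, no reset handling). The value of T0 and the helper name are arbitrary choices for illustration.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate_at(samples: List[Sample], eval_time: float,
                   window_s: float) -> Optional[float]:
    """Simplified rate over the samples with eval_time - window_s <= t <= eval_time."""
    in_window = [(t, v) for t, v in samples
                 if eval_time - window_s <= t <= eval_time]
    if len(in_window) < 2:
        return None  # a single sample gives no rate
    (t_first, v_first), (t_last, v_last) = in_window[0], in_window[-1]
    return (v_last - v_first) / (t_last - t_first)


T0 = 1000.0  # arbitrary time of the first scrape that sees foo
# foo appears at T0 with value 10 and stays at 10 in every later scrape.
samples = [(T0 + 10.0 * i, 10.0) for i in range(7)]

results = {offset: simple_rate_at(samples, T0 + offset, 60.0)
           for offset in (0, 10, 60)}
```

Under these assumptions there is no result at t0, and the result is 0 from t0+10s onwards, without waiting for the full 60 seconds.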

If a new counter appears with value 10, it tells you nothing about the rate 
just before the counter appeared.  It may be that scraping was broken, and 
the counter had value 10 for the last year.  It could be that the counter 
had been going 1-2-3-4-5-6-7-8-9-10 at intervals of 10 seconds.  Or at 
intervals of 1 week.

As a real-world example, it is very common to start polling an SNMP device 
and find its interface byte counters already at huge values, reflecting how 
much traffic has been carried in total by that interface since the device 
was powered on.  It would be completely wrong to have an enormous blip 
which effectively compresses months or years of traffic into one sample 
interval.
