Re: [prometheus-users] Re: my counters start at zero
If you want to take this to extremes, you could export a metric holding the absolute timestamp at which your process started. As long as the clocks on your processing node and the Prometheus server are properly synced, you could then use something like

rate(foo[60s]) or (foo / (timestamp(foo) - process_start_time_seconds))

possibly with some label matching ('on' or 'ignoring' etc.) if the label set of process_start_time_seconds is not the same as the label set of foo.

In the above, the LHS of the 'or' is the rate over 60 seconds, and the RHS is the average rate since the process started, which needs only a single data point. The 'or' falls back to the RHS only for series where the LHS returns nothing, that is, while the window holds fewer than two samples.

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/d8431219-c927-4a2f-835b-2c44e0c38f46o%40googlegroups.com.
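For illustration, the arithmetic behind the start-time fallback suggested in this message (counter value divided by the time since the process started) can be sketched in Python. This is only a model of the calculation, not of Prometheus itself; the function name and the sample numbers are hypothetical.

```python
def average_rate_since_start(value: float, sample_ts: float, start_ts: float) -> float:
    """Average per-second rate of a counter that was 0 at start_ts.

    value     -- counter value at the scrape
    sample_ts -- timestamp of that scrape, in seconds
    start_ts  -- exported process start time, in seconds
    """
    elapsed = sample_ts - start_ts
    if elapsed <= 0:
        # Clock skew between the node and the server would land here.
        raise ValueError("sample is not later than process start; check clock sync")
    return value / elapsed


# Hypothetical numbers: the process started at t=1000 s and the first
# scrape at t=1002 s already sees the counter at 10.
first_scrape_rate = average_rate_since_start(10.0, 1002.0, 1000.0)
print(first_scrape_rate)
```

With a single sample this yields 5 per second for the hypothetical numbers, which is why the expression needs only one data point but depends on the two clocks agreeing.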
Re: [prometheus-users] Re: my counters start at zero
On Friday, 31 July 2020 07:38:21 UTC+1, Mike Spreitzer wrote:
> Is there a PromQL query I can write that will behave similarly to `rate()`
> but will recognize that an initial non-zero count is due to increments
> since the previous scrape of the same process (yes, restricted to the
> situations where the process has been scraped before)?

rate(foo[60s]) or min_over_time(foo[5m]) / 10

There are a couple of fundamental issues:

- Prometheus only looks back 5 minutes to find a previous value of a timeseries. You can't distinguish between "this counter has just appeared" and "this counter went away for >5 mins and came back".
- How to assign a timestamp to the zero value.

As I said before: rate() calculates the rate between the first and last available samples in the given time window. If there are two values, it takes the difference between the values and divides by the difference between the timestamps.

A query like rate(foo[60s]) gives no hint that the data points are being scraped at nominally 10-second intervals. That's why I have to hard-code "10" in the query above. But it means the initial rate will almost certainly be wrong. Consider for example that the process starts with value 0 at time (t-2s) and the value of the counter is 10 at scrape time (t). The rate will be calculated as if the process started at time (t-10s), so it will be 1/5th of the correct value (1 per second instead of 5 per second).

More seriously: if the scrape fails for 5 minutes and then comes back, you will get a stupidly high spike, because the fallback then divides the counter's whole accumulated value by 10.
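The two failure cases described in this message can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function names and the one-million counter value are hypothetical and chosen just to show the scale of the problem.

```python
SCRAPE_INTERVAL_S = 10.0  # the hard-coded "10" from the query above


def fallback_rate(only_sample_value: float) -> float:
    """Models min_over_time(foo[5m]) / 10 when the window holds one sample."""
    return only_sample_value / SCRAPE_INTERVAL_S


def true_rate(value: float, seconds_since_start: float) -> float:
    """Rate of a counter that started at 0 seconds_since_start ago."""
    return value / seconds_since_start


# Case 1: process started 2 s before the scrape, counter already at 10.
estimated = fallback_rate(10.0)
actual = true_rate(10.0, 2.0)
ratio = estimated / actual
print(estimated, actual, ratio)

# Case 2 (hypothetical value): scraping was broken for more than 5 minutes
# and the counter reappears at 1,000,000. The fallback treats all of it as
# having happened in one scrape interval.
spike = fallback_rate(1_000_000.0)
print(spike)
```

The first case reproduces the factor of 1/5; the second shows how a long-running counter turns into a spike of 100000 per second in a single step.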
Re: [prometheus-users] Re: my counters start at zero
I have a specific scenario. I have counters that start at zero when the scraped process starts; they are counting something that happens in the scraped process. If a counter first appears with a non-zero value, I know all those counts happened since the previous scrape. I am not asserting that `rate()` should be changed for everybody.

Is there a PromQL query I can write that will behave similarly to `rate()` but will recognize that an initial non-zero count is due to increments since the previous scrape of the same process (yes, restricted to the situations where the process has been scraped before)?

Thanks,
Mike

prometheus-users@googlegroups.com wrote on 07/29/2020 03:27:03 AM:

> From: Brian Candler
> To: Prometheus Users
> Date: 07/29/2020 03:27 AM
> Subject: [EXTERNAL] [prometheus-users] Re: my counters start at zero
> Sent by: prometheus-users@googlegroups.com
>
> rate() calculates the rate between the first and last available
> samples in the given time window, as long as there are at least two samples.
>
> irate() calculates the rate between the last two samples in the
> given time window.
>
> On Wednesday, 29 July 2020 05:25:04 UTC+1, Mike Spreitzer wrote:
> > Now suppose instead that foo first shows up in a scrape at time t0
> > with a value of 10, and in every scrape after that the value of foo
> > is also 10. What will `rate(foo[60s])` give me? If I understand
> > correctly, it will give me nothing until time t0+60s, and from then
> > on it will give me zero. Have I got this right?
>
> It will show a rate of 0 as soon as two values are available, that
> is, from t0+10s onwards.
>
> If a new counter appears with value 10, it tells you nothing about
> the rate just before the counter appeared. It may be that scraping was
> broken, and the counter had value 10 for the last year. It could be
> that the counter had been going 1-2-3-4-5-6-7-8-9-10 at intervals
> of 10 seconds. Or at intervals of 1 week.
>
> As a real-world example, it is very common to start polling an SNMP
> device and find its interface byte counters already at huge values,
> reflecting how much traffic has been carried in total by that
> interface since the device was powered on. It would be completely
> wrong to have an enormous blip which effectively compresses months
> or years of traffic into one sample interval.
[prometheus-users] Re: my counters start at zero
rate() calculates the rate between the first and last available samples in the given time window, as long as there are at least two samples.

irate() calculates the rate between the last two samples in the given time window.

On Wednesday, 29 July 2020 05:25:04 UTC+1, Mike Spreitzer wrote:
> Now suppose instead that foo first shows up in a scrape at time t0 with a
> value of 10, and in every scrape after that the value of foo is also 10.
> What will `rate(foo[60s])` give me? If I understand correctly, it will
> give me nothing until time t0+60s, and from then on it will give me zero.
> Have I got this right?

It will show a rate of 0 as soon as two values are available, that is, from t0+10s onwards.

If a new counter appears with value 10, it tells you nothing about the rate just before the counter appeared. It may be that scraping was broken, and the counter had value 10 for the last year. It could be that the counter had been going 1-2-3-4-5-6-7-8-9-10 at intervals of 10 seconds. Or at intervals of 1 week.

As a real-world example, it is very common to start polling an SNMP device and find its interface byte counters already at huge values, reflecting how much traffic has been carried in total by that interface since the device was powered on. It would be completely wrong to have an enormous blip which effectively compresses months or years of traffic into one sample interval.
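The description of rate() and irate() given in this message can be modelled in a few lines of Python. This is a deliberately simplified sketch: it ignores the extrapolation and counter-reset handling that real Prometheus applies, the window boundary rule is a modelling choice, and the names `simple_rate` and `simple_irate` are hypothetical.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def window(samples: List[Sample], now: float, range_s: float) -> List[Sample]:
    """Samples with now - range_s < timestamp <= now, in time order."""
    return [s for s in samples if now - range_s < s[0] <= now]


def simple_rate(samples: List[Sample], now: float, range_s: float) -> Optional[float]:
    """(last - first) / time difference, or None with fewer than 2 samples."""
    w = window(samples, now, range_s)
    if len(w) < 2:
        return None
    (t_first, v_first), (t_last, v_last) = w[0], w[-1]
    return (v_last - v_first) / (t_last - t_first)


def simple_irate(samples: List[Sample], now: float, range_s: float) -> Optional[float]:
    """Same, but using only the last two samples in the window."""
    w = window(samples, now, range_s)
    if len(w) < 2:
        return None
    (t_prev, v_prev), (t_last, v_last) = w[-2], w[-1]
    return (v_last - v_prev) / (t_last - t_prev)


# foo first appears at t0 = 100 with value 10 and stays there, scraped every 10 s.
flat = [(100.0 + 10.0 * i, 10.0) for i in range(7)]
print(simple_rate(flat, 100.0, 60.0))  # one sample: no result
print(simple_rate(flat, 110.0, 60.0))  # two samples: rate of 0

# A counter whose increase accelerates, to show rate and irate differing.
rising = [(0.0, 0.0), (10.0, 10.0), (20.0, 40.0)]
print(simple_rate(rising, 20.0, 60.0), simple_irate(rising, 20.0, 60.0))
```

In this model the flat counter gives no result at t0 and 0 from t0+10s onwards, matching the answer in this message; the initial value of 10 never contributes to any rate.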