Re: [prometheus-developers] Prometheus Pushgateway TTL

2023-07-12 Thread Bjoern Rabenstein
On 12.07.23 10:10, E wrote:
> I think optional TTL per time series is a good idea. It might have several
> use cases, it doesn't break anything, and it shouldn't be too hard to make.
> So why not?

Because all the use cases discussed so far have turned out to be
anti-patterns we don't want to support. This topic was brought up
multiple times at dev-summits etc., and the outcome was always the
same.

> I might have used this feature to trigger short-lived alerts with arbitrary
> text in a label, something I wouldn't do without TTL because it would
> require a cleanup.

I don't quite understand that use case, but feel free to flesh it out
a bit more and propose it as a topic for the dev-summit by adding it
to the agenda:
https://docs.google.com/document/d/11LC3wJcVk00l8w5P3oLQ-m3Y37iom6INAMEu2ZAGIIE/edit?pli=1
 

-- 
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-developers+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-developers/ZK6TOzCck1MSMuW0%40mail.rabenste.in.


Re: [prometheus-developers] Prometheus Pushgateway TTL

2023-07-12 Thread Bjoern Rabenstein
On 11.07.23 15:23, 'Braden Schaeffer' via Prometheus Developers wrote:
> They could live for 5s or 1 hour.

The whole idea of a Prometheus counter doesn't really make sense for a
job that lives for just 5s, if you are scraping every 15s or every
minute or so.

And a job that lives for 1 hour should be scraped directly.

So in the first case, using a counter doesn't make sense, and in the
second case using the Pushgateway doesn't make sense.

> Does it really matter what you send to pushgateway?  It supports
> counters so why not push them?

We could be stricter and just reject counters being pushed to the
Pushgateway, but that would be a breaking change. Historically, the
metric type information in Prometheus has been (and largely still
is) a kind of "weak typing", so no hard restrictions were imposed
(you can apply `rate` to a gauge or `delta` to a counter without
Prometheus complaining about it).

Also, it feels natural to count "records backed up by the daily
database backup job" in a counter and push it to the
Pushgateway. However, when it arrives on your Prometheus server, it
doesn't really behave as a counter. Summing those values up across
instances is really painful with PromQL, and the reason for that is
that we are essentially handling events here, for which Prometheus as
a whole wasn't really designed.
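
To make the "doesn't really behave as a counter" point concrete, here
is a minimal Python sketch. It is only a simplified model, not how
Prometheus is implemented: `counter_increase` is a hypothetical helper
that mimics counter-reset handling, the per-run values are made up, and
it assumes every run pushes to the same grouping key so that each push
replaces the previous value.

```python
def counter_increase(values):
    """Total increase over a series, treating any drop as a counter reset.

    Simplified model of counter semantics (assumption: no extrapolation,
    and a reset is assumed to restart from zero).
    """
    total = 0.0
    prev = values[0]
    for v in values[1:]:
        # A drop is read as "restarted from 0", so the new value is the delta.
        total += v if v < prev else v - prev
        prev = v
    return total


# Hypothetical numbers: records backed up by three consecutive daily runs.
runs = [500.0, 300.0, 700.0]

# Each push replaces the previous value, and Prometheus scrapes the
# Pushgateway several times between runs (4 scrapes per run here).
series = [v for v in runs for _ in range(4)]

true_total = sum(runs)             # what we actually want to know
seen = counter_increase(series)    # what counter semantics make of it

print(true_total, seen)
```

In this model the first run is never counted, the second run is read as
a reset, and the third run is read as a small increment on top of the
second, so the result does not match the real total.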

If you really have to use Prometheus for that case, the "least bad"
solutions I know of are statsd with the statsd-exporter (
https://github.com/prometheus/statsd_exporter ) or the
prom-aggregation-gateway
( https://github.com/zapier/prom-aggregation-gateway ).

A TTL doesn't really address the fundamental problem. It might enable
a very brittle solution that is worse than the solutions that are
already available.

-- 
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in



Re: [prometheus-developers] Prometheus Pushgateway TTL

2023-07-12 Thread E
I think optional TTL per time series is a good idea. It might have 
several use cases, it doesn't break anything, and it shouldn't be too 
hard to make. So why not?
I might have used this feature to trigger short-lived alerts with 
arbitrary text in a label, something I wouldn't do without TTL because 
it would require a cleanup.


--
Best regards,
Evgeniy Yunkin

On 11/07/2023 22:23, 'Braden Schaeffer' via Prometheus Developers wrote:
They could live for 5s or 1 hour.  Does it really matter what you send 
to pushgateway? It supports counters so why not push them?


A TTL is all we need here.

On Sat, Jul 1, 2023 at 5:32 PM Bjoern Rabenstein wrote:


On 29.06.23 08:47, 'Braden Schaeffer' via Prometheus Developers wrote:
> It's the same as calculating the total incoming request rate of N pods in a
> deployment: sum(rate(grpc_request_count{service=foo}[5m]))

🤔 I'm surprised that you seem to push a counter metric to the
Pushgateway.

I would say the intended use case for the Pushgateway is that a
batch job pushes its metrics upon completion. That means you only ever
have one value of those metrics, so a `rate` on those would always
result in zero.

Are you perhaps pushing multiple times during the runtime of your
batch jobs? That would be weird indeed for a PGW use case. Why don't
you just scrape your jobs normally then?
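
As an illustration of why `rate` on a value that was pushed only once
comes out as zero, here is a minimal Python sketch. It is not PromQL's
actual implementation: `simple_rate` is a hypothetical helper that
skips the extrapolation the real `rate` performs, and the sample values
and the 15s scrape interval are assumptions.

```python
def simple_rate(samples):
    """Per-second increase over a list of (timestamp, value) samples.

    Simplified model: handles counter resets, but does not extrapolate
    to the window boundaries the way PromQL's `rate` does.
    """
    if len(samples) < 2:
        return None  # at least two samples are needed for a rate
    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        # A drop is treated as a reset to zero followed by growth to `value`.
        increase += value if value < prev else value - prev
        prev = value
    duration = samples[-1][0] - samples[0][0]
    return increase / duration


# A batch job pushed 1200 once on completion; every later scrape of the
# Pushgateway returns that same value.
pushed_once = [(t, 1200.0) for t in range(0, 300, 15)]

# A long-running process scraped directly, growing by 2 per second.
scraped_directly = [(t, 2.0 * t) for t in range(0, 300, 15)]

print(simple_rate(pushed_once), simple_rate(scraped_directly))
```

The pushed value never changes between scrapes, so there is no increase
to divide by the elapsed time, whereas the directly scraped counter
yields its real per-second rate.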

-- 
Björn Rabenstein

[PGP-ID] 0x851C3DA17D748D03
[email] bjo...@rabenste.in


