On 28.01.25 09:29, rohit ahuja wrote:
> 
> Question 1 - what should be my deletion policy to delete stale metrics from 
> pushgateway?
> should it be after my batches are complete for the day? batches runs for 
> 2-3 hours

Ideally never. I would set up your batches so that each day's run
pushes the same set of metrics. Then you have a fixed set of
metrics that will live on the PGW forever, overwritten each day.
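
A minimal sketch of that idea, using only the Python standard library
and the Pushgateway's HTTP push API. The metric names, the job name,
the instance label, and the localhost:9091 address are made up for
illustration; label values are assumed not to contain "/".

```python
import urllib.request
from urllib.parse import quote


def group_url(base: str, job: str, labels: dict[str, str]) -> str:
    """Build the Pushgateway URL for one metric group (job plus grouping labels)."""
    url = f"{base.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    for name, value in sorted(labels.items()):
        # Assumption: values contain no "/" (those would need special encoding).
        url += f"/{name}/{quote(value, safe='')}"
    return url


def render(duration_seconds: float, finished_at: float) -> str:
    """Text exposition body: the same (hypothetical) metric names on every run."""
    return (
        "# TYPE batch_duration_seconds gauge\n"
        f"batch_duration_seconds {duration_seconds}\n"
        "# TYPE batch_last_run_timestamp_seconds gauge\n"
        f"batch_last_run_timestamp_seconds {finished_at}\n"
    )


def push(url: str, body: str) -> None:
    """PUT replaces every metric in the group, so the previous run is overwritten."""
    req = urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
    urllib.request.urlopen(req, timeout=10).close()


# Example call at the end of a run (needs a reachable Pushgateway):
# push(group_url("http://localhost:9091", "nightly_import", {"instance": "batch01"}),
#      render(duration_seconds=7200.0, finished_at=1738050000.0))
```

Because the URL (the grouping key) and the metric names are identical
every day, each push replaces the previous day's values and nothing
ever needs to be deleted.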

> Question 2 - I want to define an email alert if any of batch fails. 
> although spring batch provide this metric but it is not working. So i 
> defined a Counter that is available to me in prometheus like this 
> --> app_job_status_total{status="FAILED"} 1. Problem is it always gives me 
> the same value. Using functions increase() or rate() does not help as well. 
> as the value of metric once set is not changing over the evaluated 
> interval. Please advice

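The Pushgateway is not a distributed counter. It only exposes the last
value that was pushed, so a counter that is pushed as 1 after every
failed run never appears to increase. If you have separate
metrics for each of your daily batch jobs, you could just have a gauge
that is 0 or 1 depending on success or failure, and have an alert
watching all those.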
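
As a rough sketch, the body each batch job pushes could look like the
output of the function below. The metric name batch_job_failed and the
job names are hypothetical, not something Spring Batch provides.

```python
def failure_gauge(failed: bool) -> str:
    """Exposition text for a 0/1 gauge: 1 if the run failed, 0 if it succeeded."""
    return (
        "# TYPE batch_job_failed gauge\n"
        f"batch_job_failed {1 if failed else 0}\n"
    )


# Hypothetical outcomes of today's runs; each job pushes to its own group.
outcomes = {"nightly_import": False, "nightly_export": True}
bodies = {job: failure_gauge(failed) for job, failed in outcomes.items()}
```

An alerting rule could then fire on an expression such as
`batch_job_failed == 1`, which needs neither rate() nor increase()
because the gauge itself carries the state of the last run.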

If you cannot avoid the "distributed counter" use case, you could try
a statsd setup and funnel the statsd metrics into Prometheus via the
statsd exporter. Or you could try out the prom-aggregation-gateway,
https://github.com/zapier/prom-aggregation-gateway
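
For the statsd route, a failed batch could report itself roughly like
this. The sketch assumes a statsd exporter listening for UDP on
127.0.0.1:9125 and uses a made-up metric name.

```python
import socket


def counter_line(name: str, amount: int = 1) -> str:
    """Format one statsd counter increment, e.g. 'app.job.failed:1|c'."""
    return f"{name}:{amount}|c"


def send_increment(name: str, host: str = "127.0.0.1", port: int = 9125) -> None:
    """Send a single counter increment as one UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(counter_line(name).encode("ascii"), (host, port))


# Example call from the failure handler of a batch job:
# send_increment("app.job.failed")
```

The exporter keeps the running total, so Prometheus scrapes a counter
that really does increase and rate() or increase() behave as expected.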

See also https://github.com/prometheus/pushgateway?tab=readme-ov-file#non-goals

-- 
Björn Rabenstein
[PGP-ID] 0x851C3DA17D748D03
[email] [email protected]
