On Fri, 21 Feb 2020 at 18:29, Jake Utley <jut...@hiya.com> wrote:

> We are exploring using the PushGateway to handle metric collection for
> batch jobs. I am wondering how we should operate the PushGateways in a
> reliable way.
>
> What do we do if the PushGateway is temporarily unavailable? Should we
> assume the client needs to retry?
>
> What happens if we push metrics to the PushGateway, but it crashes before
> it gets scraped?
>

Generally you should ensure that your alerts are resilient to one run of a
batch job failing, for whatever reason - including these.
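As a rough client-side illustration of the retry question, here is a minimal sketch using only the Python standard library. It assumes the job pushes with an HTTP PUT to the Pushgateway's /metrics/job/<job> endpoint. The helper names (push_text, push_with_retry), the attempt count and the backoff values are all hypothetical, not anything the Pushgateway prescribes. The point is that a bounded retry covers a brief outage, and a push that still fails is reported rather than treated as fatal, since the alerts are meant to tolerate a missed run anyway.

```python
import time
import urllib.request


def push_text(url, body, timeout=5.0):
    """PUT metrics in the text exposition format to a Pushgateway URL.

    Hypothetical helper: `url` is assumed to look like
    http://<host>:9091/metrics/job/<job_name>.
    """
    req = urllib.request.Request(url, data=body.encode("utf-8"), method="PUT")
    with urllib.request.urlopen(req, timeout=timeout):
        pass  # any non-error response counts as success


def push_with_retry(push, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call push() up to `attempts` times with exponential backoff.

    Returns True once a push succeeds and False if every attempt failed,
    so the batch job can log the failure instead of crashing on it.
    """
    for attempt in range(attempts):
        try:
            push()
            return True
        except OSError:  # urllib's URLError and socket timeouts are OSErrors
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return False
```

A job would call something like push_with_retry(lambda: push_text(url, body)) at the end of its run and log a warning when it returns False. Retrying does nothing for the second case (a crash between push and scrape), which is why the alerting side still has to tolerate a lost run.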

-- 
Brian Brazil
www.robustperception.io
