Hello. I'm trying to get to the bottom of two issues, one of which may be related to the other. I'm relatively new to Prometheus, alert rules, and the like.
We have a Prometheus server running version 2.17.2 with cAdvisor 0.35, and an older Prometheus server running version 2.8.0 with cAdvisor 0.33. Both monitor containers running on instances in AWS. The older server has no issues and has been running steadily for months.

The new server, which I set up last week, is randomly firing critical alerts for our containers, at least one per hour, each time for a different container. But whenever I check or refresh the targets page in Prometheus, the targets are never down, and the containers themselves have been up for days without restarting.

I have tried everything I know, including lengthening evaluation_interval to 60s and adjusting rules.alert.yml so an alert only triggers if a target has been down for 45 seconds, evaluated over a length of 10 seconds. The random alerts keep coming regardless.

I thought adjusting scrape_timeout might fix this, but every time I put a different value for scrape_timeout in the prometheus.yml file, the Prometheus service starts and then immediately crashes.

Can anyone offer any help or suggestions? I'm running out of ideas on what to tweak. Thanks.
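For reference, here's roughly the shape of the config section I've been editing (a sketch from memory with placeholder job and target names, not a verbatim copy of my file). One thing I've read, which might explain the crashes, is that Prometheus refuses to start if scrape_timeout is set greater than scrape_interval:

```yaml
# Sketch of the relevant parts of prometheus.yml (placeholder names/targets)
global:
  scrape_interval: 15s      # how often targets are scraped
  scrape_timeout: 10s       # the value I've been changing; must be <= scrape_interval
  evaluation_interval: 60s  # lengthened from the default while troubleshooting

scrape_configs:
  - job_name: 'cadvisor'            # placeholder job name
    static_configs:
      - targets: ['10.0.0.1:8080']  # placeholder instance address
```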