Hi,

On 8/20/20 12:47 PM, 'azha...@googlemail.com' via Prometheus Users wrote:
> I have 2 alerts
>
> - The first being to fire if CPU is more then 70% (WMI)
> - The second to report whether an instance is down
>
> 100 - (avg by(instance) (rate(wmi_cpu_time_total{mode="idle"}[2m])) * 100) > 70
> <http://192.168.1.67:9090/graph?g0.expr=100+-+%28avg+by%28instance%29+%28rate%28wmi_cpu_time_total%7Bmode%3D%22idle%22%7D%5B2m%5D%29%29+%2A+100%29+%3E+70&g0.tab=1>
>
> up == 0 <http://192.168.1.67:9090/graph?g0.expr=up+%3D%3D+0&g0.tab=1>
>
> Post generating a CPU spike i can confirm that my client CPU is indeed 100%
>
> @echo off
> :loop
> goto loop
>
> however i get the second alert (up==0) firing and reporting the
> instance is down despite it not being down. The strange thing is this is
> intermittent behavior as occasionally I do get the CPU firing alert
> instead of the instance down alert.
>
> Im just wondering why when the CPU is clearly maxed out at 100% the
> instance is reporting as down... and why sometimes this isn't the case.

So you are getting the Instance Down alert instead of the High CPU alert?

The up metric is special. It is generated by Prometheus itself and always exists for anything that is a scrape target. The fact that your CPU alert does not fire while up == 0 does probably indicates that Prometheus fails to receive metrics from your wmi_exporter. Without the metrics from the exporter, the CPU expression has no data to evaluate, so only the Instance Down alert can fire.

We can only speculate why that is. Maybe the load is so high that the scrape times out? You can check the Targets page in the Prometheus web UI to see the last scrape error for your target. If it is indeed a timeout ("deadline exceeded"), you could try increasing the scrape_timeout option to make Prometheus wait longer for the exporter to reply.

Side note: If I remember correctly, the wmi_exporter has been renamed to windows_exporter (along with the metrics). This might mean that you are running an older version. Maybe updating helps if the newer version is more performant (I don't know, just guessing).

Kind regards,
Christian
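
For illustration, the scrape_timeout suggestion above might look roughly like the fragment below in prometheus.yml. This is only a sketch: the job name, the interval and timeout values, and the target address are placeholders and not taken from your setup, and port 9182 assumes the exporter is listening on its default port. Keep in mind that scrape_timeout must not be larger than scrape_interval, otherwise Prometheus rejects the configuration.

```yaml
# prometheus.yml (fragment). All names and values here are illustrative.
scrape_configs:
  - job_name: "wmi"                    # hypothetical job name
    scrape_interval: 30s               # how often the target is scraped
    scrape_timeout: 25s                # default is 10s; must not exceed scrape_interval
    static_configs:
      - targets: ["windows-host:9182"] # placeholder host; 9182 assumed as exporter port
```

To judge whether a longer timeout is likely to help, it may be worth graphing the scrape_duration_seconds metric for the target, which Prometheus records for every scrape, and comparing it with the configured timeout.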