You are looking at the graph over a 4-week time window, so a short burst of
up == 0 will not be visible. For example, if the graph had 1000 points on the
X axis, there would be only one sample roughly every 40 minutes.
Zoom in to a shorter time window, or use a query like this:
min_over_time(up[4h]) == 0
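
To make the arithmetic concrete, here is a rough Python sketch. It is a
simplified model of how a graph samples a series, not Prometheus code. The
15-second scrape interval, the 2-minute run of failed scrapes and all function
names are made up for illustration.

```python
from datetime import timedelta

# 4 weeks drawn with 1000 points: spacing between graph samples.
step = timedelta(weeks=4) / 1000
step_s = step.total_seconds()
step_minutes = step_s / 60  # about 40.32 minutes

# Hypothetical numbers: one scrape every 15s, and scrapes that fail
# for 2 minutes starting 9h45m into the window.
SCRAPE_INTERVAL = 15
OUTAGE_START = 9 * 3600 + 45 * 60
OUTAGE_END = OUTAGE_START + 120


def up_at(t: int) -> int:
    """Value of `up` from the most recent scrape at or before second t."""
    last_scrape = t - (t % SCRAPE_INTERVAL)
    return 0 if OUTAGE_START <= last_scrape < OUTAGE_END else 1


def min_over_time(t: int, range_s: int) -> int:
    """Minimum of all scrapes in (t - range_s, t], like min_over_time(up[range])."""
    first = max(0, t - range_s + 1)
    first_scrape = first + (-first % SCRAPE_INTERVAL)
    return min(up_at(s) for s in range(first_scrape, t + 1, SCRAPE_INTERVAL))


sample_times = [int(i * step_s) for i in range(1000)]

# Plain `up`: each graph point only sees the latest scrape.
raw = [up_at(t) for t in sample_times]

# `min_over_time(up[4h])`: each graph point sees every scrape in the last 4h.
smoothed = [min_over_time(t, 4 * 3600) for t in sample_times]

print("graph step in minutes:", round(step_minutes, 2))
print("plain up shows a 0:", 0 in raw)
print("min_over_time shows a 0:", 0 in smoothed)
```

In this model the failed scrapes fall between two graph points, so the plain
series never shows a 0, while the 4h window is wider than the graph step and
therefore cannot miss them.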
> plus in reality Instance wasn't Down.
You didn't show the alerting rule, but if it was "up == 0", then I can
assure you that up == 0 was recorded at least once - Prometheus doesn't lie.
That means a scrape failed. It doesn't necessarily mean the target was
down: it could also be a network problem between Prometheus and the target,
or a resource limitation that prevented the target from responding.
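
One way to look for the failed scrape is to query a narrow window around the
alert's start time instead of the 4-week graph. The sketch below only builds
the URL for the Prometheus HTTP API range query endpoint
(/api/v1/query_range). The host and timestamp are copied from the
generatorURL and startsAt fields of the alert quoted below; the 10-minute
padding, the 15s step and the helper name are assumptions, so adjust them to
your scrape interval.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Taken from the quoted alert: generatorURL host and startsAt.
PROMETHEUS = "http://XYZ-container-pr:9091"
STARTS_AT = datetime(2025, 3, 23, 11, 52, 1, 891000, tzinfo=timezone.utc)


def range_query_url(base, expr, center, pad=timedelta(minutes=10), step="15s"):
    """Build a query_range URL covering [center - pad, center + pad].

    Hypothetical helper: pad and step are illustrative defaults only.
    """
    params = {
        "query": expr,
        "start": (center - pad).timestamp(),  # Unix seconds
        "end": (center + pad).timestamp(),
        "step": step,
    }
    return f"{base}/api/v1/query_range?{urlencode(params)}"


url = range_query_url(PROMETHEUS, "up == 0", STARTS_AT)
print(url)
```

Open the printed URL in a browser or fetch it with curl. Any series returned
for that window identifies a target whose scrape failed around the time the
alert started firing.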
On Monday, 24 March 2025 at 12:17:44 UTC Amol Nagotkar wrote:
> Hi all,
>
> I got this alert from Alertmanager.
> [
>   {
>     "status": "firing",
>     "labels": {
>       "alertname": "Instance Down",
>       "instance": "IP_ADDRESS:15692",
>       "job": "XYZ",
>       "severity": "Critical"
>     },
>     "annotations": {
>       "description": "IP_ADDRESS:15692 of job XYZ has been down for more than 30 sec.",
>       "summary": "Endpoint IP_ADDRESS:15692 down"
>     },
>     "startsAt": "2025-03-23T11:52:01.891Z",
>     "endsAt": "0001-01-01T00:00:00Z",
>     "generatorURL": "http://XYZ-container-pr:9091/graph?g0.expr=up+%3D%3D+0&g0.tab=1",
>     "fingerprint": "bf26da1f76eab5ca"
>   }
> ]
>
> [image: Screenshot from 2025-03-24 17-19-16.png]
> But if you look at the graph, there are no alerts. Plus in reality the
> instance wasn't down.
> How can I debug/solve this?
>
> Thank you in advance.
>