In case of any issue with storage or networking within Prometheus that causes 
metrics not to be scraped successfully, we would like to identify such a 
scenario using a specific metric that can fire an alert indicating that 
metrics are not being received.

On Friday, October 6, 2023 at 7:13:59 PM UTC+5:30 hartfordfive wrote:

> If you're looking to determine if a target is reachable or not, you could 
> use the "*up*" metric which is automatically added to the scrape of a 
> given target (see docs 
> <https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series>).
>   
> The alerting condition could look something like this:
>
> alert: TargetIsUnreachable
> expr: up == 0
> for: 3m
> labels:
>   severity: warning
> annotations:
>   title: Instance {{ $labels.instance }} is unreachable
>   description: Prometheus is unable to scrape {{ $labels.instance }}. This
>     could indicate the target being down or a network issue.
>
>
> This will trigger the alert if the "*up*" metric is continuously equal to 
> 0 (in other words, the instance is unreachable) for a period of 3 minutes. 
> The value of the "*for*" parameter should probably be at least 2 to 3 
> times your scrape_interval setting (see docs for reference 
> <https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config>). 
> It's often advised to add the "*for*" parameter to alerting conditions 
> to avoid noise from flapping alerts: you wouldn't necessarily want to be 
> notified if a single scrape fails, say due to a transient network 
> connectivity problem. There is also the "*absent*" function (see docs 
> <https://prometheus.io/docs/prometheus/latest/querying/functions/#absent>), 
> which you can use to determine whether series (aka samples) exist for a 
> given metric name and label combination. You would use that in cases where 
> you want to be notified if a given metric disappears, for example because 
> the target itself disappeared from service discovery 
> <https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config>.
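>
> For example, a minimal sketch of an absent()-based rule. The metric name 
> and job label here are just illustrations (node_cpu_seconds_total as 
> exposed by node_exporter); substitute whatever metric you care about:
>
> ```yaml
> alert: MetricIsMissing
> # Fires when no series exist at all for this metric/label combination,
> # which up == 0 alone would not catch if the target vanished entirely.
> expr: absent(node_cpu_seconds_total{job="node"})
> for: 5m
> labels:
>   severity: warning
> annotations:
>   title: Metric node_cpu_seconds_total is missing
>   description: No series exist for node_cpu_seconds_total with job="node".
>     The target may have disappeared from service discovery.
> ```
>
> Note that absent() returns an empty result when the series exists, so the 
> alert only produces a value (and can fire) while the metric is gone.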
>
> As for determining if there is an actual problem with Prometheus itself, 
> that can vary depending on the issue but here's a good list of known 
> alerting conditions that you can use to monitor the state of Prometheus 
> instances:
>
> https://samber.github.io/awesome-prometheus-alerts/rules.html#prometheus-self-monitoring
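>
> As one sketch in the spirit of that list, a rule that fires when a 
> Prometheus instance stops ingesting samples, using the built-in 
> prometheus_tsdb_head_samples_appended_total counter (thresholds here are 
> illustrative, tune them to your setup):
>
> ```yaml
> alert: PrometheusNotIngestingSamples
> # A sustained zero ingestion rate suggests scrape or storage problems
> # on the Prometheus server itself, not just a single target being down.
> expr: rate(prometheus_tsdb_head_samples_appended_total[5m]) <= 0
> for: 10m
> labels:
>   severity: critical
> annotations:
>   title: Prometheus {{ $labels.instance }} is not ingesting samples
>   description: The sample ingestion rate has been zero for 10 minutes,
>     which may indicate a scrape or storage issue.
> ```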
>
>
> On Wednesday, October 4, 2023 at 10:59:23 PM UTC-4 sri L wrote:
>
>> Hi all,
>>
>> Can anyone please suggest an alert expression for configuring an alert 
>> rule for the condition below?
>>
>> "Metric data is not being received by Prometheus; alert that there is an 
>> issue with Prometheus and it is unable to scrape."
>>
>> Thanks
>>

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/prometheus-users/8bbcea86-8cee-4121-a737-db9577e3f4f9n%40googlegroups.com.