https://github.com/cloudflare/pint can try to estimate the number of times 
an alert would trigger, if you configure it to do so; 
see the example at 
https://github.com/cloudflare/pint/blob/main/docs/examples/config.hcl#L58

It's not exactly what you're looking for, but it can be useful for finding 
alerts that would fire too often, or that would never fire when you know 
they should.
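
For the feedback loop described in the quoted message, here is a rough 
sketch of the analytics side, assuming each button click ends up stored as 
an (alertname, is_true_positive) pair. The function names and the record 
shape are made up for illustration; they are not part of pint, Prometheus 
or Alertmanager.

```python
from collections import Counter


def precision_by_alert(feedback):
    """Return {alertname: precision} from human classifications.

    `feedback` is an iterable of (alertname, is_true_positive) pairs,
    a hypothetical record shape for what the button click would store.
    """
    true_positives = Counter()
    total = Counter()
    for alertname, is_true_positive in feedback:
        total[alertname] += 1
        if is_true_positive:
            true_positives[alertname] += 1
    # precision = true positives / all classified firings of that alert
    return {name: true_positives[name] / total[name] for name in total}


def noisiest_first(feedback):
    """Sort alerts by ascending precision, so the noisiest come first."""
    scores = precision_by_alert(feedback)
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    clicks = [
        ("HighLatency", True),
        ("HighLatency", False),
        ("HighLatency", False),
        ("DiskFull", True),
    ]
    for name, precision in noisiest_first(clicks):
        print(f"{name}: precision={precision:.2f}")
```

Note that this only gives you precision. Recall needs a record of incidents 
where no alert fired, and a button in the alert body can't capture those.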

On Tuesday, 28 September 2021 at 09:20:31 UTC+1 [email protected] wrote:

> I'm thinking about ways we can reduce noisy alerts. One of the problems is 
> it's tricky to tweak alert thresholds without any data on the precision and 
> recall of the alert. It's a non-trivial problem to get this data because a 
> human is typically required to classify an alert as a true positive or a 
> false positive. This makes it hard to fully automate gathering this data. I 
> am considering whether there is a way of obtaining this data using a hybrid 
> approach: a human classifies an alert as a true positive or false 
> positive, for example via a button in the alert body (e.g. in Slack or 
> PagerDuty), and this gets sent to an analytics database which we can later 
> use to prioritise which alert thresholds need tweaking.
>
> My question is, is there any precedent for this kind of system in 
> the Prometheus/Alertmanager ecosystem? i.e. open source software that does 
> this out of the box, or experience report blog posts?
>
> Many thanks,
> Will
>
