Unfortunately this won't work for us, for several reasons:

* Our environment is highly dynamic; we take servers in and out of production frequently.
* We would then need an aggregator (such as Thanos Query) if we want to see our data in one place.
* We also want to avoid, as much as we can, time series that appear and disappear.
On Fri, Feb 19, 2021 at 10:48 PM Stuart Clark <stuart.cl...@jahingo.com> wrote:

> On 19/02/2021 19:43, Badreddin Aboubakr wrote:
> > Hello,
> >
> > We use Prometheus to monitor our infrastructure (hypervisors, gateways,
> > storage servers, etc.). Scrape targets are sourced from a Postgres
> > database, which also records whether each target is "in production".
> > In the beginning we had a metadata metric which exposed the state of
> > the server as an enum metric.
> >
> > By joining the state metric in each alerting rule and then dropping
> > alerts with a specific state, we were able to suppress unneeded alerts.
> >
> > As the number of alerting rules and states grew, joining on these
> > metrics in every alerting rule became so expensive that we wrote
> > recording rules which evaluate the enum metric and produce an enum
> > metric with lower cardinality: "production" (where alerts shall pass
> > to their receivers) and everything else (to be dropped at the
> > Alertmanager step). So again we join on these metrics and drop alerts
> > in a non-production state.
> >
> > That is not going to scale either, but it was a temporary solution
> > while our alerting rules keep growing.
> >
> > So we discussed some solutions:
> >
> > * Set silences via the Alertmanager API and remove them on state
> >   change. This approach is very dynamic, however (I don't know if the
> >   Alertmanager API was designed for this purpose). Will that scale
> >   with the number of silences and hosts?
> >
> > * Develop a kind of proxy deployed between Prometheus and
> >   Alertmanager which drops alerts for hosts in a non-production
> >   state. This approach is dangerous: if the proxy fails, no alerts
> >   reach Alertmanager.
> >
> > * Put the proxy on the notification path. This makes things a bit
> >   more complicated, as the proxy has to understand receivers, etc.
> >
> > PS: We still want to scrape and monitor the servers which are not in
> > production state.
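For context, the join described above looks roughly like this (metric and label names here are made up for illustration, not our real ones):

```yaml
groups:
  - name: example-alerts
    rules:
      - alert: HighCPU
        # The on(instance) join keeps only series whose instance also has
        # a server_state{state="production"} series with value 1; alerts
        # for hosts in any other state are silently dropped.
        expr: |
          (instance:cpu_usage:ratio > 0.9)
            * on (instance) group_left ()
          server_state{state="production"} == 1
        for: 10m
```

Every alerting rule needs this extra join clause, which is what becomes expensive as rules multiply.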
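The cardinality-reducing recording rule mentioned above can be sketched like this (again, the metric names are hypothetical):

```yaml
groups:
  - name: server-state
    rules:
      # Collapse the many-valued enum metric into a single binary
      # "in production" series per instance, so alerting rules only ever
      # join against one low-cardinality metric.
      - record: instance:server_in_production:bool
        expr: max by (instance) (server_state{state="production"})
```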
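For the silence-based idea, creating a silence via Alertmanager's v2 API could look like the sketch below (the Alertmanager URL, instance name, and `createdBy` value are placeholders, not part of our setup):

```python
import json
from datetime import datetime, timedelta, timezone

def build_silence(instance: str, hours: int = 24) -> dict:
    """Build a request body for Alertmanager's POST /api/v2/silences
    endpoint, silencing all alerts carrying the given instance label."""
    now = datetime.now(timezone.utc)
    return {
        "matchers": [
            {"name": "instance", "value": instance, "isRegex": False}
        ],
        "startsAt": now.isoformat(),
        "endsAt": (now + timedelta(hours=hours)).isoformat(),
        "createdBy": "state-sync-job",
        "comment": "host taken out of production",
    }

payload = json.dumps(build_silence("hypervisor-01:9100"))
# e.g. requests.post("http://alertmanager:9093/api/v2/silences",
#                    data=payload,
#                    headers={"Content-Type": "application/json"})
```

The open question remains how well this scales: a job would have to create a silence on every state change and expire or delete it again when the host returns to production.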
> > We will be really thankful for any suggestions or ideas.
>
> Couldn't you run two sets of Prometheus servers, monitoring the
> production infrastructure separately from the non-production? Then just
> don't have alerting rules on (or Alertmanagers connected to) the
> non-production servers.
>
> --
> Stuart Clark

--
Badreddin Aboubakr
GAPS (IONOS Cloud)
1&1 IONOS SE | Greifswalder Str. 207 | 10405 Berlin | Germany
E-mail: badreddin.aboub...@cloud.ionos.com | Web: www.ionos.de

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/CALJcnC-JBzKXwnw3fSZBB%3Dyh4TiyijQukNBtuBObDaBYsbzb5Q%40mail.gmail.com.