> Honestly, most of what you want is stuff we could support in Alertmanager
> without a lot of trouble. And are things that other users would want as
> well. Rather than build a whole new system, why not contribute improvements
> directly to the Alertmanager.
That's a very good point and something I
On Mon, Nov 22, 2021 at 4:03 PM Tony Di Nucci wrote:
> Thanks for the feedback Stuart, I really appreciate you taking the time
> and you've given me reason to pause and reconsider my options.
>
> I fully understand your concerns over having a new data store. I'm not
> sure that AlertManager and
Thanks for the feedback Stuart, I really appreciate you taking the time and
you've given me reason to pause and reconsider my options.
I fully understand your concerns over having a new data store. I'm not
sure that AlertManager and Prometheus contain the state I need though and
I'm not sure
On 20/11/2021 23:42, Tony Di Nucci wrote:
> Yes, the diagram is a bit of a simplification but not hugely.
> There may be multiple instances of AlertRouter however they will share
> a database. Most likely things will be kept simple (at least
> initially) where each instance holds no state of its
Yes, the diagram is a bit of a simplification but not hugely.
There may be multiple instances of AlertRouter however they will share a
database. Most likely things will be kept simple (at least initially)
where each instance holds no state of its own. Each active alert in the DB
will be
It sounds like you are planning on creating a fairly complex system that
duplicates a reasonable amount of what Alertmanager already does. I'm presuming
your diagram is a simplification and that the application is itself a cluster,
so each instance would be querying each instance of
There are other things I need to do as well, alert enrichment, complex
routing, etc. which means that I think some additional system is needed
between AlertManager and the final destination in any case.
The main question in my mind is really: are there reasons why I should
prefer to have
Thanks for the feedback.
> What gives you the impression that the Alertmanager is "best effort"?
Sorry, best-effort probably wasn't the right term to use. I am aware of
there being retries; however, these could still all fail and I'm thinking I
wouldn't be made aware of the issue for potentially
Also, the alertmanager does have an "event store", it's a shared state
between all instances.
If you're interested in changing some of the behavior of the retry
mechanisms or how this works, feel free to open specific issues. You don't
need to build an entirely new system, we can add new features
What gives you the impression that the Alertmanager is "best effort"?
The alertmanager provides a reasonably robust HA solution (gossip
clustering). The only thing best-effort here is actually deduplication. The
Alertmanager design is "at least once" delivery, so it's robust against
network
Cross-posted from
https://discuss.prometheus.io/t/is-this-alerting-architecture-crazy/610
In relation to alerting, I’m looking for a way to get strong alert delivery
guarantees (and if delivery is not possible I want to know about it
quickly).
Unless I’m mistaken AlertManager only offers