I have a fairly simple setup with a handful of alerts from a specific host.
If that host is down, I want to suppress these alerts. I'm able to do that
with a straightforward inhibit rule:

- source_match:
    alertname: 'HostMissing'
  equal: ['instance']

The problem comes when the host comes back up. The previously inhibited
alerts keep firing for another minute or two, but by then the inhibit is
gone, so they send notifications immediately.

Is there a way I can say "inhibit these alerts if the HostMissing alert is
firing, or has been firing within the past N min"?

Or, for bonus points, is it possible to say "hold these alerts in case the
inhibiting alert begins firing within the next N min"? I realize this would
generally delay notifications for these alerts by N min, but it would give
nice symmetry with the "within N min before" case.

In some sense, this is the opposite of alertmanager pull#1331. If an
inhibit is defined as "within N min before", the inhibitor would not update
its cache of alerts until N min have passed.
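
To make the "within N min before" semantics concrete, here is a rough
sketch in Python. It is only a toy model of the behavior I'm asking about,
not Alertmanager code: the class, its methods, and the linger_seconds knob
are all hypothetical names, and timestamps are plain seconds.

```python
from dataclasses import dataclass, field


@dataclass
class LingeringInhibitor:
    """Toy model: a source alert keeps inhibiting for a while after it resolves."""

    linger_seconds: float  # the "N min" window, in seconds (hypothetical knob)
    # instance label -> time the source alert was last seen firing
    _last_firing: dict = field(default_factory=dict)

    def observe_source(self, instance: str, firing: bool, now: float) -> None:
        # Refresh the timestamp only while the source alert is firing.
        # A resolved source is deliberately not evicted here; it simply
        # ages out when is_inhibited() compares against the window.
        if firing:
            self._last_firing[instance] = now

    def is_inhibited(self, instance: str, now: float) -> bool:
        last = self._last_firing.get(instance)
        if last is None:
            return False  # source alert never fired for this instance
        return now - last <= self.linger_seconds


inhibitor = LingeringInhibitor(linger_seconds=300)
inhibitor.observe_source("host1:9100", firing=True, now=0)
inhibitor.observe_source("host1:9100", firing=False, now=60)  # host is back
still_muted = inhibitor.is_inhibited("host1:9100", now=120)   # inside window
muted_later = inhibitor.is_inhibited("host1:9100", now=400)   # window passed
```

In this sketch the window is measured from the last time the source alert
was seen firing, so target alerts for the same instance stay muted for
linger_seconds after HostMissing clears.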
