Dear Icinga users,

sorry for the misleading topic, but I couldn't figure out a better name for
this feature.

Is there a way in Icinga to escalate a host/service check?

Imagine this: the availability of host A is checked with the command
check-host. When check-host fails for the first time, Icinga re-checks
host A at every retry_interval and puts host A into a hard failed
(critical) state after m failed checks (max_check_attempts).
I would be interested in a solution which, once max_check_attempts is
reached, "escalates" the host/service check and tries an alternate
host/service check.

For example: I'm running check_disk on every host I'm monitoring with
Icinga, most of which have file systems mounted over NFS. If the NFS server
is shut down or unreachable, the file system becomes inaccessible on the
host mounting it, and check_disk times out after (the default) 10 seconds.
In this case I don't know exactly why NRPE timed out. Was it because one or
more NFS file systems were unavailable, or because the whole server was
lagging or disconnected from the network?

It would be a great feature if Icinga were able to do the following:
execute check_foo
if check_foo fails then
    x = 1
    while (x < m) and check_foo still fails do
        retry check_foo
        x++
    if x == m and check_foo still fails then
        execute check_bar

where check_bar != check_foo and m = max_check_attempts
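
To make the idea more concrete, here is a rough Python sketch of a wrapper
that does this outside of Icinga. It is only an illustration under
assumptions: escalating_check, run_check and the runner parameter are names
I made up and not anything Icinga provides, the retries run back to back
without waiting for retry_interval, and the usual plugin exit codes
(0 = OK, 2 = CRITICAL, 3 = UNKNOWN) are assumed.

```python
import subprocess

# Conventional plugin exit codes (assumed here).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_check(argv, timeout=10):
    """Run one plugin command; return (exit_code, first line of output)."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        return UNKNOWN, "timed out after %s seconds" % timeout
    except OSError as exc:
        return UNKNOWN, "cannot execute %s: %s" % (argv[0], exc)
    output = (proc.stdout or proc.stderr).strip()
    first_line = output.splitlines()[0] if output else ""
    return proc.returncode, first_line


def escalating_check(check_foo, check_bar, max_attempts=3, runner=run_check):
    """Try check_foo up to max_attempts times; if every attempt fails,
    run check_bar once and append its output to the failure message."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for _ in range(max_attempts):
        code, out = runner(check_foo)
        if code == OK:
            return code, out
    # All attempts failed: escalate to the alternate check.
    _, bar_out = runner(check_bar)
    return code, "%s (alternate check: %s)" % (out, bar_out)
```

The wrapper keeps the exit code of the failed check_foo and only uses
check_bar to add diagnostic text, so the service state itself would still
reflect the original check. In the NFS case, check_bar could be something
that tells a hung mount apart from an unreachable server.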

Thank you for any ideas and comments!

Regards,
Ferenc STELCZ
