Hi all,
this is a sort of feature request, so I hope this is the right place to post
it.
I manage - or help to manage - a bunch of sites monitored with nagios/icinga.
Very often I'm asked if there's a way to prevent a check from going back to OK
state after it went warning/critical. This is mainly because volatile
services (read: log checkers) typically won't stay in a non-OK state for a long
time (that's the reason why they're volatile, after all...), so the problem
often goes unseen.
There are some workarounds to this situation, but they're always left to the
probe (probes like mk_logwatch handle this problem, for example) or to
notification methods, and many sites just rely on the console-staring method to
get the alerts.
What I'm thinking about is a property for the service to specify that it can't
go back to a lower state, for example:
service_is_persistent 1
persistent_states w,c,u
means that the transitions allowed are:
OK --> WARNING, CRITICAL, UNKNOWN
WARNING --> CRITICAL, UNKNOWN
CRITICAL --> UNKNOWN
but not in the opposite direction; excluding UNKNOWN from the persistence:
persistent_states w,c
the allowed transitions would be:
OK --> WARNING, CRITICAL, UNKNOWN
WARNING --> CRITICAL, UNKNOWN
CRITICAL --> UNKNOWN
UNKNOWN --> CRITICAL, WARNING, OK (but never below the worst state seen
before UNKNOWN; e.g.:
CRITICAL --> UNKNOWN --> CRITICAL allowed
WARNING --> UNKNOWN --> CRITICAL, WARNING allowed
OK --> UNKNOWN --> CRITICAL, WARNING, OK allowed
CRITICAL --> UNKNOWN --> WARNING, OK forbidden
WARNING --> UNKNOWN --> OK forbidden
)
An acknowledgement of the problem should be required to reset the state back
to OK (or a passive check - a manual "OK" insert - but this would prevent pure
passive checks from being defined as persistent).
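To make the proposed rules concrete, here is a minimal sketch of how they could behave. It is not existing nagios/icinga code: the class and method names are hypothetical, the severity ordering OK < WARNING < CRITICAL < UNKNOWN is taken from the transition lists above, and it assumes that a forbidden check result is simply discarded, leaving the current state unchanged.

```python
# Sketch of the proposed "persistent states" rule. All names are hypothetical.
# Severity ordering assumed from the transition lists in the proposal.
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}
FLAGS = {"w": "WARNING", "c": "CRITICAL", "u": "UNKNOWN"}


def parse_persistent_states(value):
    """Turn a value such as "w,c,u" into a set of state names."""
    return {FLAGS[flag.strip()] for flag in value.split(",") if flag.strip()}


class PersistentService:
    def __init__(self, persistent_states):
        self.persistent = parse_persistent_states(persistent_states)
        self.state = "OK"
        self.floor = "OK"  # worst persistent state seen so far

    def is_allowed(self, new_state):
        # A result is allowed only if it is not below the floor.
        return SEVERITY[new_state] >= SEVERITY[self.floor]

    def submit(self, new_state):
        """Apply a check result; return False if the transition is forbidden."""
        if not self.is_allowed(new_state):
            return False  # assumption: forbidden results are discarded
        self.state = new_state
        # Only persistent states raise the floor.
        if new_state in self.persistent and SEVERITY[new_state] > SEVERITY[self.floor]:
            self.floor = new_state
        return True

    def acknowledge(self):
        # An acknowledgement resets the service back to OK.
        self.state = "OK"
        self.floor = "OK"


if __name__ == "__main__":
    svc = PersistentService("w,c")
    for result in ["CRITICAL", "UNKNOWN", "WARNING", "CRITICAL"]:
        print(result, "allowed" if svc.submit(result) else "forbidden")
    svc.acknowledge()
    print("after acknowledge:", svc.state)
```

With persistent_states w,c the floor stays at CRITICAL while the service is UNKNOWN, so a later WARNING or OK result is refused until someone acknowledges the problem.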
I know a lot of people who think this could be useful, and I hope some of
you agree. Is this feasible?
Thanks,
regards.
Giacomo
_______________________________________________
icinga-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/icinga-devel