On Thu, 19 May 2005 13:12:59 -0700 I wrote:
>That is, assuming A and B have the same test interval, there is a 50%
>chance that a B failure will not have been detected in time to
>suppress an A failure alert.
>Right?
I see from several old discussions that this is the case.
I think I have a simple method to eliminate the "missed dependent
failure" problem, which I did not see discussed:
Just use the "alertafter" tag in the period section to require more
consecutive failures for the dependent service than for its
dependency. For example, if both use the same interval, make the
dependent service (e.g. http) use alertafter 2 and the service it
depends on (e.g. ping) use the default. Of course this costs you
something: either the dependent service's alert arrives one interval
later, or you burn more CPU by shortening its interval to compensate.
But it's simple.
Anybody using this?
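To check my reasoning I put together a rough simulation. It is only a
sketch under my own assumptions, not a model of mon's real scheduler:
both services break at the same instant, tests take no time, each
service's next test falls at an independent random point within the
interval, and an alert is suppressed whenever the dependency has
already been seen failing. The function name missed_fraction and its
parameters are mine.

```python
import random

def missed_fraction(alert_after, interval=300.0, trials=100_000, seed=1):
    """Fraction of outages where the dependent service alerts before
    the dependency's failure has been detected (hypothetical model)."""
    rng = random.Random(seed)
    missed = 0
    for _ in range(trials):
        # Delay from the outage to each service's next scheduled test,
        # uniform in [0, interval) because the phases are assumed random.
        dep_first_fail = rng.random() * interval  # dependency, e.g. ping
        svc_first_fail = rng.random() * interval  # dependent, e.g. http
        # The dependent service alerts on its alert_after-th
        # consecutive failed test.
        svc_alert_time = svc_first_fail + (alert_after - 1) * interval
        # Suppression fails if the dependency has not been tested yet.
        if dep_first_fail > svc_alert_time:
            missed += 1
    return missed / trials

if __name__ == "__main__":
    for n in (1, 2):
        print(f"alertafter {n}: missed {missed_fraction(n):.1%}")
```

Under those assumptions alertafter 1 misses about half the time, and
alertafter 2 never misses, because the second failed test comes at
least one full interval after the outage and the dependency is always
tested within that window.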
Cheers,
Michael