On Thu, 19 May 2005 13:12:59 -0700 I wrote:
>That is, assuming A and B have the same test interval, there is a 50%
>chance that a B failure will not have been detected in time to suppress
>the A failure alert.
>Right?

I see from several old discussions that this is the case.

I think I have a simple method to eliminate the "missed dependent
failure", one which I did not see discussed: just use the "alertafter"
tag in the period section to require more successive failures for the
dependent service than for its dependency. For example, if both use the
same interval, make the dependent service (e.g. http) use alertafter 2
and the service it depends on (e.g. ping) use the default.

Of course this costs you either more CPU or more alert latency, but
it's simple.

Anybody using this?

Cheers,
Michael

_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon
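
P.S. A rough sketch of the timing argument above, as a toy Python model rather than anything mon itself runs. The assumptions are mine: both services share one interval (60 seconds here), their test schedules have independent random phases, tests take no time to run, and the name missed_fraction is made up for illustration.

```python
import random


def missed_fraction(alertafter, interval=60.0, trials=100_000, seed=1):
    """Fraction of outages where the dependent service (e.g. http) would
    alert before the failure of its dependency (e.g. ping) is detected.

    Toy model only: independent random test phases, instantaneous tests.
    """
    rng = random.Random(seed)
    missed = 0
    for _ in range(trials):
        # Delay from the start of the outage to each service's next test.
        dependency_detected = rng.uniform(0, interval)
        first_failure = rng.uniform(0, interval)
        # The dependent service alerts only once it has seen
        # `alertafter` consecutive failures, one per interval.
        alert_time = first_failure + (alertafter - 1) * interval
        if alert_time < dependency_detected:
            missed += 1
    return missed / trials


if __name__ == "__main__":
    print("alertafter 1:", missed_fraction(1))
    print("alertafter 2:", missed_fraction(2))
```

Under these assumptions the default comes out at roughly one half, and alertafter 2 never alerts early, because the second failure lands a full interval after the first and the dependency is always tested within one interval. Real monitors take time to run, so treat this as an illustration of the idea, not a guarantee.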