On 9/8/06, Bill Chmura <[EMAIL PROTECTED]> wrote:
> I've recently spent a lot of time overhauling my mon.cf.  I moved to m4
> macros which I had been meaning to try (I recommend them to anyone who
> has not tried them for mon.cf).
>
(Note to self: I really need to put together a public release of the
system we use at CMU for maintaining our mon config file.  It's a
complete database-driven web app for maintaining a large mon config...)

> Basically, I was thinking for a few services that are touchy to have the
> system regularly test every 30 minutes.  But if it has a failure to test
> every minute.  Then issue an alert if it fails 5 times in one minute.

Is that a typo?  How can it fail 5 times in one minute if you're only
testing every minute?

Since you didn't include a mon.cf snippet, I'll have to guess a bit here
about what's going on.  I suspect you're trying to describe something like:

    ...
    interval 30m
    failure_interval 10s
    period ....
        alertafter 5 1m
    ....

I think you're trying to use the two-argument form of alertafter in a
way other than the intent.  The two-argument form is meant to detect
intermittent failures, e.g. 'alertafter 2 6h' would alert if a service
fails twice in six hours.  In the case of an intermittent failure, a
single failure would only result in two tests at the faster test rate
before returning to the regular test rate.

For what you're describing, I think you want either 'alertafter 5' (i.e.
5 consecutive failures) or 'alertafter 50s' (i.e. alert when a service
has failed every test for 50 seconds).

-David
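For illustration, here is a minimal sketch of what a complete stanza using the suggested 'alertafter 5' form might look like. Bill's actual mon.cf was never posted, so the hostgroup name, host names, service name, monitor, period label, and alert address below are all hypothetical placeholders. Only interval, failure_interval, and alertafter come from the discussion above.

```
# Hypothetical example; the group, hosts, service, monitor, and address
# are placeholders, not taken from the original config.
hostgroup touchy host1.example.com host2.example.com

watch touchy
    service ping
        interval 30m             # regular test rate
        failure_interval 10s     # faster retest rate once a test has failed
        monitor fping.monitor
        period wd {Sun-Sat}
            alert mail.alert admin@example.com
            alertafter 5         # alert after 5 consecutive failures
            # Time-based alternative (use one form or the other):
            # alertafter 50s     # alert once every test has failed for 50 seconds
```

The two-argument form ('alertafter 5 1m') is left out on purpose, since it is aimed at intermittent failures over a longer window rather than a run of consecutive ones.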