On 9/8/06, Bill Chmura <[EMAIL PROTECTED]> wrote:
> I've recently spent a lot of time overhauling my mon.cf.  I moved to m4
> macros which I had been meaning to try (I recommend them to anyone who
> has not tried them for mon.cf).
>

(Note to self: I really need to put together a public release of the
system we use at CMU for maintaining our mon config file.  It's a
complete database driven web app for maintaining a large mon
config...)

> Basically, I was thinking for a few services that are touchy to have the
> system regularly test every 30 minutes.  But if it has a failure to test
> every minute.  Then issue an alert if it fails 5 times in one minute.

Is that a typo?  How can it fail 5 times in one minute if you're only
testing once every minute?

Since you didn't include a mon.cf snippet, I'll have to guess a bit
here about what's going on.

I suspect you're trying to describe something like:
...
interval 30m
failure_interval 10s
period ....
  alertafter 5 1m
  ....


I think you're trying to use the two-argument form of alertafter in a
way other than intended.  The two-argument form is meant to detect
intermittent failures, e.g. 'alertafter 2 6h' would alert if a service
fails twice in six hours.  In the case of an intermittent failure, a
single failure would only result in two tests at the faster test rate
before returning to the regular test rate.

For what you're describing, I think you want either 'alertafter 5'
(i.e. 5 consecutive failures) or 'alertafter 50s' (i.e. alert when a
service has failed every test for 50 seconds).
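
As a rough sketch of the first option, using the 30 minute and 1 minute
test rates from your description, the service block might look
something like the following.  The watch group name, the http.monitor
and mail.alert scripts, the period definition and the address are all
placeholders I made up for illustration, so substitute whatever your
mon.cf actually uses:

```
watch webservers
    service http
        # normal polling rate
        interval 30m
        # polling rate used while the service is failing
        failure_interval 1m
        monitor http.monitor
        period wd {Sun-Sat}
            alert mail.alert admin@example.com
            # alert only after 5 consecutive failures
            alertafter 5
```

With those numbers the alert should go out roughly four minutes after
the first failed test, since the first failure counts as one of the
five.  For the time-based variant you would put a duration on the
alertafter line in place of the count.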


-David

_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon
