--On Wednesday, June 25, 2003 1:55 PM +0200 Jeroen Moors <[EMAIL PROTECTED]> wrote:

Hi All,

We are running MON at our datacenter to monitor our customers'
servers. I've created a special alert script that sends mail directly
to the customer if his server goes down.

Now I think I'm seeing the following behavior. When 2 hosts in 1 group
go down, I only receive an up-alert when both hosts are back. For me
this isn't ideal, because the first customer in this case doesn't
receive a mail with an upalert.

Does Mon really work like this and, if so, is there a way around it?


Yes, right now you only get an up-alert when the entire hostgroup is back to an OK state. So if you want to get upalerts about individual hosts, you have to put them in separate hostgroups. With both hosts in one hostgroup, you'll currently get a sequence of 3 alerts and one upalert, i.e.:


alert: HOST1
alert: HOST1 HOST2
alert: HOST1
upalert: ALL OK


Part of the issue here is that currently Mon doesn't actually know the real status of individual hosts. It knows "the script returned an error code" and "the error summary was XXX". Most monitor scripts return the list of failed hosts as the error summary, but not all do.
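
To make the group-level behavior concrete, here is a toy model in Python. It is only a sketch, not Mon's actual code: the function name `group_alerts` and the `(exit_code, summary)` tuples are made up for illustration, and it assumes an alert is sent on every failing poll (in a real setup, options such as alertevery change how often alerts go out).

```python
def group_alerts(polls):
    """Toy model of group-level alerting (hypothetical, not Mon's code).

    polls is a list of (exit_code, summary) pairs, one per monitor run.
    Only the exit code and the summary text are known, never the
    status of an individual host.
    """
    events = []
    was_failing = False
    for exit_code, summary in polls:
        if exit_code != 0:
            # Any failure in the group counts as a failure of the group.
            events.append(f"alert: {summary}")
            was_failing = True
        elif was_failing:
            # The upalert fires only once the whole group is OK again.
            events.append("upalert: ALL OK")
            was_failing = False
    return events


polls = [(1, "HOST1"), (1, "HOST1 HOST2"), (1, "HOST1"), (0, "")]
for line in group_alerts(polls):
    print(line)
```

Fed the two-host scenario, this prints the same four lines as the sequence above. HOST2's recovery never produces an upalert of its own, because the model has nothing per-host to compare against.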


There has been discussion about adding true per-host status tracking to Mon. There is a lot of interest in having that support added. Clearly the better behavior for the above scenario would be to add a new alert type, which I'll call status-change:


alert: HOST1 failed
status-change: HOST2 failed, HOST1 still failed
status-change: HOST2 OK, HOST1 still failed
upalert: HOST1 OK, All OK
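
A rough sketch of how per-host tracking could produce that sequence, again in Python. This does not exist in Mon; the function name `status_events` and the message wording are assumptions for illustration, and it assumes each monitor run reports the full list of currently failed hosts.

```python
def status_events(polls):
    """Sketch of per-host status tracking (hypothetical, not part of Mon).

    polls is a list of failed-host lists, one per monitor run.
    A message is produced only when the set of failed hosts changes.
    """
    events = []
    previous = set()
    for failed in polls:
        current = set(failed)
        if current == previous:
            continue  # nothing changed, so nothing to report
        newly_failed = sorted(current - previous)
        recovered = sorted(previous - current)
        still_failed = sorted(current & previous)
        parts = [f"{host} failed" for host in newly_failed]
        parts += [f"{host} OK" for host in recovered]
        parts += [f"{host} still failed" for host in still_failed]
        if not previous:
            kind = "alert"          # first failure in a healthy group
        elif not current:
            kind = "upalert"        # last failed host has recovered
            parts.append("All OK")
        else:
            kind = "status-change"  # group still failing, membership changed
        events.append(f"{kind}: " + ", ".join(parts))
        previous = current
    return events


polls = [["HOST1"], ["HOST1", "HOST2"], ["HOST1"], []]
for line in status_events(polls):
    print(line)
```

The key difference from the current behavior is that the previous set of failed hosts is remembered, so a single host recovering can be reported while the rest of the group is still down.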


I'd like to spend a month or two doing the Mon rewrite necessary for that to work, when I'm done with my current Mon-related project(*). Unfortunately it is still unclear whether my employer will agree that the work is important enough to justify the time.


(*): For those playing the home game... My current project, a full-featured web interface to the Mon configuration, is coming along nicely. The projected completion date is the end of August. If anyone is seriously interested in taking a look and giving me feedback, let me know.


-David Nolan
Network Software Developer
Computing Services
Carnegie Mellon University

_______________________________________________
mon mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/mon
