To answer your question, "What is best practice?": I believe that if you're monitoring something critical, mail is likely the worst alert method.

I'm using an in-house, heartbeat-based alerting protocol.  After each successful monitor event, a heartbeat is sent to a listening system, along with a message saying when the next heartbeat is due.
If the listener doesn't receive the next heartbeat within 150% of that interval, an audible alarm sounds inside the ops control room.  This works quite well when you have 24/7 staffing and have eliminated false alarms.
Poke around on Freshmeat and SourceForge and you'll find heartbeat-based utilities that do the same sort of thing.
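
For illustration only, the listener side of a heartbeat scheme like this might look roughly like the Python sketch below.  It is not the in-house tool described above: the class name HeartbeatWatch, its methods, and the GRACE_FACTOR constant are made up for the example, and both the network transport for heartbeats and the alarm itself are left out.

```python
import time

# Assumption for this sketch: the alarm fires once 150% of the
# announced interval has passed without a new heartbeat.
GRACE_FACTOR = 1.5


class HeartbeatWatch:
    """Tracks the latest heartbeat from one monitor (hypothetical helper)."""

    def __init__(self):
        # No deadline until the first heartbeat has been received.
        self.deadline = None

    def beat(self, next_interval, now=None):
        """Record a heartbeat that announces the next one in next_interval seconds."""
        now = time.monotonic() if now is None else now
        self.deadline = now + GRACE_FACTOR * next_interval

    def overdue(self, now=None):
        """Return True if the announced heartbeat is late beyond the grace period."""
        now = time.monotonic() if now is None else now
        return self.deadline is not None and now > self.deadline
```

A listener would call beat() whenever a heartbeat arrives and poll overdue() on a timer, sounding the alarm when it returns True.  The now parameter exists only so the logic can be exercised without waiting in real time.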

If you can't do something like that, I'd consider installing a modem to send text pages or even a computer-generated voice message.
Without a 24/7 operations department, your phone line may be your last resort during a major outage or network problem.

Plan for the worst.  It WILL happen eventually!



Michael Vogt <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
03/21/2005 10:16 PM

To: mon@linux.kernel.org
cc:
Subject: monitoring email capability for monitor alerts

Hi all,

I am aware of the smtp.monitor but am still unsure/unconvinced that
this is enough to detect if my mon monitors have lost the capability to
deliver email to the administrators.

I have two monitors in separate locations/cities.  One monitor is
located at a remote hosting center and monitors the applications and
network on that datacenter and also monitors the other monitor. The
other monitor is at a different location and primarily monitors the
monitor at the datacenter.  I want to monitor that each of these
monitors has email capability.  My first inclination is to run the
sendmail daemon on each mon machine and configure the smtp.monitor on
each mon to check the smtp server on the other mon machine. I'm still
relatively ignorant regarding email but it seems like there are still a
lot of things that can go wrong which can prevent mail.alerts from
reaching their destination but which are not detected by this simple
mutual smtp.monitor approach.  Perhaps a daily status email to the
admins would prevent a problem from going unnoticed for more than a
day.

I have one idea for an improvement. As with some of the application
monitoring, I prefer to actually make the application "do the thing" it
does. In the case of email that could be to make each monitor send an
email to the other monitor and have each monitor check for those
emails.  For example, send an email every 5 minutes and every 10
minutes make sure you received an email (and empty the mailbox) else
alert that email capability is suspect.  I can see that this still is
not foolproof, since emails need to go to other locations to inform the
admins.  

I saw in searching through the posts regarding mail that there exists
an smtp loopback monitor.  It looks like this would have most of the
code needed to do the above.  Perhaps it does everything I want, as is,
by configuring it to loop through the smtp servers that would service
the administrators.


Anyway, after all this rambling, the bottom line question is:

What is "best practice" out there and what do most admins find
sufficient (along with any holes in each approach)?


Many thanks for any insights (or concrete examples).

Michael Vogt



_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon
