Re: LONG Proposal: Making mon aware of individual host failures(WAS RE: n00b alert)

David Nolan Wed, 18 Sep 2002 11:06:31 -0700

--On Tuesday, September 17, 2002 6:27 PM -0700 Andrew Ryan 
<[EMAIL PROTECTED]> wrote:

>
> This is what I'm most interested in from this change. Can you share your
> proposed new downtime log format, based on the new significance of hosts?

I haven't even thought about the format at all yet.  It might be easiest to 
create an entirely new downtime log format for per-host downtimes.

> Also be aware that if you present different monitor formats, you risk
> forking monitor development. As the maintainer of the mon contribs, I'm
> particularly sensitive to this. A possibility might be if the monitors
> defaulted to the old style, and were patched to allow an '--extended'
> mode, but that is a lot of work.
>

I was thinking about either that, or setting an environment variable before 
execution of the monitor, so mon admins don't have to remember to add the 
--extended option to every monitor line in their config.
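
A minimal sketch of the environment-variable idea, written in Python for illustration only. The variable name MON_EXTENDED and the "hosts:"/"summary:" layout are assumptions made up for this example; no such format has been defined yet. A monitor would fall back to the old style unless the variable is set:

```python
import os


def format_report(failed_hosts, detail, env=None):
    """Return the output lines a monitor would print.

    MON_EXTENDED is a hypothetical variable name, and the extended
    layout below is an illustrative guess, not an agreed mon format.
    """
    env = os.environ if env is None else env
    if env.get("MON_EXTENDED") == "1":
        # Extended style: host list and summary text on separate lines.
        return ["hosts: " + " ".join(failed_hosts), "summary: " + detail]
    # Old style: the summary line doubles as the failed-host list,
    # and the detail text follows on later lines.
    return [" ".join(failed_hosts), detail]
```

With this shape, unpatched custom monitors keep working unchanged, and nothing has to be added to each monitor line in the config.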

> And keep in mind that most installations of mon have significant work
> in custom monitors which they have not contributed back to the community
> because of lack of widespread utility or licensing/management issues. So
> even if you were to modify all the monitors which come with mon, and a
> good portion of the popular contrib'ed monitors, that still leaves a lot
> of monitors out there.
>

Don't I know it...  We've got custom monitor scripts for a bunch of stuff 
already, and we haven't actually gone live with our Mon system yet.  Just 
in case anyone is interested, we've got custom monitors for: afs servers, 
authbridge (an in-house network access/filtering system, which we're 
eventually going to release), lpd, snmpd, imap with plain text passwords, 
imap over ssl, imap with starttls (request SSL over the primary imap port), 
lmtp (part of the cyrus imap suite), imsp, jetstor raid arrays, kerberos5 
servers, wireless access points, and netmon (a locally developed network 
device data gathering system).  For most devices we're letting netmon do the 
actual polling, and we're just looking at the up/down/slow status bits in 
the database.

I don't suppose anybody already has monitor scripts for active directory 
servers, or Cisco Voice Over IP call managers?

> Since virtually all monitors use the summary line to output the list of
> failed hosts, can't your patches assume that is the list of failed hosts
> (rejecting output that doesn't look like a host in the hostgroup)? That's
> what I did in dtquery, and it worked pretty well.

That's actually part of the point.  Most do, but not all.  Why not 
standardize on a way to provide the host list and the summary separately, 
so we can actually treat it as a host list, rather than guessing it 
might be one?  My current implementation of per-host dependencies does the 
guessing thing, but being able to alert or log per-host events should 
probably require more precision.
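
The guessing approach can be sketched roughly as follows, in Python for illustration only. The function name and the comma/whitespace tokenizing are assumptions for this example, not the actual patch or dtquery code: each token on the summary line is kept only if it names a member of the hostgroup.

```python
def guess_failed_hosts(summary_line, hostgroup):
    """Guess which hosts failed from a monitor's summary line.

    Tokens that do not name a member of the hostgroup are rejected.
    The splitting rule (commas and whitespace) is an assumption.
    """
    members = set(hostgroup)
    failed = []
    for token in summary_line.replace(",", " ").split():
        if token in members and token not in failed:
            failed.append(token)
    return failed
```

A summary line such as "connection refused" yields an empty list here even though something failed, which is the imprecision a separate host list would avoid.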



-David Nolan
 Network Software Developer
 Computing Services
 Carnegie Mellon University

_______________________________________________
mon mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/mon