Mon Debugging

2005-07-14 Thread Aaron Segura



I hacked up the standard ping.monitor to send heartbeats to our in-house
monitoring front-end upon a successful ping. We started having problems
recently with mon not getting the heartbeats out in a timely manner.  I
turned on the debug option and started digging through the log.  I noticed
that usually, once per second, there's an entry in the log that's just a
number, and it increments by one...usually every second...

usually.

So, on to my question:  Is the counter in the debug log SUPPOSED to count up
once per second?  If it doesn't, is that a symptom of too much load on the
system?
 
# tail -f /var/log/mon | egrep 'mon\[[0-9]+\]: [0-9]+ '
Jul 14 12:58:03 ops-inet-mon mon[3430]: 222348
Jul 14 12:58:05 ops-inet-mon mon[3430]: 222349
Jul 14 12:58:06 ops-inet-mon mon[3430]: 222350
Jul 14 12:58:09 ops-inet-mon mon[3430]: 222351
Jul 14 12:58:10 ops-inet-mon mon[3430]: 222352
Jul 14 12:58:13 ops-inet-mon mon[3430]: 222353
Jul 14 12:58:16 ops-inet-mon mon[3430]: 222354
Jul 14 12:58:20 ops-inet-mon mon[3430]: 222355
Jul 14 12:58:24 ops-inet-mon mon[3430]: 222356
Jul 14 12:58:29 ops-inet-mon mon[3430]: 222357
Jul 14 12:58:33 ops-inet-mon mon[3430]: 222358
 
Sorry if this has been covered, but I couldn't find anything in the mailing
list or man page...
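
One way to put numbers on "usually" is to measure the gap between
successive counter entries. The following is a minimal sketch, not part of
mon. It assumes syslog-style lines of the form "Mon DD HH:MM:SS host
mon[pid]: N" and an English locale for month names. The names counter_gaps
and slow_ticks and the 2-second threshold are made up for illustration.
Syslog timestamps carry no year, so one is supplied as a parameter.

```python
import re
from datetime import datetime

# Matches lines such as: "Jul 14 12:58:03 ops-inet-mon mon[3430]: 222348"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+mon\[\d+\]:\s+(\d+)\s*$"
)


def counter_gaps(lines, year=2005):
    """Return (counter, seconds_since_previous_entry) for each entry after the first.

    Lines that do not look like bare counter entries are skipped.
    """
    ticks = []
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue
        # Syslog timestamps omit the year, so the caller supplies one.
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        ticks.append((int(m.group(2)), stamp))
    return [
        (cur[0], (cur[1] - prev[1]).total_seconds())
        for prev, cur in zip(ticks, ticks[1:])
    ]


def slow_ticks(lines, threshold=2.0, year=2005):
    """Return only the entries that arrived more than `threshold` seconds late."""
    return [(c, gap) for c, gap in counter_gaps(lines, year) if gap > threshold]
```

Feeding it the contents of /var/log/mon (for example via
open("/var/log/mon")) and printing slow_ticks(...) lists the counter values
that arrived late, which can then be lined up against whatever else mon
logged at those times.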
 
___
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon


RE: Mon Debugging

2005-07-14 Thread Aaron Segura
[EMAIL PROTECTED] mon]# mon -v
$Id: mon,v 1.4.2.14 2004/11/19 18:27:47 trockij Exp $
$Name: mon-1-0-0pre5 $ 

So, for future troubleshooting efforts, the debug counter *is* supposed
to update once a second?

I will try upgrading to 1.1.0pre1.  Thanks for the info.

-----Original Message-----
From: David Nolan [mailto:[EMAIL PROTECTED] 
Sent: Thursday, July 14, 2005 4:27 PM
To: Aaron Segura; mon@linux.kernel.org
Subject: Re: Mon Debugging



--On Thursday, July 14, 2005 12:23 PM -0500 Aaron Segura 
<[EMAIL PROTECTED]> wrote:

> I hacked up the standard ping.monitor to send heartbeats to our in-house
> monitoring front-end upon a successful ping. We started having problems
> recently with mon not getting the heartbeats out in a timely manner.  I
> turned on the debug option and started digging through the log.  I
> noticed that usually, once per second, there's an entry in the log
> that's just a number, and it increments by one...usually every second...
>

What version of mon are you running?  Mon prior to the 1.1pre* series
blocked during the execution of an alert script, so I'd check to see if you
have an alert script which is hanging.

-David Nolan
 Network Software Designer
 Computing Services
 Carnegie Mellon University




RE: Mon Debugging

2005-07-14 Thread David Nolan



--On Thursday, July 14, 2005 4:42 PM -0500 Aaron Segura 
<[EMAIL PROTECTED]> wrote:




> So, for future troubleshooting efforts, the debug counter *is* supposed
> to update once a second?
>
> I will try upgrading to 1.1.0pre1.  Thanks for the info.


I wasn't sure, but I just went and read the code.  It's a loop counter, 
incremented once on every pass through the main loop.  In a normal 
situation mon sleeps for 1 second on every loop after processing any 
pending I/O.  So unless mon is either really busy (lots of jobs being 
queued, lots of I/O to process) or hanging (due to alert issues with older 
mon), yes, the counter should go up approximately once a second.
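
To make the timing concrete, here is a rough simulation of a loop with that
shape. It is only a sketch and not mon's actual code (mon is written in
Perl). The function name simulate, the work_durations parameter and the
exact order of the steps within a pass are assumptions for illustration.
Each pass does some amount of work (pending I/O, or a blocking alert on
older mon), logs the counter, then sleeps for the interval.

```python
def simulate(work_durations, interval=1.0):
    """Return (counter, simulated_time) for each pass through the loop.

    work_durations[i] is how long pass i spends on work before the counter
    is logged. No real sleeping happens; time is tracked as a plain number.
    """
    now = 0.0
    ticks = []
    for counter, work in enumerate(work_durations, start=1):
        now += work                 # processing I/O, or stuck in a blocking alert
        ticks.append((counter, now))
        now += interval             # the per-pass sleep
    return ticks


def gaps(ticks):
    """Seconds between successive counter entries."""
    return [cur[1] - prev[1] for prev, cur in zip(ticks, ticks[1:])]
```

With no work per pass the gaps come out at exactly the interval. A single
pass that spends 3 seconds on work, as in simulate([0, 0, 3, 0]), stretches
one gap to 4 seconds (the sleep plus the work) while the gaps around it
stay at 1 second.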


-David

David Nolan<*>[EMAIL PROTECTED]
curses: May you be forced to grep the termcap of an unclean yacc while
 a herd of rogue emacs fsck your troff and vgrind your pathalias!



RE: Mon Debugging

2005-07-14 Thread Jim Trocki

On Thu, 14 Jul 2005, David Nolan wrote:

> process) or hanging (due to alert issues with older mon), yes, the counter
> should go up approximately once a second.


i'm not sure why anyone is paying attention to that counter or how long it
takes to tick, since it's not used for anything other than to count how many
times the main loop has iterated.
