I've found what appears to be a bug, and I'm hoping Jim or someone
can shed some light on it.

I'm using traps with the traptimeout feature, to do heartbeat-style
monitoring.

Looking in the 'mon' source code, I see that each trap has a
value called _trap_timer, and with every iteration through the
main monitoring loop, this timer gets decremented by the amount
of time since the last iteration.

Also, I see that if a trap ever times out, an alert gets sent,
and its _trap_timer gets reset to the value of its traptimeout.
So far so good.

But what if the trap never times out?  It appears that the
value of _trap_timer just keeps getting decremented forever!
(There's a different conditional that keeps alerts from being
sent after it gets below zero.)  I can't find anything in the
code that could ever reset it.  Am I misunderstanding the
intended purpose of _trap_timer?
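
To make the pattern concrete, here is a rough Python sketch of the
behaviour as described above. It is only a model, not mon's actual
Perl code: run_loop, elapsed_per_pass and timed_out_on_pass are
hypothetical names, and how mon really decides that a trap has
timed out is left as an input rather than guessed at.

```python
# Hypothetical model of the behaviour described in this post.
# This is NOT mon's real code; all names here are made up for illustration.

def run_loop(traptimeout, elapsed_per_pass, timed_out_on_pass):
    """Return the timer value after each pass through a simulated main loop.

    timed_out_on_pass is a list of booleans, one per pass, saying whether
    the trap is considered timed out on that pass (assumed to be decided
    elsewhere, since the post does not say how mon detects it).
    """
    timer = traptimeout
    history = []
    for timed_out in timed_out_on_pass:
        # Every pass: decrement by the time since the last pass.
        timer -= elapsed_per_pass
        if timed_out:
            # On a timeout an alert would be sent and the timer reset.
            timer = traptimeout
        # No other branch resets the timer, which is the suspected bug.
        history.append(timer)
    return history


# Ten passes of 10 seconds each, 60-second traptimeout, never timing out.
print(run_loop(60, 10, [False] * 10)[-1])  # -40
```

With no timeout the value only ever goes down, which matches the
large negative numbers that moncmd reports.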

To see what I'm talking about, do this for any trap in
your config file that has a traptimeout defined:
     moncmd get <watchgroup> <service> _trap_timer
Assuming mon has been running for a while and that trap hasn't
timed out recently, you'll probably see a big negative number. :-)

Tim

_______________________________________________
mon mailing list
[EMAIL PROTECTED]
http://linux.kernel.org/mailman/listinfo/mon
