>> Doing some testing, I notice that the polling interval doesn't seem
>> to update with the failure_interval after a montrap is received
>> causing the service to fail.
>>
>> If you manually test the service, it updates the polling interval
>> correctly, and after mon does the first check of the service after
>> the montrap is received, interval is also updated. But not when the
>> actual trap is received.
>>
>> Is this intentional, by design or a bug? I would think that the
>> wanted behaviour would be to update the interval (call
>> reset_timer()?) when a montrap is received aswell?
>
>
> Since traps are asynchronous, updating the polling interval is
> generally meaningless (or at least lacking context).
>

Thank you for the reply. I understand your point; however, in our
setup it is useful. Perhaps a config example can help illustrate:

server side:

    service somelocalcheck
        interval 60m
        failure_interval 5m
        monitor rmon -s somelocalcheck
        period wd {Sun-Sat}
            alertevery 60m
            alert mainalertscript
            alertafter 5

on the agent:

    service somelocalcheck
        interval 1m
        monitor somelocalcheck
        period wd {Sun-Sat}
            alertevery 60m
            alert remote.alert -P 2583 -H server
            upalert remote.alert -P 2583 -H server

This way the local agent runs the monitor and checks something(TM)
often, and sends a montrap when it detects a problem.

On the server side, however, the check works as a heartbeat: it
verifies that the local service is still alive, but it only runs once
every hour.

Since it is the server that triggers the email/SMS alerts, it is
useful in this setup to have the first montrap reset the timer so
that the server starts using the failure_interval right away.

I am not sure if this makes it easier to understand why we would
prefer this behaviour.

Anders Synstad
Basefarm AS

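To make the timing difference concrete, here is a minimal Python sketch of the two behaviours, using the server-side values above (interval 60m, failure_interval 5m). This is not mon's code: ServiceTimer, on_trap, the reset_on_trap flag and the assumed trap arrival time of 10 minutes are all hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ServiceTimer:
    """Hypothetical model of a service's polling timer (not mon's real code)."""
    interval: int            # normal polling interval, in seconds
    failure_interval: int    # polling interval while the service is failing
    next_check: int = 0      # absolute time of the next scheduled check
    failing: bool = False

    def schedule(self, now: int) -> None:
        # Pick the interval that matches the current service state.
        step = self.failure_interval if self.failing else self.interval
        self.next_check = now + step

    def on_trap(self, now: int, failed: bool, reset_on_trap: bool) -> None:
        # A trap always updates the state; rescheduling is the optional part.
        self.failing = failed
        if reset_on_trap:
            self.schedule(now)


def seconds_until_next_check(reset_on_trap: bool) -> int:
    timer = ServiceTimer(interval=60 * 60, failure_interval=5 * 60)
    timer.schedule(now=0)      # heartbeat check scheduled one hour out
    trap_time = 10 * 60        # assumed: failure trap arrives after 10 minutes
    timer.on_trap(now=trap_time, failed=True, reset_on_trap=reset_on_trap)
    return timer.next_check - trap_time


if __name__ == "__main__":
    print("no reset on trap:", seconds_until_next_check(False), "seconds")
    print("reset on trap:   ", seconds_until_next_check(True), "seconds")
```

Under these assumptions, the server waits out the remaining 50 minutes of the hourly heartbeat when the trap does not reset the timer, and 5 minutes when it does.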
_______________________________________________
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon