Hello mon'ers ...

Thanks to Jim and all the mon mail list participants, we've got a working
mon installation. Of course I have a few questions for which I couldn't
find answers. Here is the mon.cf entry on which my questions are based:

watch test
    service daily-test
        description sysadmin 9pm paging test
        interval 15m
        monitor telnet.monitor
        period wd {Sun-Sat} hr {9pm-10pm}
#           exclude_period wd {Sun-Sat} hr {10pm-8pm}
            alertafter 1
            alert nightly.alert sysadmin
            numalerts 1

Once a day, during period wd {Sun-Sat} hr {9pm-10pm}, we trigger a failure
so our sysadmin staff will receive a page. The nightly.alert above is just
a modified qpage.alert that sends a particular text string. This lets us
know mon is doing its job. I tried to add:

exclude_period wd {Sun-Sat} hr {10pm-8pm}
so the monitor would not operate during this period, but it appears mon is
applying this directive to the entire mon.cf even though it's only inside
one watch group.
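
To make the overlap between the two windows concrete, here is a minimal
Python sketch. It does not model mon itself. It assumes hour ranges include
the whole end hour and wrap past midnight (as I understand Time::Period to
work), and that an exclude_period simply subtracts from the period. The
names hour_in_range and window_open are hypothetical, made up for this
illustration.

```python
def hour_in_range(hour, start, end):
    """True if hour (0-23) lies in the inclusive range start..end.

    Assumption: a range whose start is later than its end wraps past
    midnight, e.g. 22..8 covers 22, 23, 0, 1, ..., 8.
    """
    if start <= end:
        return start <= hour <= end
    return hour >= start or hour <= end


PERIOD = (21, 22)   # hr {9pm-10pm}
EXCLUDE = (22, 8)   # hr {10pm-8pm}


def window_open(hour):
    """Assumed semantics: inside the period and not inside the exclusion."""
    return hour_in_range(hour, *PERIOD) and not hour_in_range(hour, *EXCLUDE)


open_hours = [h for h in range(24) if window_open(h)]
print(open_hours)  # prints [21]
```

Under these assumptions the two ranges share the 10pm hour, so only the
9pm hour would remain open for this watch.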

Does anyone use this exclude_period directive, and could you give me a
sample of its use inside a watch group? How do you get mon to run its
monitor script for this watch group only during {9pm-10pm}?

Secondly, in the example above, I think it's also necessary to reset the
status on this watch so that the next day it knows to trigger an alert
again. Is that right?
I've tried using 'moncmd list failures' to see the stored variables and
'moncmd set test daily-test alerts_sent 0' with this result:
test daily-test alerts_sent='0'
220 set completed
A subsequent 'moncmd list failures' still shows the variable
alerts_sent='1'. There must be a simple way to do this.

Lastly, we'd like to access the value stored in 'failure_duration' for
both email and pager notifications. Does anyone use this or has anyone
started some work on it?

-- 
Best regards,

Tony Hunter
