Hi Callum,

the event filter you are using is the correct way to suppress particular alerts. However, the filter must be set to match the type of event that the test rule generates, in this case "resource":
  alert [email protected] but not on { resource }

The "exec" event is generated when program execution fails (for example, when a start or stop program is unable to start the process), so filtering on it does not suppress the "Resource limit matched" and "Resource limit passed" messages.

The alert setting applies to the whole service, so if you still want to receive an alert when space usage exceeds a hard limit, you need two service entries, for example:

--8<--
check device cachefs_cleanup with path /foo/cache
  alert [email protected] but not on { resource }
  if space usage > 90% then exec "/bar/reduce_cache.sh" as uid apache and gid apache
  if inode usage > 80% then exec "/bar/reduce_cache.sh" as uid apache and gid apache

check device cachefs_alarm with path /foo/cache
  if space usage > 97% for 5 cycles then alert
  if inode usage > 90% for 5 cycles then alert
--8<--

Regards,
Martin

On Feb 3, 2012, at 5:46 PM, Callum Macdonald wrote:

> I'm using monit to monitor our tmpfs filesystems which are used for
> caching purposes. I've written a script which deletes the oldest 10% of
> files, according to their atime.
>
> I've configured monit to run that script with exec when the filesystem
> usage reaches 90%, like this:
>
> check device cachefs with path /foo/cache
>   alert [email protected] but not on { exec }
>   if space usage > 90% then exec "/bar/reduce_cache.sh" as uid apache and gid apache
>   if inode usage > 80% then exec "/bar/reduce_cache.sh" as uid apache and gid apache
>   if space usage > 97% for 5 cycles then alert
>   if inode usage > 90% for 5 cycles then alert
>
> This node is running monit 4.10.1.
>
> Currently I get emails like this:
> * Resource limit matched Service cachefs
> * Resource limit passed Service cachefs
>
> I get two emails every time, one when the exec is run, then a second
> when the test recovers. I want to disable both of those emails, for this
> filesystem check only.
>
> I *do* want to get an email if the script fails and filesystem usage
> hits 97% or more.
>
> Can I do something like:
>   if space usage > 90% then exec "/foo/reduce_cache.sh" not alert
>
> I've read the docs but can't figure out the syntax. As I read the
> documentation, alerts can be enabled for the whole filesystem check or
> not at all, but not per test. Is that correct?
>
> Do I need to create two different checks? Is there any danger in having
> two monit checks monitoring the same filesystem?
>
> Alternatively, is using monit's exec the "wrong" approach to control
> cache usage? I figured it was simpler and possibly more reliable than a
> cron script which parses the output of `df` and calls the script when
> required.
>
> Thanks in advance for any input.
>
> Love & joy - Callum.
>
> ==
> Callum Macdonald
>
> French mobile: +33 7 8708 5410
> UK mobile: +44 7968 378 810
> Desk: +44 845 126 0875
> www.callum-macdonald.com
>
> --
> To unsubscribe:
> https://lists.nongnu.org/mailman/listinfo/monit-general
