Hi, I'm looking for help, because I can't figure out what I'm doing wrong. I have a simple monit setup, which is supposed to monitor a web server and restart it if anything seems wrong.
This works, but not always. Monit does restart the service, but on subsequent failures it just notices that the service isn't working and doesn't act anymore.

Example from the log, where the service was restarted but went down again, and monit didn't do anything:

[CEST May 31 06:44:11] info : 'triac.mysite.com' Monit 5.16 started
[CEST May 31 09:36:29] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:37:39] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:37:39] info : 'mysite.com' exec: /usr/bin/supervisorctl
[CEST May 31 09:38:49] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:39:59] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:41:09] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:42:19] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:43:29] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:44:39] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:45:50] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:47:00] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable
[CEST May 31 09:48:10] error : 'mysite.com' failed protocol test [HTTP] at [mysite.com]:443 [TCP/IP SSL] -- HTTP: Error receiving data -- Resource temporarily unavailable

The net result is that the service doesn't work and monit just sits there, knowing that the service failed the protocol test, but doing nothing about it. I suspect this is because monit does not notice that the service was OK for a moment after restarting, so it does not see another transition from OK to failed.

Here is the relevant part of the configuration (nearly all of it):

set daemon 60
check host mysite.com with address mysite.com
    if failed port 443 protocol https with ssl options {verify: enable} for 2 cycles then exec "/usr/bin/supervisorctl restart mysite"
    if 20 restarts within 60 cycles then unmonitor

Is there a way to achieve unconditional actions? E.g. "even though I haven't seen the service transition from failed to working, restart it anyway after 60 seconds if it is still in the failed state".

Any help would be much appreciated.

--J.