The interval between checks is 120 seconds, so it can take up to ~2 minutes to detect an error with these settings.
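
For reference, the interval is the "set daemon" value in monitrc. A minimal sketch of the changed fragment, assuming the rest of the control file stays as posted; the value 5 is only an example, and the check itself is copied from the original post:

```
# Poll cycle length in seconds (was 120). Illustrative value only.
set daemon 5

# Ping check as in the original post, unchanged.
check host host2_ping with address 192.168.1.2
    if failed ping then alert
```

After editing, "monit -t" checks the control file syntax and "monit reload" makes the running daemon reread it.
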
You can lower the interval to, for example, 5 seconds for faster error detection.

Best regards,
Martin

> On 8 Mar 2019, at 22:00, Fant, Andrew (NIH/NIDA) [E] <andrew.f...@nih.gov> wrote:
>
> In the monitrc file, I have:
>
> set daemon 120
>
> As for the monit -vi output, it has 22 remote host checks in total. A
> shortened, anonymized copy of it is:
>
> Adding 'allow localhost' -- host resolved to [::ffff:127.0.0.1]
> Adding credentials for user 'admin'
> Runtime constants:
>  Control file       = /etc/monitrc
>  Log file           = syslog
>  Pid file           = /etc/monit/monit.pid
>  Id file            = /etc/monit/monit.id
>  State file         = /etc/monit/monit.state
>  Debug              = True
>  Log                = True
>  Use syslog         = True
>  Is Daemon          = True
>  Use process engine = True
>  Limits             = {
>                     =   programOutput: 512 B
>                     =   sendExpectBuffer: 256 B
>                     =   fileContentBuffer: 512 B
>                     =   httpContentBuffer: 1 MB
>                     =   networkTimeout: 5 s
>                     =   programTimeout: 5 m
>                     =   stopTimeout: 30 s
>                     =   startTimeout: 30 s
>                     =   restartTimeout: 30 s
>                     = }
>  On reboot          = start
>  Poll time          = 120 seconds with start delay 0 seconds
>  Event queue        = base directory /var/monitor with 1000 slots
>  M/Monit(s)         = http://[host1.local]:8080/collector with timeout 5 s with credentials
>  Start monit httpd  = True
>  httpd bind address = localhost
>  httpd portnumber   = 2812
>  httpd signature    = Enabled
>  httpd auth. style  = Basic Authentication and Host/Net allow list
>
> The service list contains the following entries:
>
> System Name         = host1
>  Monitoring mode    = active
>  On reboot          = start
>
> Remote Host Name    = host2_ping
>  Address            = 192.168.1.2
>  Monitoring mode    = active
>  On reboot          = start
>  Ping               = if failed [count 3 size 64 with timeout 5 s] then alert
>
> -------------------------------------------------------------------------------
>
> Hopefully this will be of some use.
>
> --
> Andrew Fant | Systems Administrator
> andrew.f...@nih.gov | Lei Shi Lab, NIH/NIDA/IRP
> (443)740-2849 |
>
> From: "mart...@tildeslash.com" <mart...@tildeslash.com>
> Reply-To: This is the general mailing list for monit <monit-general@nongnu.org>
> Date: Friday, March 8, 2019 at 3:26 PM
> To: This is the general mailing list for monit <monit-general@nongnu.org>
> Subject: Re: monit not catching failed ping test
>
> Hello,
>
> monit checks the service in intervals given by the "set daemon <x>" settings.
> If the interval between checks is long or the check is blocked by some
> service timeout/action, then the interval can be longer.
>
> Please can you check the "set daemon" settings and run monit in debug mode?:
>
> 1.) stop monit
> 2.) monit -vI
>
> Best regards,
> Martin
>
> On 8 Mar 2019, at 16:49, Fant, Andrew (NIH/NIDA) [E] <andrew.f...@nih.gov> wrote:
>
> Good morning.
> I have a small monitoring setup with m/monit 3.7.2, using monit 5.25.2
> as the agent. There are a couple of systems that I cannot install monit on
> that I still need to be aware of any downtime, so I have added them as ping
> checks in the monitrc on the host where I installed m/monit. Yesterday, one
> of those remote systems went down, but monit and m/monit didn’t report an
> alert for it and still have its status as OK. Using anonymized information,
> the entry in the monitrc on host1 is:
>
> CHECK HOST host2_ping with ADDRESS 192.168.1.2
>     IF FAILED ping THEN ALERT
>
> And from the command line on host1:
>
> host1% monit status host2_ping
> Monit 5.25.2 uptime: 48d 19h 8m
>
> Remote Host 'host2_ping'
>   status               OK
>   monitoring status    Monitored
>   monitoring mode      active
>   on reboot            start
>   ping response time   -
>   data collected       Fri, 08 Mar 2019 10:41:33
>
> But:
>
> host1% ping host2
> PING host2.example.org (192.168.1.2) 56(84) bytes of data.
> From host1.example.org (192.168.1.1) icmp_seq=1 Destination Host Unreachable
> From host1.example.org (192.168.1.1) icmp_seq=2 Destination Host Unreachable
> From host1.example.org (192.168.1.1) icmp_seq=3 Destination Host Unreachable
>
> Clearly there is a disconnect between the OS-provided ping utility and what
> monit is seeing. I’m sure that it’s probably a simple error in
> configuration, but I am not seeing what I did wrong. Can someone please set
> me on the correct path?
>
> Thank you
>
> --
> Andrew Fant | Systems Administrator
> andrew.f...@nih.gov | Lei Shi Lab, NIH/NIDA/IRP
> (443)740-2849 |
> --
> To unsubscribe:
> https://lists.nongnu.org/mailman/listinfo/monit-general