On Sat, Apr 9, 2011 at 7:34 PM, Terry <[email protected]> wrote:
> On Sat, Apr 9, 2011 at 6:06 PM, Terry <[email protected]> wrote:
>> Hello,
>>
>> I am seeing a weird condition where host checks stop after the first
>> notification.  Here's the config:
>>
>> execute_host_checks=1
>>
>> define host{
>>     name                          generic-host
>>     check_command                 check-host-alive
>>     check_period                  24x7
>>     notification_interval         30
>>     notification_options          d,r
>>     notifications_enabled         1
>>     event_handler_enabled         1
>>     flap_detection_enabled        1
>>     failure_prediction_enabled    1
>>     process_perf_data             1
>>     retain_status_information     1
>>     retain_nonstatus_information  1
>>     register                      0
>> }
>> define host{
>>     name                          generic-host-10
>>     use                           generic-host
>>     notification_period           24x7
>>     check_interval                5
>>     retry_interval                1
>>     max_check_attempts            3
>>     register                      0
>> }
>> define host{
>>     name                          foo-10
>>     use                           generic-host-10
>>     contact_groups                +foo_primary
>>     register                      0
>> }
>> define host{
>>     use                           foo-10
>>     host_name                     testpage
>>     hostgroups                    windows,vmguest_windows
>>     notification_interval         5
>>     parents                       firewall
>>     address                       10.235.235.235
>> }
>>
>> define hostescalation{
>>     hostgroup_name                z-allhosts
>>     contacts                      [email protected],support-email-critical
>>     first_notification            1
>>     last_notification             1
>>     notification_interval         0
>>     escalation_options            d
>> }
>> define hostescalation{
>>     hostgroup_name                z-allhosts
>>     contact_groups                +foo_secondary
>>     first_notification            3
>>     last_notification             4
>>     notification_interval         30
>>     escalation_options            d,r
>> }
>> define hostescalation{
>>     hostgroup_name                z-allhosts
>>     contact_groups                +foo_tertiary,foo_secondary
>>     first_notification            5
>>     last_notification             0
>>     notification_interval         30
>>     escalation_options            d,r
>> }
>>
>>
>> Here's a log of the activity.  You see the first notification, then nothing
>>
>> [1302388222] HOST ALERT: testpage;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
>> [1302388296] HOST ALERT: testpage;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
>> [1302388346] SERVICE ALERT: testpage;cpu - nrpe;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
>> [1302388376] HOST ALERT: testpage;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%
>> [1302388376] HOST NOTIFICATION: joe-epager;testpage;DOWN;host-notify-by-epager;PING CRITICAL - Packet loss = 100%
>> [1302388376] HOST NOTIFICATION: [email protected];testpage;DOWN;host-notify-by-email;PING CRITICAL - Packet loss = 100%
>> [1302388377] HOST NOTIFICATION: support-email-critical;testpage;DOWN;host-notify-by-email;PING CRITICAL - Packet loss = 100%
>> [1302388446] SERVICE ALERT: testpage;disk drives - nrpe;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
>> [1302388547] SERVICE ALERT: testpage;memory - page - nrpe;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
>> [1302388657] SERVICE ALERT: testpage;memory - physical - nrpe;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
>> [1302388757] SERVICE ALERT: testpage;nrpeclient;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
>> [1302389057] SERVICE ALERT: testpage;nrpeclient;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
>> [1302389357] SERVICE ALERT: testpage;nrpeclient;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
>> [1302389658] SERVICE ALERT: testpage;nrpeclient;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
>>
>>
>> I appreciate the help.
>>
>
> More info:
>
> [04-09-2011 19:29:17] SERVICE ALERT: testpage;nrpeclient;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
>
> I get this event every 5 minutes.  It's just a service on this box.  I
> thought if the host was down, service checks were suppressed.  Is that
> not the case?
>

Sorry for continually replying to my own thread.  I guess checks are
happening.  Notifications are not happening, and then of course
escalations are not either.

1. Host gets checked 3 times
2. Alert gets sent
3. Check happens again after 5 minutes
4. Current attempt goes back to 1/3 (HARD state)
5. No notifications thereafter

Confused.
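
To make the sequence above easier to confirm, here is a minimal sketch (not part of Nagios itself) that filters a log down to the host-level events for one host. It assumes each entry sits on a single line in the form quoted earlier in this thread; the mail client wrapped those lines. The names `host_events` and `SAMPLE_LOG` are made up for illustration, and the sample entries are copied from the log above.

```python
import re
from datetime import datetime, timezone

# Matches host-level entries only, e.g.
# [1302388376] HOST ALERT: testpage;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%
LINE_RE = re.compile(r"^\[(\d+)\] (HOST ALERT|HOST NOTIFICATION): (.*)$")

# Sample entries copied from the log in this thread (unwrapped, one per line).
SAMPLE_LOG = """
[1302388222] HOST ALERT: testpage;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[1302388296] HOST ALERT: testpage;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
[1302388346] SERVICE ALERT: testpage;cpu - nrpe;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
[1302388376] HOST ALERT: testpage;DOWN;HARD;3;PING CRITICAL - Packet loss = 100%
[1302388376] HOST NOTIFICATION: joe-epager;testpage;DOWN;host-notify-by-epager;PING CRITICAL - Packet loss = 100%
[1302388377] HOST NOTIFICATION: support-email-critical;testpage;DOWN;host-notify-by-email;PING CRITICAL - Packet loss = 100%
[1302388757] SERVICE ALERT: testpage;nrpeclient;CRITICAL;HARD;1;CHECK_NRPE: Socket timeout after 50 seconds.
"""


def host_events(lines, host):
    """Return (timestamp, kind, fields) tuples for HOST ALERT and
    HOST NOTIFICATION entries that belong to `host`."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # service alerts, blank lines, anything else
        epoch, kind, rest = match.groups()
        fields = rest.split(";")
        # HOST ALERT:        host;state;state_type;attempt;output
        # HOST NOTIFICATION: contact;host;state;command;output
        index = 0 if kind == "HOST ALERT" else 1
        if len(fields) <= index or fields[index] != host:
            continue
        when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
        events.append((when, kind, fields))
    return events


if __name__ == "__main__":
    for when, kind, fields in host_events(SAMPLE_LOG.splitlines(), "testpage"):
        print(when.isoformat(), kind, ";".join(fields))
```

Run against the real log file instead of `SAMPLE_LOG`, this makes it easy to see whether any HOST NOTIFICATION entries appear after the first HARD alert. Timestamps are printed in UTC, so they will not match the local times shown in the web interface.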
