Re: [Nagios-users] Nagios is ignoring the retry_interval setting
Re-tested after changing the max file size of the debug file. This one should contain everything from the moment I started Nagios to the moment I stopped it during testing (approx. 10 minutes):

http://dl.dropbox.com/u/895609/nagios.debug

Thank you

-----Original Message-----
From: FTL Nagios [mailto:ftlnag...@gmail.com]
Sent: 07 December 2012 10:56
To: 'zarre...@linux.it'; 'Nagios Users List'
Subject: RE: [Nagios-users] Nagios is ignoring the retry_interval setting

[...]
Re: [Nagios-users] Nagios is ignoring the retry_interval setting
Hi,

Apologies for the delay, I have been very busy with other things.

Right, I put Nagios into debug this morning and reran the tests. I let it get a couple of successful pings to the server, then pulled the network cable from it.

Behaviour is completely different this morning. The host check is behaving now and rechecking every 3 minutes, as it is told to in the host template. I got my text and email alert to say the host was down when I expected it!

But now it is the service check that is running every 1 minute, which it is not told to do when in a problem state. My service template clearly states a retry_interval of 3 minutes when in a problem state:

define service{
    name                          service-server ; The name of this host template (used above in the checks)
    check_period                  server_24x7 ; Servers are monitored at all times
    check_interval                1 ; Servers are checked every 1 minute when in OK state
    retry_interval                3 ; Servers are checked every 3 minutes if in problem state
    max_check_attempts            3 ; Servers are checked 3 times to determine if in Up or Down state
    notification_period           server_24x7 ; Emails and texts are sent out any time of day
    notification_interval         3 ; Resend notifications every 3 minutes
    notification_options          c,r ; Only send alerts for servers in CRITICAL or RECOVERY state
    notifications_enabled         0 ; Notifications are disabled
    contact_groups                servers email, servers sms ; Alerts sent to contacts in these groups
    event_handler_enabled         1 ; Host event handler is enabled
    process_perf_data             1 ; Performance data is processed
    retain_status_information     1 ; Status info is kept between server restarts
    retain_nonstatus_information  1 ; Non-status information is kept between server restarts
    passive_checks_enabled        0 ; Passive checks are disabled
    obsess_over_service           0 ; We do not obsess over the server if in problem state
    check_freshness               0 ; We do not check this server for freshness
    flap_detection_enabled        0 ; Flap detection is disabled
    failure_prediction_enabled    0 ; We will wait for it to actually fail, thank you!!
    }

And even though it is checking every minute, it went straight to hard state on the first check that detected it down, and has stayed on check 1/3 hard state throughout.

I really don't understand what is happening here. The only thing different between this setup and my old Nagios box is the version: the old box was 3.31, this new server is 3.4.1. I am using the same config files that worked fine before.

Here are the debug log files from the above testing:

http://dl.dropbox.com/u/895609/nagios.debug1
http://dl.dropbox.com/u/895609/nagios.debug2

If you see anything please let me know, I'm getting angry with all the alerts!!! :-)

Thank you

-----Original Message-----
From: Giorgio Zarrelli [mailto:zarre...@linux.it]
Sent: 29 November 2012 19:24
To: Nagios Users List
Subject: Re: [Nagios-users] Nagios is ignoring the retry_interval setting

Hi,

do not see anything wrong. Could you set debug=-1 repeat the problem and put the log online?

Giorgio
Re: [Nagios-users] Nagios is ignoring the retry_interval setting
Hi Giorgio,

Apologies for the delay. I am doing this first thing tomorrow morning (Tue 4th Dec) and will post the debug log then.

Thank you

--
Keep yourself connected to Go Parallel:
BUILD Helping you discover the best ways to construct your parallel projects.
http://goparallel.sourceforge.net
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] Nagios is ignoring the retry_interval setting
Hi,

I do not see anything wrong. Could you set debug=-1, repeat the problem and put the log online?

Giorgio
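For reference, the request above corresponds to the debug directives in the Nagios 3 main configuration file (nagios.cfg), where the level is set with debug_level rather than a bare "debug" switch. The excerpt below is a minimal sketch: the file path and the size value are assumptions chosen for illustration, not values taken from this thread, and should be adapted to the local install.

```
# nagios.cfg (excerpt)

# -1 logs every debug category
debug_level=-1

# 0 = brief, 1 = more detailed, 2 = very detailed
debug_verbosity=2

# Assumed path; packaged installs often use a different location
debug_file=/usr/local/nagios/var/nagios.debug

# Maximum debug file size in bytes (assumed value); a larger limit
# helps keep a whole test run in a single file
max_debug_size=100000000
```

Nagios has to be restarted for changes to the main configuration file to take effect, and debug logging at this level grows quickly, so it is worth setting debug_level back to 0 once the test is finished.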
Re: [Nagios-users] Nagios is ignoring the retry_interval setting
Hi Giorgio,

The whole test cfg I am using to try to troubleshoot this can be found at:

http://dl.dropbox.com/u/895609/test.cfg

This is a direct copy of my main server's config, but with the rest of the servers and some templates for other server checks taken out.

Kind Regards
Andrew

From: Andrew Thompson
Sent: 29 November 2012 16:11
To: nagios-users@lists.sourceforge.net
Subject: Nagios is ignoring the retry_interval setting

[...]
Re: [Nagios-users] Nagios is ignoring the retry_interval setting
Hi,

please post the actual host definition here. Moreover, if the define host you wrote in your email is a template, why do I not see "register 0"?
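To illustrate the point about "register 0": in Nagios object configuration a template is normally marked as unregistered, and a real host then inherits from it with "use". The fragment below is only a sketch. The host name, alias and address are hypothetical, and most directives from the posted template are omitted for brevity, so it is not a complete, loadable definition on its own.

```
# Template: never monitored itself, only inherited from
define host{
    name                 host-server
    check_interval       1
    retry_interval       3
    max_check_attempts   3
    register             0      ; marks this definition as a template
    }

# Actual host (hypothetical values) inheriting the template settings
define host{
    use                  host-server
    host_name            server01
    alias                Example server
    address              192.0.2.10
    }
```

Without "register 0", Nagios treats the definition as a real object and expects the directives required for a host, such as host_name, to be present.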
Re: [Nagios-users] Nagios is ignoring the retry_interval setting
Hi,

wrong. retry_interval comes in when there is a state change. check_interval is the interval for "normal" checks. When there is a status change, retry_interval comes in ** until ** max_check_attempts is reached, then check_interval kicks in again.
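Read against the values from the posted templates, the rule above can be sketched as an annotated excerpt. This is an illustration under stated assumptions, not output from the poster's system: it assumes the default interval_length of 60 seconds in nagios.cfg (so interval values are minutes), a plain OK to non-OK transition, and no on-demand checks triggered by other objects. The clock times in the comments are made up to match the example in the original post.

```
define service{
    name                 service-server
    check_interval       1      ; used in OK state and again after a hard state
    retry_interval       3      ; used between soft-state rechecks
    max_check_attempts   3      ; third failed attempt makes the state hard
    # remaining directives omitted
    }

# Expected timeline under the assumptions above:
#   13:00  check fails  -> SOFT state, attempt 1/3, recheck in retry_interval
#   13:03  check fails  -> SOFT state, attempt 2/3, recheck in retry_interval
#   13:06  check fails  -> HARD state, attempt 3/3, notifications go out
#   13:07  onwards      -> checks run at check_interval while the problem lasts
```

If the observed behaviour differs from this timeline, for example a hard state on attempt 1/3, the debug log is the place to look for which check (scheduled or on-demand) produced the state change.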
Re: [Nagios-users] Nagios is ignoring the retry_interval setting
Your check_interval is set to 1, that takes precedence over retry_interval

g.

On Nov 29, 2012, at 9:10 AM, Andrew Thompson wrote:

> Hi,
>
> My nagios box has decided to stop listening to the retry_interval entry in my
> templates.
>
> [...]
[Nagios-users] Nagios is ignoring the retry_interval setting
Hi,

My Nagios box has decided to stop listening to the retry_interval entry in my templates.

My server template reads:

define host{
    name                          host-server
    check_period                  server_24x7
    check_interval                1
    retry_interval                3
    max_check_attempts            3
    notification_period           server_24x7
    notification_interval         3
    notification_options          d,r
    notifications_enabled         1
    contact_groups                servers email, servers sms
    event_handler_enabled         1
    process_perf_data             1
    retain_status_information     1
    retain_nonstatus_information  1
    passive_checks_enabled        0
    obsess_over_host              0
    check_freshness               0
    flap_detection_enabled        0
    failure_prediction_enabled    0
    }

Now this is what happens:

* Server goes down at 1pm.
* I check the next scheduled check and it clearly states 1.03pm
* But at 1.01pm it checks again and then spits out an email and text message saying the server is down.

Completely ignoring the retry_interval setting!!!

I'd expect from the above:

* 1pm server goes down
* 1.03pm check 2 is done
* 1.06pm check 3 is done and determined hard state.
* At 1.06pm the notification should be sent out.

Why is this, is something in my config wrong?

Ubuntu 12.04 desktop and Nagios 3.4.1

Thanks