Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-12-07 Thread FTL Nagios
Re-tested after changing the max file size of the debug file.

This one should contain everything from the moment I started Nagios to the
moment I stopped it during testing (approx. 10 minutes).

http://dl.dropbox.com/u/895609/nagios.debug

Thank you


Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-12-07 Thread FTL Nagios
Hi,

Apologies for the delay, I have been very busy with other things.

Right, I put Nagios into debug mode this morning and reran the tests.

I let it get a couple of successful pings to the server, then pulled the
network cable from it.

The behaviour is completely different this morning.

The host check is behaving now and rechecking every 3 minutes, as it is told
to in the host template. I got my text and email alert to say the host was
down when I expected it!

But now it is the service check that is running every 1 minute, which it is
not told to do when in a problem state.

My service template clearly sets a retry_interval of 3 minutes for the
problem state:

define service{
    name                          service-server            ; The name of this service template (used above in the checks)
    check_period                  server_24x7               ; Servers are monitored at all times
    check_interval                1                         ; Servers are checked every 1 minute when in OK state
    retry_interval                3                         ; Servers are checked every 3 minutes if in problem state
    max_check_attempts            3                         ; Servers are checked 3 times to determine if they are in an Up or Down state
    notification_period           server_24x7               ; Emails and texts are sent out at any time of day
    notification_interval         3                         ; Resend notifications every 3 minutes
    notification_options          c,r                       ; Only send alerts for servers in CRITICAL or RECOVERY state
    notifications_enabled         0                         ; Notifications are disabled
    contact_groups                servers email, servers sms ; Alerts sent to contacts in these groups
    event_handler_enabled         1                         ; Host event handler is enabled
    process_perf_data             1                         ; Performance data is processed
    retain_status_information     1                         ; Status info is kept between server restarts
    retain_nonstatus_information  1                         ; Non-status information is kept between server restarts
    passive_checks_enabled        0                         ; Passive checks are disabled
    obsess_over_service           0                         ; We do not obsess over the server if in problem state
    check_freshness               0                         ; We do not check this server for freshness
    flap_detection_enabled        0                         ; Flap detection is disabled
    failure_prediction_enabled    0                         ; We will wait for it to actually fail, thank you!!
}

And even though it is checking every minute, it went straight to a hard state
on the first check that detected it down, and it has stayed on check 1/3 hard
state throughout.

I really don't understand what is happening here.

The only thing different between this setup and my old Nagios box is the
version: the old box was 3.3.1, this new server is 3.4.1. I am using the same
config files that worked fine before.

Here are the debug log files from the above testing.

http://dl.dropbox.com/u/895609/nagios.debug1
http://dl.dropbox.com/u/895609/nagios.debug2

If you see anything please let me know, I'm getting angry with all the
alerts!!! :-)

Thank you


Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-12-03 Thread FTL Nagios
Hi Giorgio,

Apologies for the delay.

I am doing this first thing tomorrow morning (Tue 4th Dec). I will post the
debug log then.

Thank you



--
Keep yourself connected to Go Parallel: 
VERIFY Test and improve your parallel project with help from experts and
peers. http://goparallel.sourceforge.net
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting
any issue. 
::: Messages without supporting info will risk being sent to /dev/null




Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Giorgio Zarrelli
Hi,

I do not see anything wrong. Could you set debug_level=-1, repeat the
problem and put the log online?

Giorgio
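
For reference, the debug output requested above is controlled from the main
configuration file. The fragment below is a minimal sketch, assuming a stock
Nagios 3.x nagios.cfg; the file path and the size limit are illustrative
values and are not taken from this thread.

```cfg
# nagios.cfg (main configuration file), debug section

# -1 logs every debug category
debug_level=-1

# 0 = brief, 1 = more detailed, 2 = very detailed
debug_verbosity=2

# Illustrative path (assumption); adjust to your installation
debug_file=/usr/local/nagios/var/nagios.debug

# Illustrative limit in bytes (assumption); once the file grows past this
# size it is renamed with a .old extension and a new file is started
max_debug_file_size=100000000
```

Nagios needs to be restarted for changes to the main configuration file to
take effect. With debug_level=-1 the file grows quickly, so a generous
max_debug_file_size helps keep a whole test run in one file.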




Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Andrew Thompson
Hi Giorgio,

The whole test cfg I am using to try to troubleshoot this can be found at:

http://dl.dropbox.com/u/895609/test.cfg

This is a direct copy of my main server's config, but with the rest of the
servers and some templates for other server checks taken out.



Kind Regards
Andrew


Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Giorgio Zarrelli
Hi,

Please post the actual host definition here.

Moreover, if the "define host" you wrote in your email is a template, why do I
not see "register 0"?
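
The point about "register 0" can be illustrated with a short sketch. Only the
template name host-server and its three interval values come from the thread;
the host name, alias and address are hypothetical, and the remaining
directives of the original template are left out for brevity.

```cfg
# Template: never registered as a real host, only inherited from
define host{
    name                    host-server     ; Template name referenced by "use"
    check_interval          1
    retry_interval          3
    max_check_attempts      3
    register                0               ; Marks this definition as a template
    }

# Actual host (hypothetical values) that inherits the template settings
define host{
    use                     host-server
    host_name               example-server
    alias                   Example server
    address                 192.0.2.10
    }
```

A directive set directly in the actual host definition overrides the value
inherited from the template, which is why the actual host definition matters
when a template value appears to be ignored.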





Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Giorgio Zarrelli
Hi,

Wrong.

retry_interval comes in when there is a state change. check_interval is the
interval for "normal" checks. When there is a status change, the
retry_interval applies ** until ** max_check_attempts is reached, then
check_interval kicks in again.
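
As a worked example of this rule, the fragment below annotates the values
from the thread with the schedule they would be expected to produce. It is a
sketch only, assuming the default interval_length of 60 seconds and a check
that keeps failing; it ignores anything else that may reschedule a check.

```cfg
# Fragment of a service template; other directives omitted
define service{
    name                    service-server
    check_interval          1       ; Normal checks: every 1 minute
    retry_interval          3       ; Rechecks after a state change: every 3 minutes
    max_check_attempts      3       ; Third consecutive failure makes the state HARD
    }

# Expected schedule if the check first fails at 13:00:
#   13:00  attempt 1/3  SOFT
#   13:03  attempt 2/3  SOFT  (retry_interval)
#   13:06  attempt 3/3  HARD  (notification is sent here, if enabled)
#   13:07  next check         (check_interval applies again)
```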






Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Gary Every
Your check_interval is set to 1; that takes precedence over retry_interval.

g.;


[Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Andrew Thompson
Hi,

My Nagios box has decided to stop listening to the retry_interval entry in my
templates.

My server template reads:

define host{
    name                          host-server
    check_period                  server_24x7
    check_interval                1
    retry_interval                3
    max_check_attempts            3
    notification_period           server_24x7
    notification_interval         3
    notification_options          d,r
    notifications_enabled         1
    contact_groups                servers email, servers sms
    event_handler_enabled         1
    process_perf_data             1
    retain_status_information     1
    retain_nonstatus_information  1
    passive_checks_enabled        0
    obsess_over_host              0
    check_freshness               0
    flap_detection_enabled        0
    failure_prediction_enabled    0
    }

Now this is what happens:


* Server goes down at 1pm.

* I check the next scheduled check and it clearly states 1.03pm

* But at 1.01pm it checks again and then spits out an email and text 
message saying the server is down.

This completely ignores the retry_interval setting!!!

I'd expect from the above:


* 1pm server goes down

* 1.03pm check 2 is done

* 1.06pm check 3 is done and a hard state is determined.

* At 1.06pm the notification should be sent out.

Why is this? Is something in my config wrong?

Ubuntu 12.04 desktop and Nagios 3.4.1

Thanks

