Re: [Nagios-users] Problem: Nagios service check retry interval shorter than configured.

2011-03-11 Thread Paul M. Dubuc
Jelle Smet wrote:
>> Why is there only 10 seconds between these pairs of checks?  Sometimes I see
>> a 20 or 30 second difference, sometimes 60 seconds.  Most of them are less
>> than 30 seconds.  It's very inconsistent.  Any idea what could be causing this?
>
> Hi Paul,
>
> I have been looking into this myself the last couple of days.
>
> Nagios does "on demand" host checks; the reason for this is explained here:
> http://nagios.sourceforge.net/docs/3_0/hostchecks.html
> It basically means Nagios executes the host check whenever it thinks it needs
> to do so.
>
> You could alter the cached host check horizon
> (http://nagios.sourceforge.net/docs/3_0/cachedchecks.html) so Nagios does "on
> demand" checks less frequently and uses older host results instead.
>
> What I'm personally wondering is whether "on demand" checks should count as
> retries.  They do at the moment, and that makes the 'retry_interval'
> parameter virtually useless.
>
> Hope this helps,
>
> Jelle Smet
> http://www.smetj.net

Thanks, I think I understand how this works.  But I'm having this problem with
service checks, not host checks.  I do have the concurrent service check limit
set to 30, and I wonder if that is affecting the scheduling of service check
retries.  If so, though, I would expect it to make the retry interval longer
than specified, not shorter.  Does anyone know if service check retries are
subject to the concurrency limit?

Paul Dubuc

--
Colocation vs. Managed Hosting
A question and answer guide to determining the best fit
for your organization - today and in the future.
http://p.sf.net/sfu/internap-sfd2d
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] Problem: Nagios service check retry interval shorter than configured.

2011-03-11 Thread Jelle Smet
> Why is there only 10 seconds between these pairs of checks?  Sometimes I see
> a 20 or 30 second difference, sometimes 60 seconds.  Most of them are less
> than 30 seconds.  It's very inconsistent.  Any idea what could be causing this?

Hi Paul,

I have been looking into this myself the last couple of days.

Nagios does "on demand" host checks; the reason for this is explained here:
http://nagios.sourceforge.net/docs/3_0/hostchecks.html
It basically means Nagios executes the host check whenever it thinks it needs
to do so.

You could alter the cached host check horizon
(http://nagios.sourceforge.net/docs/3_0/cachedchecks.html) so Nagios does "on
demand" checks less frequently and uses older host results instead.
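
To illustrate the idea, here is a minimal Python sketch of how a cached check
horizon behaves.  This is not Nagios source code; the function name and
arguments are made up for illustration.  It only assumes what the cached
checks page describes: a previous result is reused when it is no older than
the horizon (cached_host_check_horizon in nagios.cfg, in seconds), and a
horizon of 0 disables caching.

```python
def should_use_cached_result(last_check_time, now, horizon_seconds):
    """Decide whether an on-demand check can reuse the previous result.

    Hypothetical helper for illustration only. All times are in seconds
    (for example, Unix timestamps).
    """
    if horizon_seconds <= 0:
        # A horizon of 0 means caching is disabled: always run a fresh check.
        return False
    age = now - last_check_time
    # Reuse the old result only if it is no older than the horizon.
    return age <= horizon_seconds


if __name__ == "__main__":
    # Last host check ran 20 seconds ago.
    print(should_use_cached_result(1000, 1020, 15))  # too old for a 15 s horizon
    print(should_use_cached_result(1000, 1020, 60))  # fresh enough for a 60 s horizon
```

With a larger horizon, more "on demand" checks are answered from the cache
instead of running the check command again.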

What I'm personally wondering is whether "on demand" checks should count as
retries.  They do at the moment, and that makes the 'retry_interval'
parameter virtually useless.

Hope this helps,

Jelle Smet
http://www.smetj.net




[Nagios-users] Problem: Nagios service check retry interval shorter than configured.

2011-03-09 Thread Paul M. Dubuc
I have Nagios Core 3.2.3 built on SuSE 11.1 and I've been noticing an apparent
problem with service check retries.  The normal check interval is set to 7.5
minutes and the retry interval is set to 1 minute.  I'm seeing entries like
this in the log:

[03-02-2011 16:44:39] SERVICE ALERT: aps11;Extra_01.20;OK;SOFT;2;SELRC OK

[03-02-2011 16:44:29] SERVICE ALERT: aps11;Extra_01.20;UNKNOWN;SOFT;1;SELRC UNKNOWN - Timeout (130 sec.) reached


[03-02-2011 13:28:19] SERVICE ALERT: aps14;Extra_04.15;OK;SOFT;2;SELRC OK

[03-02-2011 13:28:09] SERVICE ALERT: aps14;Extra_04.15;CRITICAL;SOFT;1;SELRC CRITICAL

Why is there only 10 seconds between these pairs of checks?  Sometimes I see a
20 or 30 second difference, sometimes 60 seconds.  Most of them are less than
30 seconds.  It's very inconsistent.  Any idea what could be causing this?
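
For anyone who wants to measure these gaps across a whole log, here is a
rough Python sketch.  It assumes every alert sits on one line in the form
shown above (timestamp in brackets, then semicolon-separated fields) and
that the date is MM-DD-YYYY, which is a guess on my part.  The function names
are made up for illustration.  Note that it measures the time between logged
alerts, which is not necessarily the time between the starts of the checks.

```python
import re
from datetime import datetime

# Matches lines such as:
# [03-02-2011 16:44:29] SERVICE ALERT: host;service;STATE;SOFT;1;plugin output
ALERT_RE = re.compile(
    r"^\[(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<type>[^;]+);(?P<attempt>\d+);(?P<output>.*)$"
)


def parse_alert(line):
    """Return (time, host, service, attempt), or None if the line does not match."""
    match = ALERT_RE.match(line.strip())
    if match is None:
        return None
    # Assumption: the date is MM-DD-YYYY. For gaps within one day this
    # choice does not change the result.
    when = datetime.strptime(match.group("ts"), "%m-%d-%Y %H:%M:%S")
    return when, match.group("host"), match.group("service"), int(match.group("attempt"))


def alert_gaps(lines):
    """Return (host, service, seconds) for consecutive alerts of the same service."""
    alerts = sorted(a for a in (parse_alert(line) for line in lines) if a is not None)
    last_seen = {}
    gaps = []
    for when, host, service, _attempt in alerts:
        key = (host, service)
        if key in last_seen:
            seconds = int((when - last_seen[key]).total_seconds())
            gaps.append((host, service, seconds))
        last_seen[key] = when
    return gaps


SAMPLE = [
    "[03-02-2011 16:44:39] SERVICE ALERT: aps11;Extra_01.20;OK;SOFT;2;SELRC OK",
    "[03-02-2011 16:44:29] SERVICE ALERT: aps11;Extra_01.20;UNKNOWN;SOFT;1;"
    "SELRC UNKNOWN - Timeout (130 sec.) reached",
    "[03-02-2011 13:28:19] SERVICE ALERT: aps14;Extra_04.15;OK;SOFT;2;SELRC OK",
    "[03-02-2011 13:28:09] SERVICE ALERT: aps14;Extra_04.15;CRITICAL;SOFT;1;SELRC CRITICAL",
]

if __name__ == "__main__":
    for host, service, seconds in alert_gaps(SAMPLE):
        print(f"{host} {service}: {seconds}s between alerts")
```

Run against the four entries above, it reports a 10 second gap for each pair.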

Thanks,
Paul Dubuc
