[Nagios-users] Service checks for redundant hosts

2013-07-30 Thread Ben Prew
Hey,

I'm looking for some suggestions for implementing a service check on a
redundant host pair that accesses a shared resource.

Here's our setup:

We have N hosts that process (via delayed_job) a shared job queue
(mysql/redis).  We have several checks that are host-specific (# of workers
on that host), but we also have several checks that examine the shared job
queue (# of unprocessed jobs).

I have several possible implementations:


1. Shared Job Queue check on single processing host (current setup)

Pros:
* We only get notified once when the shared queue is high

Cons:
* If the single host goes down, we lose the shared queue check


2. Shared Job Queue check on all processing hosts

Pros:
* If a single processing host goes down, the shared queue check still
functions

Cons:
* Multiple emails from hosts when the shared check fails


3. Shared Job Queue check on job queue host (i.e. the DB box)

Pros:
* If the DB goes down, you can't reach the queue anyway
* Single email on failure

Cons:
* The check requires app knowledge, which requires having the app deployed
on the job queue host

How are others adding a check like this?  Go with #2 and just bite the
bullet on the multiple emails?

Thanks
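
Whichever host the shared check ends up attached to, the plugin itself can
stay free of host-specific state. A minimal sketch, assuming the number of
unprocessed jobs has already been read from the queue (mysql/redis): the
function name and the warn/crit thresholds are hypothetical and only for
illustration; the 0/1/2/3 return codes are the standard Nagios plugin
convention.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_queue_depth(unprocessed, warn=100, crit=500):
    """Map a job count to a (return_code, status_line) pair.

    The thresholds are hypothetical defaults, not values from this thread.
    """
    if unprocessed is None or unprocessed < 0:
        return UNKNOWN, "QUEUE UNKNOWN - could not read job count"
    if unprocessed >= crit:
        return CRITICAL, f"QUEUE CRITICAL - {unprocessed} unprocessed jobs"
    if unprocessed >= warn:
        return WARNING, f"QUEUE WARNING - {unprocessed} unprocessed jobs"
    return OK, f"QUEUE OK - {unprocessed} unprocessed jobs"
```

A wrapper script would print the status line and exit with the return code,
so the same plugin can be run from a processing host, the DB box, or the
Nagios server.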
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] service checks running too often

2012-12-14 Thread Andreas Ericsson
On 12/14/2012 04:19 PM, Mark Keisler wrote:
> What you propose sounds acceptable.  In the meantime  I need to be careful
> about reloading nagios :).  Once I get it in that state, I have to disable
> use_retained_scheduling_info and then do a full restart.
> 

I've actually checked Nagios 4 now, and it appears we don't do this there.
I didn't test it all that thoroughly (and I probably should), but it's
Friday and I'm two beers past my best-before-thinking hour, so I'll just
refrain from trying it further today.

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



Re: [Nagios-users] service checks running too often

2012-12-14 Thread Mark Keisler
What you propose sounds acceptable.  In the meantime  I need to be careful
about reloading nagios :).  Once I get it in that state, I have to disable
use_retained_scheduling_info and then do a full restart.


On Fri, Dec 14, 2012 at 3:41 AM, Andreas Ericsson  wrote:

> On 12/14/2012 05:13 AM, Mark Keisler wrote:
> > I think I found the issue.  If I happen to send a reload (HUP) to nagios
> > while a service check is in progress (fairly easy since my service check
> > is rather long lived), the reloaded nagios doesn't seem to know about that
> > service check and so I'll end up with another being scheduled as well as
> > the original on its schedule.  Create a dummy service check that just
> > sleeps for 30 seconds or something and issue a reload while it is running
> > and see if your nagios instance will start another sequence of service
> > checks.
> >
>
> This should be pretty easily fixed by just adding a check reaping event
> before initializing the event queue and skipping all checks that have
> already been scheduled.
>
> I'll have to add a check for it in 4.x. Since we keep workers between
> reloads, the same thing can easily happen there.
>
> That means we'll reschedule all checks like normal when we're starting,
> but if a check result comes in when a new check is already scheduled,
> we'll remove the old event and reschedule a new one according to the
> retry interval. I'd suggest doing something similar in the 3.4.x
> branch, but I'm not sure I can commit to that one without doing a new
> svn clone, and that takes at least a day.
>
> Mark; Would that be acceptable to you?
>
> Oh, and good catch :)

Re: [Nagios-users] service checks running too often

2012-12-14 Thread Andreas Ericsson
On 12/14/2012 05:13 AM, Mark Keisler wrote:
> I think I found the issue.  If I happen to send a reload (HUP) to nagios
> while a service check is in progress (fairly easy since my service check is
> rather long lived), the reloaded nagios doesn't seem to know about that
> service check and so I'll end up with another being scheduled as well as
> the original on its schedule.  Create a dummy service check that just
> sleeps for 30 seconds or something and issue a reload while it is running
> and see if your nagios instance will start another sequence of service
> checks.
> 

This should be pretty easily fixed by just adding a check reaping event
before initializing the event queue and skipping all checks that have
already been scheduled.

I'll have to add a check for it in 4.x. Since we keep workers between
reloads, the same thing can easily happen there.

That means we'll reschedule all checks like normal when we're starting,
but if a check result comes in when a new check is already scheduled,
we'll remove the old event and reschedule a new one according to the
retry interval. I'd suggest doing something similar in the 3.4.x
branch, but I'm not sure I can commit to that one without doing a new
svn clone, and that takes at least a day.

Mark, would that be acceptable to you?

Oh, and good catch :)
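
The rescheduling rule described above can be illustrated with a toy model.
This is only a sketch under stated assumptions, not Nagios source code: the
event queue is modelled as a plain list of (run_at, service) tuples, and the
function name and parameters are hypothetical.

```python
def reschedule_on_result(events, service, now, interval):
    """Return a new event list with exactly one pending event for `service`.

    `events` is a list of (run_at, service) tuples. Any event already
    queued for the service is dropped before the next run is added, so a
    result that arrives after a reload cannot start a second check sequence.
    """
    kept = [(run_at, name) for (run_at, name) in events if name != service]
    kept.append((now + interval, service))
    return sorted(kept)
```

For example, if a reload queued "poller" for t=100 and the result of the
check that was already in flight arrives at t=60, the t=100 event is removed
and a single new event is queued for t=60 plus the interval.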

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



Re: [Nagios-users] service checks running too often

2012-12-13 Thread Mark Keisler
I think I found the issue.  If I happen to send a reload (HUP) to nagios
while a service check is in progress (fairly easy since my service check is
rather long lived), the reloaded nagios doesn't seem to know about that
service check and so I'll end up with another being scheduled as well as
the original on its schedule.  Create a dummy service check that just
sleeps for 30 seconds or something and issue a reload while it is running
and see if your nagios instance will start another sequence of service
checks.
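
A dummy check along those lines might look like the sketch below. It is an
illustration only: the log path and function names are hypothetical, and the
log line simply mimics the "timestamp PID started by PPID" tracking format
from the original report in this thread.

```python
import os
import time
from datetime import datetime


def log_line(event, now=None, pid=None, ppid=None):
    """Format one tracking line: timestamp, PID, PPID and Start/End."""
    now = now or datetime.now()
    pid = os.getpid() if pid is None else pid
    ppid = os.getppid() if ppid is None else ppid
    stamp = now.strftime("%Y-%m-%d_%H:%M:%S")
    return f"{stamp} {pid} started by {ppid} : Poll {event} of poller"


def dummy_check(sleep_seconds=30, log_path="/tmp/dummy_poller.log"):
    """Log a Start line, sleep, log an End line, then report OK (0)."""
    with open(log_path, "a") as log:
        log.write(log_line("Start") + "\n")
        log.flush()  # make the Start line visible while the check sleeps
        time.sleep(sleep_seconds)
        log.write(log_line("End") + "\n")
    print(f"DUMMY OK - slept {sleep_seconds} seconds")
    return 0
```

Sending the reload while the check is sleeping, then watching the log for a
second interleaved sequence of Start lines, should reproduce the behaviour.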


On Thu, Dec 13, 2012 at 2:37 PM, Mike Guthrie  wrote:

> On 12/13/2012 12:38 PM, Mark Keisler wrote:
>
> > I understand that nagios dynamically adjusts service check times, but the
> > puzzling thing is that there is a check that runs every 5 minutes but then
> > an extra or two in between.  And yes, the web interface shows the next
> > service check as 5 mins out and yet another runs before that time hits.
>
> Is there any chance that there could be a second instance of Nagios
> running?   Look for multiple *parent* processes from the following
>
> #modify the nagios binary path to match your system
>
> ps aux | grep /bin/nagios
>
> /etc/init.d/nagios stop
>
> killall -9 nagios
>
> /etc/init.d/nagios start

Re: [Nagios-users] service checks running too often

2012-12-13 Thread Mark Keisler
There isn't a second nagios instance.  While I was watching the pollers
spawn, they all led back to the same master nagios instance.


On Thu, Dec 13, 2012 at 2:37 PM, Mike Guthrie  wrote:

> On 12/13/2012 12:38 PM, Mark Keisler wrote:
>
> > I understand that nagios dynamically adjusts service check times, but the
> > puzzling thing is that there is a check that runs every 5 minutes but then
> > an extra or two in between.  And yes, the web interface shows the next
> > service check as 5 mins out and yet another runs before that time hits.
>
> Is there any chance that there could be a second instance of Nagios
> running?   Look for multiple *parent* processes from the following
>
> #modify the nagios binary path to match your system
>
> ps aux | grep /bin/nagios
>
> /etc/init.d/nagios stop
>
> killall -9 nagios
>
> /etc/init.d/nagios start

Re: [Nagios-users] service checks running too often

2012-12-13 Thread Mike Guthrie


On 12/13/2012 12:38 PM, Mark Keisler wrote:
> I understand that nagios dynamically adjusts service check times, but
> the puzzling thing is that there is a check that runs every 5 minutes
> but then an extra or two in between.  And yes, the web interface shows
> the next service check as 5 mins out and yet another runs before that
> time hits.

Is there any chance that there could be a second instance of Nagios
running?  Look for multiple *parent* processes from the following:


#modify the nagios binary path to match your system

ps aux | grep /bin/nagios

/etc/init.d/nagios stop

killall -9 nagios

/etc/init.d/nagios start
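
Since plain "ps aux" output has no parent-PID column, spotting multiple
*parent* processes by eye can be awkward. A minimal sketch, assuming the
output of "ps -eo pid,ppid,args" has been captured as text; the function
name and the default path fragment are hypothetical and should be adjusted
to match the system.

```python
def nagios_parent_pids(ps_output, needle="/bin/nagios"):
    """Return PIDs of matching processes whose parent is not also a match.

    `ps_output` is the text of `ps -eo pid,ppid,args`, header line included.
    More than one PID in the result suggests more than one Nagios instance.
    """
    matches = {}
    for line in ps_output.strip().splitlines()[1:]:  # skip the header
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, ppid, args = int(parts[0]), int(parts[1]), parts[2]
        if needle in args:
            matches[pid] = ppid
    return sorted(pid for pid, ppid in matches.items() if ppid not in matches)
```

Forked children of the main daemon are filtered out because their parent is
itself a matching process; only top-level instances remain in the result.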







Re: [Nagios-users] service checks running too often

2012-12-13 Thread Mark Keisler
I understand that nagios dynamically adjusts service check times, but the
puzzling thing is that there is a check that runs every 5 minutes but then
an extra or two in between.  And yes, the web interface shows the next
service check as 5 mins out and yet another runs before that time hits.


On Thu, Dec 13, 2012 at 10:24 AM, Mike Guthrie  wrote:

>  Although some of those start times do seem close together, it's
> important to know that the check_interval in Nagios is not necessarily a
> hard number. Nagios is continually adjusting and recalculating the check
> schedule, so if you need a check to run on a hard 5mn schedule, you might
> be better off using cron, and then pushing the result to Nagios passively.
>
> With that said, access the service details for this service. When new
> results come in does the scheduler set the Next Check 5mn out as expected?

Re: [Nagios-users] service checks running too often

2012-12-13 Thread Mike Guthrie
Although some of those start times do seem close together, it's
important to know that the check_interval in Nagios is not necessarily a
hard number. Nagios is continually adjusting and recalculating the check
schedule, so if you need a check to run on a hard 5-minute schedule, you
might be better off using cron, and then pushing the result to Nagios
passively.


With that said, access the service details for this service. When new
results come in, does the scheduler set the Next Check 5 minutes out as
expected?
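
For the cron route, the result is normally handed to Nagios as a
PROCESS_SERVICE_CHECK_RESULT external command. A minimal sketch of building
and submitting that line: the function names are hypothetical, and the
command file path is an assumption that depends on the installation. It also
assumes external commands are enabled and that the service accepts passive
checks.

```python
import time


def passive_result_line(host, service, return_code, output, now=None):
    """Build a PROCESS_SERVICE_CHECK_RESULT external command line."""
    stamp = int(time.time() if now is None else now)
    return (
        f"[{stamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def submit(line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the command to the Nagios command file (path is an assumption)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)
```

A cron job would run the poller every five minutes, build the line from its
return code and output, and call submit().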



--


Mike Guthrie
Technical Team
___
Nagios Enterprises, LLC
Email:  mguth...@nagios.com
Web:www.nagios.com


[Nagios-users] service checks running too often

2012-12-13 Thread Mark Keisler
I'm running Nagios 3.4.1 on RHEL6. I have an issue where I have a poller
(service check) that is running too often and I am not sure why. I have
"service_check_timeout=180" because I had trouble with the poller running
long. Relevant settings for the service check:

check_period            24x7
max_check_attempts      1
normal_check_interval   5
retry_check_interval    5

I also set up a tracking logger in the poller to record "timestamp PID
started by PPID : Poll [Start|End] of poller"
2012-12-12_12:26:38 19448 started by 19442 : Poll Start of poller
2012-12-12_12:27:13 19448 started by 19442 : Poll End of poller
2012-12-12_12:28:14 19931 started by 19930 : Poll Start of poller
2012-12-12_12:30:14 19931 started by 19930 : Poll End of poller
2012-12-12_12:31:37 20467 started by 20460 : Poll Start of poller
2012-12-12_12:33:15 20949 started by 20946 : Poll Start of poller
2012-12-12_12:33:15 20467 started by 20460 : Poll End of poller
2012-12-12_12:33:41 20949 started by 20946 : Poll End of poller
2012-12-12_12:36:38 21483 started by 21478 : Poll Start of poller
2012-12-12_12:38:14 21971 started by 21964 : Poll Start of poller
2012-12-12_12:39:17 21483 started by 21478 : Poll End of poller
2012-12-12_12:39:18 21971 started by 21964 : Poll End of poller
2012-12-12_12:41:38 22500 started by 22492 : Poll Start of poller
2012-12-12_12:42:19 22500 started by 22492 : Poll End of poller
2012-12-12_12:43:14 23003 started by 22999 : Poll Start of poller
2012-12-12_12:45:20 23003 started by 22999 : Poll End of poller
2012-12-12_12:46:37 23540 started by 23535 : Poll Start of poller
2012-12-12_12:48:14 24025 started by 24024 : Poll Start of poller
2012-12-12_12:48:20 23540 started by 23535 : Poll End of poller
2012-12-12_12:48:41 24025 started by 24024 : Poll End of poller
2012-12-12_12:51:38 24558 started by 24554 : Poll Start of poller
2012-12-12_12:53:14 25044 started by 25041 : Poll Start of poller
2012-12-12_12:54:35 25044 started by 25041 : Poll End of poller

As you can see, I start to get overlapping pollers. I don't understand why
this would happen. Any hints or clues?
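
A minimal sketch, not part of the original post, of how the tracking log above could be checked for overlaps mechanically. It assumes the line layout "timestamp PID started by PPID : Poll Start|End of poller" described above; the function names parse_runs, find_overlaps and start_gaps are illustrative, and the input is a short excerpt of the log.

```python
from datetime import datetime

# Excerpt copied from the tracking log above.
LOG = """\
2012-12-12_12:36:38 21483 started by 21478 : Poll Start of poller
2012-12-12_12:38:14 21971 started by 21964 : Poll Start of poller
2012-12-12_12:39:17 21483 started by 21478 : Poll End of poller
2012-12-12_12:39:18 21971 started by 21964 : Poll End of poller
2012-12-12_12:41:38 22500 started by 22492 : Poll Start of poller
2012-12-12_12:42:19 22500 started by 22492 : Poll End of poller
"""


def parse_runs(log):
    """Return (pid, start, end) tuples sorted by start; end is None if no End line was seen."""
    runs = {}
    for line in log.splitlines():
        fields = line.split()
        # Expected fields: stamp, pid, "started", "by", ppid, ":", "Poll", event, "of", "poller"
        if len(fields) != 10 or fields[6] != "Poll":
            continue
        stamp = datetime.strptime(fields[0], "%Y-%m-%d_%H:%M:%S")
        pid = int(fields[1])
        if fields[7] == "Start":
            runs[pid] = [stamp, None]
        elif fields[7] == "End" and pid in runs:
            runs[pid][1] = stamp
    return sorted(((pid, s, e) for pid, (s, e) in runs.items()), key=lambda r: r[1])


def find_overlaps(runs):
    """PID pairs where the later run started before the earlier one ended."""
    pairs = []
    for (pid_a, _, end_a), (pid_b, start_b, _) in zip(runs, runs[1:]):
        if end_a is not None and start_b < end_a:
            pairs.append((pid_a, pid_b))
    return pairs


def start_gaps(runs):
    """Seconds between consecutive poller starts."""
    return [int((b[1] - a[1]).total_seconds()) for a, b in zip(runs, runs[1:])]


runs = parse_runs(LOG)
overlaps = find_overlaps(runs)
gaps = start_gaps(runs)
print(overlaps)  # PID pairs that ran at the same time
print(gaps)      # compare against the 300 s implied by a 5-minute interval
```

In the full log the start times fall into two interleaved series, one near :x1:38 and :x6:38 and one near :x3:14 and :x8:14, each roughly five minutes apart. That would be consistent with two independent 5-minute schedules launching the same poller, though the log alone does not prove it.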

Re: [Nagios-users] service checks could not be rescheduled properly.

2012-03-19 Thread Julian_Grunnell
Julian Grunnell | Unix Analyst, Infrastructure | TD Direct Investing
T: +44 (0) 113 346 2824 | M: +44 (0) 7889 352527



From: Andreas Ericsson
Date: 16/03/2012 10:14
Reply-To: Nagios Users List
To: Nagios Users List
Cc: julian_grunn...@tdwh.co.uk
Subject: Re: [Nagios-users] service checks could not be rescheduled properly.

On 03/15/2012 11:31 AM, julian_grunn...@tdwh.co.uk wrote:
> Anyone ... thought I'd go back to this. Does anyone have any ideas why I
> would get the following in the Nagios logs:
> 
> nagios-03-10-2012-00.log:[1331336760] Warning: Check of service 'DEAL
> SERVER SERVICE TCP 4099' on host 'TDUKUBS01' could not be rescheduled
> properly.  Scheduling check for next week...
> nagios-03-10-2012-00.log:[1331336760] Warning: Check of service 'DEAL
> SERVER SERVICE TCP 4099' on host 'TDUKUBS02' could not be rescheduled
> properly.  Scheduling check for next week...
> 
> As mentioned below, I've checked the config multiple times, I run 1100+
> service checks across 100+ hosts and just these two are causing a problem
> - they differ in that they have specific time periods defined - all
> detailed below. I've checked the various reports that NTP is at fault but
> it made no difference at all.
> 

>> define timeperiod{
>> timeperiod_name ubs4099hours
>> alias UBS 4099 Dealserver Monitoring Hours
>> monday 23:46-20:30
>> tuesday 23:46-20:30
>> wednesday 23:46-20:30
>> thursday 23:46-20:30
>> friday 23:46-20:30
>> }
>>

This timeperiod isn't valid. FROM 23:46 TO 20:30 on the same date,
there is no time for any checks to be executed in.

00:00-20:30,23:46-00:00

should work better, unless you meant "20:30-23:46", but I guess you
wouldn't have screwed it up if that's what you intended.

hth

-- 


Thanks Andreas - well, you're right: I changed my times as above and the
scheduling now works. To be honest, I have no idea why I thought it would
work the way I had it; it just didn't occur to me.

So thanks again - happy now.

J.



Re: [Nagios-users] service checks could not be rescheduled properly.

2012-03-16 Thread Andreas Ericsson
On 03/15/2012 11:31 AM, julian_grunn...@tdwh.co.uk wrote:
> Anyone ... thought I'd go back to this. Does anyone have any ideas why I
> would get the following in the Nagios logs:
> 
> nagios-03-10-2012-00.log:[1331336760] Warning: Check of service 'DEAL
> SERVER SERVICE TCP 4099' on host 'TDUKUBS01' could not be rescheduled
> properly.  Scheduling check for next week...
> nagios-03-10-2012-00.log:[1331336760] Warning: Check of service 'DEAL
> SERVER SERVICE TCP 4099' on host 'TDUKUBS02' could not be rescheduled
> properly.  Scheduling check for next week...
> 
> As mentioned below, I've checked the config multiple times, I run 1100+
> service checks across 100+ hosts and just these two are causing a problem
> - they differ in that they have specific time periods defined - all
> detailed below. I've checked the various reports that NTP is at fault but
> it made no difference at all.
> 

>> define timeperiod{
>> timeperiod_name ubs4099hours
>> alias UBS 4099 Dealserver Monitoring Hours
>> monday 23:46-20:30
>> tuesday 23:46-20:30
>> wednesday 23:46-20:30
>> thursday 23:46-20:30
>> friday 23:46-20:30
>> }
>>

This timeperiod isn't valid. FROM 23:46 TO 20:30 on the same date,
there is no time for any checks to be executed in.

00:00-20:30,23:46-00:00

should work better, unless you meant "20:30-23:46", but I guess you
wouldn't have screwed it up if that's what you intended.

hth
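
A minimal sketch, not from the original reply, of the rule described above: a range whose start is later than its end on the same day leaves no usable time, so it has to be split into a piece before midnight and a piece after it. The helper names to_minutes and split_wrapping_range are hypothetical. The end of day is written as 24:00 here, the form used in the Nagios sample timeperiods (for example 00:00-24:00), whereas the reply above writes it as 00:00.

```python
def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def split_wrapping_range(time_range):
    """Return same-day ranges equivalent to 'HH:MM-HH:MM'.

    A range that already runs forward within one day is returned unchanged.
    A range that wraps past midnight is split into the early-morning piece
    and the late-evening piece.
    """
    start, end = time_range.split("-")
    if to_minutes(start) < to_minutes(end):
        return [time_range]
    pieces = []
    if to_minutes(end) > 0:
        pieces.append("00:00-" + end)
    pieces.append(start + "-24:00")
    return pieces


# The range from the timeperiod definition quoted above.
print(",".join(split_wrapping_range("23:46-20:30")))
```

Each piece stays on the weekday it was declared for, so a Monday entry covers Monday morning and late Monday evening rather than Monday night into Tuesday.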

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



Re: [Nagios-users] service checks could not be rescheduled properly.

2012-03-15 Thread Julian_Grunnell
Anyone ... thought I'd go back to this. Does anyone have any ideas why I 
would get the following in the Nagios logs:

nagios-03-10-2012-00.log:[1331336760] Warning: Check of service 'DEAL 
SERVER SERVICE TCP 4099' on host 'TDUKUBS01' could not be rescheduled 
properly.  Scheduling check for next week...
nagios-03-10-2012-00.log:[1331336760] Warning: Check of service 'DEAL 
SERVER SERVICE TCP 4099' on host 'TDUKUBS02' could not be rescheduled 
properly.  Scheduling check for next week...

As mentioned below, I've checked the config multiple times, I run 1100+ 
service checks across 100+ hosts and just these two are causing a problem 
- they differ in that they have specific time periods defined - all 
detailed below. I've checked the various reports that NTP is at fault but 
it made no difference at all.



And if I force a check, it works and then schedules to run again tonight 
at 23:46 when I want the time period to begin:




Thanks - Julian.

Julian Grunnell | Unix Analyst, Infrastructure | TD Direct Investing
T: +44 (0) 113 346 2824 | M: +44 (0) 7889 352527



From: julian_grunn...@tdwh.co.uk
Date: 14/11/2011 14:49
Reply-To: Nagios Users List
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] service checks could not be rescheduled properly.

Hi - thanks, I should have said this is one of the first posts I read. It has
lots of mentions of NTPD, with claims both that it should be fixed and that it
still doesn't work. In my case, whether I enable NTPD or not makes no
difference: my checks for specific hosts are still getting rescheduled for
next week. I run multiple Nagios instances and it's only affecting this one -
and annoyingly these checks are some of the most important but just happen to
need to run at specific times of the day.

Any other help / ideas would be appreciated. 

Thanks - J. 

Julian Grunnell | Unix Analyst, Infrastructure | TD Waterhouse
T: +44 (0) 113 346 2824 | M: +44 (0) 7889 352527 


From: Brandon Phelps
Date: 11/11/2011 16:03
Reply-To: Nagios Users List
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] service checks could not be rescheduled properly.

Check out this bug report on the nagios.org bug tracker:

http://tracker.nagios.org/view.php?id=31


On 11/11/2011 10:39 AM, julian_grunn...@tdwh.co.uk wrote:
>
> Hi - does anyone know the answer to the following errors I'm getting in
> the nagios.log for a handful of hosts that have a specific timeperiod
> for checks set.
>
> [Wed Nov 9 23:46:00 2011] Warning: Check of service 'DEAL SERVER SERVICE
> TCP 4099' on host 'TDUKUBS02' could not be rescheduled properly.
> Scheduling check for next week...
> [Wed Nov 9 23:46:00 2011] Warning: Check of service 'DEAL SERVER SERVICE
> TCP 4099' on host 'TDUKUBS01' could not be rescheduled properly.
> Scheduling check for next week...
>
> Obviously this is causing major issues as checks are just not being
> carried out as expected, the hosts / services are defined as follows:
>
> define host{
> use unix_host_template ; Name of host template to use
> icon_image win40.jpg
> host_name TDUKUBS01
> address 192.168.75.84
> alias TDUKUBS01 192.168.75.84
> }
>
> define service{
> use ubs-service
> hostgroup_name ubsdealservers
> service_description DEAL SERVER SERVICE TCP 4099
> check_command check_ubs!4099
> check_period ubs4099hours
> }
>
> define service{
> name ubs-service
> is_volatile 0
> max_check_attempts 5
> normal_check_interval 1
> notification_interval 5
> retry_check_interval 1
> contact_groups dtns-ooh
> active_checks_enabled 1 ; Active service checks are enabled
> passive_checks_enabled 1 ; Passive service checks are enabled/accepted
> parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
> obsess_over_service 1 ; We should obsess over this service (if necessary)
> check_freshness 0 ; Default is to NOT check service 'freshness'
> notifications_enabled 1 ; Service notifications are enabled
> event_handler_enabled 1 ; Service event handler is enabled
> flap_detection_enabled 1 ; Flap detection is enabled
> failure_prediction_enabled 1 ; Failure prediction is enabled
> process_perf_data 1 ; Process performance data
> retain_status_information 1 ; Retain status information across program restarts
> retain_nonstatus_information 1 ; Retain non-status information across program restarts
> is_volatile 0 ; The service is not volatile
> register 0
> }
>
> define timeperiod{
> timeperiod_name ubs4099hours
> alias UBS 4099 Dealserver Monitoring Hours
> monday 23:46-20:30
> tuesday 23:46-20:30
> wednesday 23:46-20:30
> thursday 23:46-20:30
> friday 23:46-20:30
> }
>
> It all look

Re: [Nagios-users] service checks could not be rescheduled properly.

2011-11-14 Thread Julian_Grunnell
Hi - thanks, I should have said this is one of the first posts I read. It has
lots of mentions of NTPD, with claims both that it should be fixed and that it
still doesn't work. In my case, whether I enable NTPD or not makes no
difference: my checks for specific hosts are still getting rescheduled for
next week. I run multiple Nagios instances and it's only affecting this one -
and annoyingly these checks are some of the most important but just happen to
need to run at specific times of the day.

Any other help / ideas would be appreciated.

Thanks - J.

Julian Grunnell | Unix Analyst, Infrastructure | TD Waterhouse
T: +44 (0) 113 346 2824 | M: +44 (0) 7889 352527



From: Brandon Phelps
Date: 11/11/2011 16:03
Reply-To: Nagios Users List
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] service checks could not be rescheduled properly.

Check out this bug report on the nagios.org bug tracker:

http://tracker.nagios.org/view.php?id=31


On 11/11/2011 10:39 AM, julian_grunn...@tdwh.co.uk wrote:
>
> Hi - does anyone know the answer to the following errors I'm getting in
> the nagios.log for a handful of hosts that have a specific timeperiod
> for checks set.
>
> [Wed Nov 9 23:46:00 2011] Warning: Check of service 'DEAL SERVER SERVICE
> TCP 4099' on host 'TDUKUBS02' could not be rescheduled properly.
> Scheduling check for next week...
> [Wed Nov 9 23:46:00 2011] Warning: Check of service 'DEAL SERVER SERVICE
> TCP 4099' on host 'TDUKUBS01' could not be rescheduled properly.
> Scheduling check for next week...
>
> Obviously this is causing major issues as checks are just not being
> carried out as expected, the hosts / services are defined as follows:
>
> define host{
> use unix_host_template ; Name of host template to use
> icon_image win40.jpg
> host_name TDUKUBS01
> address 192.168.75.84
> alias TDUKUBS01 192.168.75.84
> }
>
> define service{
> use ubs-service
> hostgroup_name ubsdealservers
> service_description DEAL SERVER SERVICE TCP 4099
> check_command check_ubs!4099
> check_period ubs4099hours
> }
>
> define service{
> name ubs-service
> is_volatile 0
> max_check_attempts 5
> normal_check_interval 1
> notification_interval 5
> retry_check_interval 1
> contact_groups dtns-ooh
> active_checks_enabled 1 ; Active service checks are enabled
> passive_checks_enabled 1 ; Passive service checks are enabled/accepted
> parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
> obsess_over_service 1 ; We should obsess over this service (if necessary)
> check_freshness 0 ; Default is to NOT check service 'freshness'
> notifications_enabled 1 ; Service notifications are enabled
> event_handler_enabled 1 ; Service event handler is enabled
> flap_detection_enabled 1 ; Flap detection is enabled
> failure_prediction_enabled 1 ; Failure prediction is enabled
> process_perf_data 1 ; Process performance data
> retain_status_information 1 ; Retain status information across program restarts
> retain_nonstatus_information 1 ; Retain non-status information across program restarts
> is_volatile 0 ; The service is not volatile
> register 0
> }
>
> define timeperiod{
> timeperiod_name ubs4099hours
> alias UBS 4099 Dealserver Monitoring Hours
> monday 23:46-20:30
> tuesday 23:46-20:30
> wednesday 23:46-20:30
> thursday 23:46-20:30
> friday 23:46-20:30
> }
>
> It all looks ok to me, if I manually re-schedule the check above it will
> run fine but when it comes to the check being carried out as defined by
> the timeperiod above again it will not run and then get scheduled for
> the following week.
>
> Any help would be appreciated.
>
> Thanks - Julian.
>
>
> Julian Grunnell | Unix Analyst, Infrastructure | TD Waterhouse
> T: +44 (0) 113 346 2824 | M: +44 (0) 7889 352527
>
>
> 


Re: [Nagios-users] service checks could not be rescheduled properly.

2011-11-11 Thread Brandon Phelps
Check out this bug report on the nagios.org bug tracker:

http://tracker.nagios.org/view.php?id=31


On 11/11/2011 10:39 AM, julian_grunn...@tdwh.co.uk wrote:
>
> Hi - does anyone know the answer to the following errors I'm getting in
> the nagios.log for a handful of hosts that have a specific timeperiod
> for checks set.
>
> [Wed Nov 9 23:46:00 2011] Warning: Check of service 'DEAL SERVER SERVICE
> TCP 4099' on host 'TDUKUBS02' could not be rescheduled properly.
> Scheduling check for next week...
> [Wed Nov 9 23:46:00 2011] Warning: Check of service 'DEAL SERVER SERVICE
> TCP 4099' on host 'TDUKUBS01' could not be rescheduled properly.
> Scheduling check for next week...
>
> Obviously this is causing major issues as checks are just not being
> carried out as expected, the hosts / services are defined as follows:
>
> define host{
> use unix_host_template ; Name of host template to use
> icon_image win40.jpg
> host_name TDUKUBS01
> address 192.168.75.84
> alias TDUKUBS01 192.168.75.84
> }
>
> define service{
> use ubs-service
> hostgroup_name ubsdealservers
> service_description DEAL SERVER SERVICE TCP 4099
> check_command check_ubs!4099
> check_period ubs4099hours
> }
>
> define service{
> name ubs-service
> is_volatile 0
> max_check_attempts 5
> normal_check_interval 1
> notification_interval 5
> retry_check_interval 1
> contact_groups dtns-ooh
> active_checks_enabled 1 ; Active service checks are enabled
> passive_checks_enabled 1 ; Passive service checks are enabled/accepted
> parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
> obsess_over_service 1 ; We should obsess over this service (if necessary)
> check_freshness 0 ; Default is to NOT check service 'freshness'
> notifications_enabled 1 ; Service notifications are enabled
> event_handler_enabled 1 ; Service event handler is enabled
> flap_detection_enabled 1 ; Flap detection is enabled
> failure_prediction_enabled 1 ; Failure prediction is enabled
> process_perf_data 1 ; Process performance data
> retain_status_information 1 ; Retain status information across program restarts
> retain_nonstatus_information 1 ; Retain non-status information across program restarts
> is_volatile 0 ; The service is not volatile
> register 0
> }
>
> define timeperiod{
> timeperiod_name ubs4099hours
> alias UBS 4099 Dealserver Monitoring Hours
> monday 23:46-20:30
> tuesday 23:46-20:30
> wednesday 23:46-20:30
> thursday 23:46-20:30
> friday 23:46-20:30
> }
>
> It all looks ok to me, if I manually re-schedule the check above it will
> run fine but when it comes to the check being carried out as defined by
> the timeperiod above again it will not run and then get scheduled for
> the following week.
>
> Any help would be appreciated.
>
> Thanks - Julian.
>
>
> Julian Grunnell | Unix Analyst, Infrastructure | TD Waterhouse
> T: +44 (0) 113 346 2824 | M: +44 (0) 7889 352527
>
>
> 

[Nagios-users] service checks could not be rescheduled properly.

2011-11-11 Thread Julian_Grunnell
Hi - does anyone know the answer to the following errors I'm getting in 
the nagios.log for a handful of hosts that have a specific timeperiod for 
checks set.

[Wed Nov  9 23:46:00 2011] Warning: Check of service 'DEAL SERVER SERVICE 
TCP 4099' on host 'TDUKUBS02' could not be rescheduled properly. 
Scheduling check for next week...
[Wed Nov  9 23:46:00 2011] Warning: Check of service 'DEAL SERVER SERVICE 
TCP 4099' on host 'TDUKUBS01' could not be rescheduled properly. 
Scheduling check for next week...

Obviously this is causing major issues as checks are just not being 
carried out as expected, the hosts / services are defined as follows:

define host{
        use                     unix_host_template      ; Name of host template to use
        icon_image              win40.jpg
        host_name               TDUKUBS01
        address                 192.168.75.84
        alias                   TDUKUBS01 192.168.75.84
        }

define service{
        use                     ubs-service
        hostgroup_name          ubsdealservers
        service_description     DEAL SERVER SERVICE TCP 4099
        check_command           check_ubs!4099
        check_period            ubs4099hours
        }

define service{
        name                            ubs-service
        is_volatile                     0
        max_check_attempts              5
        normal_check_interval           1
        notification_interval           5
        retry_check_interval            1
        contact_groups                  dtns-ooh
        active_checks_enabled           1       ; Active service checks are enabled
        passive_checks_enabled          1       ; Passive service checks are enabled/accepted
        parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1       ; We should obsess over this service (if necessary)
        check_freshness                 0       ; Default is to NOT check service 'freshness'
        notifications_enabled           1       ; Service notifications are enabled
        event_handler_enabled           1       ; Service event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        failure_prediction_enabled      1       ; Failure prediction is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        is_volatile                     0       ; The service is not volatile
        register                        0
        }

define timeperiod{
        timeperiod_name         ubs4099hours
        alias                   UBS 4099 Dealserver Monitoring Hours
        monday                  23:46-20:30
        tuesday                 23:46-20:30
        wednesday               23:46-20:30
        thursday                23:46-20:30
        friday                  23:46-20:30
        }

It all looks ok to me, if I manually re-schedule the check above it will 
run fine but when it comes to the check being carried out as defined by 
the timeperiod above again it will not run and then get scheduled for the 
following week.

Any help would be appreciated.

Thanks - Julian.


Julian Grunnell | Unix Analyst, Infrastructure | TD Waterhouse
T: +44 (0) 113 346 2824 | M: +44 (0) 7889 352527

Re: [Nagios-users] Service Checks scheduled to next year. Service State Information mangled?

2011-07-26 Thread Jindrich Nemec
I'm not sure whether my problem is related only to service restarts, when for
example some NTP drift can occur, but if it is, then this could be an
acceptable workaround.
Thanks


- Original Message -
From: "Jindrich Nemec" 
To: nagios-users@lists.sourceforge.net
Sent: Tuesday, July 26, 2011 6:50:18 PM
Subject: Re: [Nagios-users] Service Checks scheduled to next year. Service 
State Information mangled?


# grep -r retain_state_information *
nagios.cfg:retain_state_information=1

If I understand correctly: with retain_state_information set to 0, the Nagios
process will not keep the states and will reschedule the checks on restart,
but then I would have to schedule a Nagios service restart every few hours.

> Yueh-Hung Liu
> Tue, 26 Jul 2011 06:51:00 -0700

> do you enable retention?

>> On Tue, Jul 26, 2011 at 8:33 PM, Jindrich Nemec  
>> wrote:
>> I've found a similar thread, linked below. It seems the problem is still
>> unsolved.
>>




Re: [Nagios-users] Service Checks scheduled to next year. Service State Information mangled?

2011-07-26 Thread Jindrich Nemec

# grep -r retain_state_information *
nagios.cfg:retain_state_information=1

If I understand correctly: with retain_state_information set to 0, the Nagios
process will not keep the states and will reschedule the checks on restart,
but then I would have to schedule a Nagios service restart every few hours.

> Yueh-Hung Liu
> Tue, 26 Jul 2011 06:51:00 -0700

> do you enable retention?

>> On Tue, Jul 26, 2011 at 8:33 PM, Jindrich Nemec  
>> wrote:
>> I've found a similar thread, linked below. It seems the problem is still
>> unsolved.
>>




Re: [Nagios-users] Service Checks scheduled to next year. Service State Information mangled?

2011-07-26 Thread Yueh-Hung Liu
do you enable retention?

On Tue, Jul 26, 2011 at 8:33 PM, Jindrich Nemec  wrote:
> I've found a similar thread, linked below. It seems the problem is still
> unsolved.
>
> -
>
> http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg24816.html
>
>
> For example, a check that runs today will have a next scheduled check date of
> 05/03/2010 instead of 05/03/2009. There is no real connection between the
> checks; it's like a random problem. Some checks have it, others don't. A
> manual reschedule of the check does solve the problem, but sometimes only
> temporarily.
>
>
> http://article.gmane.org/gmane.network.nagios.devel/5554/match=next+year
>



Re: [Nagios-users] Service Checks scheduled to next year. Service State Information mangled?

2011-07-26 Thread Jindrich Nemec
I've found a similar thread, linked below. It seems the problem is still
unsolved.

-

http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg24816.html


For example, a check that runs today will have a next scheduled check date of
05/03/2010 instead of 05/03/2009. There is no real connection between the
checks; it's like a random problem. Some checks have it, others don't. A
manual reschedule of the check does solve the problem, but sometimes only
temporarily.


http://article.gmane.org/gmane.network.nagios.devel/5554/match=next+year



[Nagios-users] Service Checks scheduled to next year. Service State Information mangled?

2011-07-26 Thread Jindrich Nemec
Hi all,

I'm running Nagios Core version 3.2.3 on CentOS 5.5, installed from RPM
(RPMforge repo).

The problem: some service checks won't execute; they are scheduled for next
year...?!

I'm also running Nagios Core 3.2.1 installations, and I have not
encountered this problem there.

I use the same service check template (retries, interval, etc.) for all the
service checks. Some of the checks abruptly stop executing, waiting for 2012...
(no spoilers)


--- Service State Information ---
Current Status:            OK   (for 6d 16h 54m 31s)
Status Information:        USERS OK - 0 users currently logged in
Performance Data:          users=0;5;10;0
Current Attempt:           1/7  (HARD state)
Last Check Time:           07-19-2011 20:15:39   <-- last
Check Type:                ACTIVE
Check Latency / Duration:  0.234 / 0.064 seconds
Next Scheduled Check:      07-07-2012 20:19:39   <-- next scheduled to 2012? :-/
Last State Change:         07-19-2011 19:35:39
Last Notification:         N/A (notification 0)
Is This Service Flapping?  NO   (8.68% state change)
In Scheduled Downtime?     NO
Last Update:               07-26-2011 12:30:07  ( 0d 0h 0m 3s ago)
 



The next execution of the example above is not visible in the Check Scheduling
Queue. Below is another check that is in the queue, but is also scheduled for 2012:




--- Check Scheduling Queue ---
Host      Service  Last Check           Next Check           Type    Active Checks  Actions
c-wifi04           07-19-2011 20:18:41  07-07-2012 20:19:33  Forced  ENABLED





--- Host State Information ---

Host Status:                  UP   (for 6d 17h 9m 56s)
Status Information:           PING OK - Packet loss = 0%, RTA = 1.90 ms
Performance Data:             rta=1.905000ms;3000.00;5000.00;0.00 pl=0%;80;100;0
Current Attempt:              1/5  (HARD state)
Last Check Time:              07-19-2011 20:18:41   <-- last
Check Type:                   ACTIVE
Check Latency / Duration:     0.083 / 4.167 seconds
Next Scheduled Active Check:  07-07-2012 20:19:33   <-- next scheduled to 2012?
Last State Change:            07-19-2011 19:27:11
Last Notification:            N/A (notification 0)
Is This Host Flapping?        NO   (9.74% state change)
In Scheduled Downtime?        NO
Last Update:                  07-26-2011 12:37:07  ( 0d 0h 0m 0s ago)

Looking at the various checks, there seems to be no connection with whether
the service was previously flapping or not.

Thanks for any suggestions,
jindrich


 





Re: [Nagios-users] Service Checks

2009-02-23 Thread Jim Avery
2009/2/20 Alex Dehaini :
> Hi Guys,
>
> I have about 200 hosts and 400 services. My system is roughly a P4 2.4 GHz with
> 2 GB of RAM. I want to check each service every 20 seconds and update the CGI
> interface. Any clues how I can achieve this?


See http://nagios.sourceforge.net/docs/3_0/tuning.html
and
http://nagios.sourceforge.net/docs/3_0/configmain.html#interval_length

I can't tell you whether following all this advice will actually achieve
your goal, but I guess it should.  I've never tried changing the
interval length from 60 myself.
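
As a rough back-of-the-envelope sketch of what the interval_length setting implies: Nagios multiplies a check interval by interval_length (60 seconds by default) to get the real time between checks. The helper names below are purely illustrative and not part of Nagios; the figures are the ones from the question, and whether the hardware can sustain that rate is not something this arithmetic can answer.

```python
# Sketch only: helper names are hypothetical, not Nagios APIs.

def seconds_between_checks(check_interval, interval_length=60):
    """Check intervals are expressed in units of interval_length seconds."""
    return check_interval * interval_length


def checks_per_second(num_services, period_seconds):
    """Average check rate needed to run every service once per period."""
    return num_services / period_seconds


# With the default interval_length=60, a check interval of 1 means 60 seconds.
print(seconds_between_checks(1))
# With interval_length=20, a check interval of 1 means 20 seconds.
print(seconds_between_checks(1, 20))
# 400 services, each checked every 20 seconds, is a sustained 20 checks/second.
print(checks_per_second(400, 20))
```

So one way to get 20-second checks would be interval_length=20 with a check interval of 1, keeping in mind that every other interval in the configuration is then also counted in 20-second units.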

Cheers,

Jim



[Nagios-users] Service Checks

2009-02-20 Thread Alex Dehaini
Hi Guys,

I have about 200 hosts and 400 services. My system is roughly a P4 2.4 GHz with
2 GB of RAM. I want to check each service every 20 seconds and update the CGI
interface. Any clues how I can achieve this?

Regards,

-- 
Alex Dehaini
Developer
Site - www.alexdehaini.com
Email - alexdeha...@gmail.com

[Nagios-users] Service Checks while Host down

2008-05-21 Thread Axel Schmalowsky
Hi list,

is there a way to temporarily disable service checks on hosts while
they're down? That is, how can I configure Nagios not to run
service checks for any host it recognizes as being in a DOWN state?

I'm running Nagios in a distributed environment with 15 distributed
servers and one central server. My problem is that when a
monitored host goes down, the distributed server monitoring that
particular host recognizes its state as DOWN (within 30s), but the
central server does not. It seems that the central server performs just one
host check: Nagios does not run a second host check (although it is
configured to do so), nor does it send out any host notifications ...


[central server]
active host checks enabled
retry interval 1
max check attempts 2
check interval 60

As far as I understand it, Nagios should check any host in a non-UP
state (at least) twice within ~60 sec and should send out a notification if
the host is still DOWN.
What am I missing?

Any help would be appreciated.

~axel



Re: [Nagios-users] Service checks scheduled far in future

2008-02-17 Thread Marc Powell

On Feb 17, 2008, at 11:33 AM, Dale J. Chatham wrote:

> The output of the command is below.
>
> I included the objects.cache file.  As I understand it, this is  
> created at run time from what nagios
> reads from the config files and saves.  Therefore, if I left  
> something out, it should be more obvious here.
>

> First scheduled check:  Mon Feb 18 07:00:00 2008
> Last scheduled check:   Mon Feb 18 07:00:00 2008
>
>

> define timeperiod {
>   timeperiod_name 24x7
>   alias           24 Hours A Day, 7 Days A Week
>   monday          07:00-17:00
>   tuesday         07:00-17:00
>   wednesday       07:00-17:00
>   thursday        07:00-17:00
>   friday          07:00-17:00
>   }

You've modified this timeperiod to _not_ be 24x7 but are treating it  
as if it still is. Nagios is doing what you've told it to do.

> define service {
>   host_name           callisto
>   service_description PING
>   check_period        24x7
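
To illustrate why the reported next check lands on Monday at 07:00, here is a small sketch that finds the next valid time inside a weekday-based timeperiod. It is not Nagios's actual scheduler; the function name, the dictionary layout, and the assumption that each day's ranges are listed in ascending order are all illustrative.

```python
from datetime import datetime, time, timedelta

# The posted "24x7" period, keyed by weekday (Monday=0 ... Sunday=6).
# Saturday and Sunday have no ranges, exactly as in the objects.cache dump.
POSTED_24X7 = {day: [(time(7, 0), time(17, 0))] for day in range(5)}


def next_valid_time(now, timeperiod):
    """Return the earliest moment >= now that falls inside the timeperiod."""
    for offset in range(8):  # today plus one full week is enough
        day = (now + timedelta(days=offset)).date()
        for start, end in timeperiod.get(day.weekday(), []):
            start_dt = datetime.combine(day, start)
            end_dt = datetime.combine(day, end)
            if start_dt <= now < end_dt:
                return now        # already inside a valid range
            if start_dt > now:
                return start_dt   # next range opens later
    return None                   # the timeperiod has no valid ranges


# Sunday morning, as in the 'date' output from the original report.
print(next_valid_time(datetime(2008, 2, 17, 9, 46, 30), POSTED_24X7))
```

With a timeperiod that really covers all seven days, the same call would simply return the current time.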

--
Marc



Re: [Nagios-users] Service checks scheduled far in future

2008-02-17 Thread Dale J. Chatham




The output of the command is below.

I included the objects.cache file.  As I understand it, this is created
at run time from what nagios reads from the config files and saves. 
Therefore, if I left something out, it should be more obvious here.

If you need the actual files, I can tar and zip 'em, though.  I make
heavy use of config_dir and have one file per configuration item.

  Dale


[EMAIL PROTECTED] nagios]# nagios -s /etc/nagios/nagios.cfg 

Nagios 2.10
Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
Last Modified: 10-21-2007
License: GPL

Projected scheduling information for host and service
checks is listed below.  This information assumes that
you are going to start running Nagios with your current
config files.

HOST SCHEDULING INFORMATION
---
Total hosts: 8
Total scheduled hosts:   0
Host inter-check delay method:   SMART
Average host check interval: 0.00 sec
Host inter-check delay:  0.00 sec
Max host check spread:   30 min
First scheduled check:   N/A
Last scheduled check:    N/A


SERVICE SCHEDULING INFORMATION
---
Total services: 15
Total scheduled services:   15
Service inter-check delay method:   SMART
Average service check interval: 600.00 sec
Inter-check delay:  12.00 sec
Interleave factor method:   SMART
Average services per host:  1.88
Service interleave factor:  2
Max service check spread:   3 min
First scheduled check:  Mon Feb 18 07:00:00 2008
Last scheduled check:   Mon Feb 18 07:00:00 2008


CHECK PROCESSING INFORMATION

Service check reaper interval:  10 sec
Max concurrent service checks:  Unlimited


PERFORMANCE SUGGESTIONS
---
I have no suggestions - things look okay.







Marc Powell wrote:

> > -Original Message-
> > From: [EMAIL PROTECTED] [mailto:nagios-users-
> > [EMAIL PROTECTED]] On Behalf Of Dale J. Chatham
> > Sent: Sunday, February 17, 2008 9:49 AM
> > To: nagios-users@lists.sourceforge.net
> > Subject: [Nagios-users] Service checks scheduled far in future
> >
> > Service checks scheduled
> >
> > I stopped nagios
> > I deleted /var/log/nagios/*
> > I started nagios
> >
> > output from date command:
> > Sun Feb 17 09:46:30 CST 2008
> >
> >  From nagios service detail page:
> > Service check scheduled for Mon Feb 18 07:00:00 CST 2008
>
> Please post the service definition (including template) as well as the
> output of '/path/to/nagios -s /path/to/nagios.cfg'.
>
> --
> Marc

  





#   NAGIOS OBJECT CACHE FILE
#
# THIS FILE IS AUTOMATICALLY GENERATED
# BY NAGIOS.  DO NOT MODIFY THIS FILE!
#
# Created: Sun Feb 17 09:55:30 2008


define timeperiod {
        timeperiod_name 24x7
        alias           24 Hours A Day, 7 Days A Week
        monday          07:00-17:00
        tuesday         07:00-17:00
        wednesday       07:00-17:00
        thursday        07:00-17:00
        friday          07:00-17:00
        }

define timeperiod {
        timeperiod_name none
        alias           No Time Is A Good Time
        }

define timeperiod {
        timeperiod_name nonworkhours
        alias           Non Work Hours
        sunday          00:00-24:00
        monday          00:00-07:00,17:00-24:00
        tuesday         00:00-07:00,17:00-24:00
        wednesday       00:00-07:00,17:00-24:00
        thursday        00:00-07:00,17:00-24:00
        friday          00:00-07:00,17:00-24:00
        saturday        00:00-24:00
        }

define timeperiod {
        timeperiod_name workhours
        alias           "Normal" Working Hours
        monday          07:00-17:00
        tuesday         07:00-17:00
        wednesday       07:00-17:00
        thursday        07:00-17:00
        friday          07:00-17:00
        }

define command {
        command_name    check-fast-alive
        command_line    /usr/lib64/nagios/plugins/check_fping -H $HOSTADDRESS$
        }

define command {
        command_name    check-host-alive
        command_line    /usr/lib64/nagios/plugins/check_ping -H $HOSTADDRESS$ -w 5000,100% -c 5000,100% -p 1
        }

define command {
        command_name    check-imap
        command_line    /usr/lib64/nagios/plugins/check_imap -H $HOSTADDRESS$
        }

define command {
command_nam

Re: [Nagios-users] Service checks scheduled far in future

2008-02-17 Thread Marc Powell


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Dale J. Chatham
> Sent: Sunday, February 17, 2008 9:49 AM
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] Service checks scheduled far in future
> 
> Service checks scheduled
> 
> I stopped nagios
> I deleted /var/log/nagios/*
> I started nagios
> 
> output from date command:
> Sun Feb 17 09:46:30 CST 2008
> 
>  From nagios service detail page:
> Service check scheduled for Mon Feb 18 07:00:00 CST 2008

Please post the service definition (including template) as well as the
output of '/path/to/nagios -s /path/to/nagios.cfg'.

--
Marc 



[Nagios-users] Service checks scheduled far in future

2008-02-17 Thread Dale J. Chatham
Service checks scheduled

I stopped nagios
I deleted /var/log/nagios/*
I started nagios

output from date command:
Sun Feb 17 09:46:30 CST 2008

 From nagios service detail page:
Service check scheduled for Mon Feb 18 07:00:00 CST 2008

If I force the check, it runs OK, but the next check is again scheduled far
into the future.

This one is weird.

Help me, Obi-Wan Kenobi, you're my only hope.

  Dale




Re: [Nagios-users] Service checks on hosts which are not up 24x7

2008-01-26 Thread Jan Kohnert
Hugo van der Kooij schrieb:
> If you set up dependencies correctly you should get close to the
> requested results. Say you ping them every minute and make sure that the
> ping service goes into hard state at the third failure; it will then just
> fail that service.
>
> Make all other services dependent on that and make sure they do not reach
> hard state before ping does.

Ah, I begin to understand. I have ping checks for the machines, so I just need
to add some statements. :) Thanks for clarifying things for me. I was just a bit
confused about that dependency stuff...

> Now only ping will be able to bother you, and if you choose not to notify
> on that you should have no notifications when these machines are
> switched off.

I just get a notification that the host is down. This is the behavior I want,
and it already works that way. :-P

> Why you bother to monitor machines like this in the first place is quite
> another question. They obviously are not critical.

I am aware of that. I just want to be informed if something breaks on those
machines (in case they are up) so that I can react as soon as possible. As I
said, this is a small, private and uncritical net. Yeah, monitoring those
machines sounds a bit like overkill, but I like it that way. You may call me
paranoid. :-D

> Hugo.

Thanks again. 

-- 
MfG Jan

OpenPGP Fingerprint:
0E9B 4052 C661 5018 93C3 4E46 651A 7A28 4028 FF7A



Re: [Nagios-users] Service checks on hosts which are not up 24x7

2008-01-26 Thread Hugo van der Kooij
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Patrick Morris wrote:
| On Sat, 26 Jan 2008, Jan Kohnert wrote:
|
|> Hello all,
|>
|> I'm monitoring my small private net using nagios and it works well.
|> But I have a (hopefully simple to answer) question:
|>
|> Some of the hosts are workstations which may go down at some time due to
|> energy saving reasons. I want the services which I monitor on those
|> hosts to change their status to 'unknown' instead of 'critical' if those
|> hosts are down.
|>
|> Something like:
|> if $host up:
|>     check services and report errors
|> else:
|>     don't check and give 'unknown' status
|>
|> I'm sure this must be possible using service-dependencies but I'm not
|> sure how to do this. Could somebody please give me a hint where to look
|> in the documentation or how this could be manageable?
|>
|> If you need detailed configuration items, please let me know.
|
| Whether a service is critical, unknown or otherwise depends on how the
| plugin is implemented, and there's no global setting you can use to make
| them behave the way you're looking for.  Service dependencies also won't
| help, since all they do is determine whether or not to check or notify
| on a service based on the state of another service, but have no effect
| on the return value of the plugins.

I beg to differ.

If you set up dependencies correctly you should get close to the
requested results. Say you ping them every minute and make sure that the
ping service goes into hard state at the third failure; it will then just
fail that service.

Make all other services dependent on that and make sure they do not reach
hard state before ping does.

Now only ping will be able to bother you, and if you choose not to notify
on that you should have no notifications when these machines are
switched off.
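
A minimal sketch of the setup described above, written as a small generator that prints one servicedependency object per dependent service. The host name, the service names and the choice of "c" (critical) for both failure criteria are assumptions made for illustration; adjust them to the local configuration before use.

```python
# Sketch only: emits Nagios servicedependency definitions as text.
# "workstation1", "SSH" and "Disk Usage" are hypothetical example names.

TEMPLATE = """define servicedependency {{
        host_name                       {host}
        service_description             {master}
        dependent_host_name             {host}
        dependent_service_description   {dependent}
        execution_failure_criteria      c
        notification_failure_criteria   c
        }}
"""


def ping_dependencies(host, dependents, master="PING"):
    """Make every listed service on the host depend on its ping service."""
    return "\n".join(
        TEMPLATE.format(host=host, master=master, dependent=name)
        for name in dependents
    )


print(ping_dependencies("workstation1", ["SSH", "Disk Usage"]))
```

With definitions like these, the dependent services should be neither checked nor notified on while the ping service is in a critical state, which is the effect the paragraphs above aim for.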

Why you bother to monitor machines like this in the first place is quite
another question. They obviously are not critical.

Hugo.

- --
[EMAIL PROTECTED]   http://hugo.vanderkooij.org/
PGP/GPG? Use: http://hugo.vanderkooij.org/0x58F19981.asc

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?

Bored? Click on http://spamornot.org/ and rate those images.

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.7 (GNU/Linux)

iD8DBQFHmwj7BvzDRVjxmYERAtG1AJ4tkHR3BndWByqb5DUJSPhn/IQarACeMAR5
cc8jJgi7BePpU7FcOVntWms=
=fGcm
-END PGP SIGNATURE-



Re: [Nagios-users] Service checks on hosts which are not up 24x7

2008-01-25 Thread Patrick Morris
On Sat, 26 Jan 2008, Jan Kohnert wrote:

> Hello all,
> 
> I'm monitoring my small private net using nagios and it works well. But I have
> a (hopefully simple to answer) question:
> 
> Some of the hosts are workstations which may go down at some time due to
> energy saving reasons. I want the services, which I monitor on those hosts to
> change their status to 'unknown' instead of 'critical' if those hosts are
> down.
> 
> Something like:
> if $host up:
> check services and report errors
> else:
> don't check and give 'unknown' status
> 
> I'm sure this must be possible using service-dependencies but I'm not sure how
> to do this. Could somebody please give me a hint where to look in the
> documentation or how this could be manageable?
> 
> If you need detailed configuration items, please let me know.

Whether a service is critical, unknown or otherwise depends on how the
plugin is implemented, and there's no global setting you can use to make
them behave the way you're looking for.  Service dependencies also won't
help, since all they do is determine whether or not to check or notify
on a service based on the state of another service, but have no effect
on the return value of the plugins.

To get what you're looking for, you'll probably need to come up with
some sort of wrapper script for your plugins that modifies the return
value based on the status of the host they are associated with.
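
A rough sketch of such a wrapper is shown here. It assumes the host state is handed to the wrapper as its first argument (for instance via the $HOSTSTATE$ macro in the command definition), followed by the real plugin command line; the script layout and function names are hypothetical.

```python
import subprocess
import sys

UNKNOWN = 3  # standard plugin exit code for UNKNOWN


def wrapped_status(plugin_status, host_state):
    """Report UNKNOWN instead of the plugin's result while the host is not UP."""
    return plugin_status if host_state == "UP" else UNKNOWN


def main(argv):
    # Hypothetical usage: wrapper.py <host_state> <plugin> [plugin args...]
    host_state, plugin_cmd = argv[1], argv[2:]
    result = subprocess.run(plugin_cmd, capture_output=True, text=True)
    status = wrapped_status(result.returncode, host_state)
    prefix = "" if status == result.returncode else "UNKNOWN (host %s): " % host_state
    print(prefix + result.stdout.strip())
    return status


if __name__ == "__main__" and len(sys.argv) > 2:
    sys.exit(main(sys.argv))
```

The plugin is still executed while the host is down; only its reported state changes, which matches the "modify the return value" idea rather than suppressing the check.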



[Nagios-users] Service checks on hosts which are not up 24x7

2008-01-25 Thread Jan Kohnert
Hello all,

I'm monitoring my small private net using Nagios and it works well. But I have
a (hopefully simple to answer) question:

Some of the hosts are workstations which may go down at some time due to
energy saving reasons. I want the services which I monitor on those hosts to
change their status to 'unknown' instead of 'critical' if those hosts are
down.

Something like:
if $host up:
    check services and report errors
else:
    don't check and give 'unknown' status

I'm sure this must be possible using service dependencies but I'm not sure how
to do this. Could somebody please give me a hint where to look in the
documentation or how this could be manageable?

If you need detailed configuration items, please let me know.

TIA!

-- 
MfG Jan

OpenPGP Fingerprint:
0E9B 4052 C661 5018 93C3 4E46 651A 7A28 4028 FF7A



[Nagios-users] Service checks stopped

2008-01-22 Thread Robert Anderson
I am running Nagios 3.0rc1 on Solaris. I have several services being
monitored on 2 hosts. I have a regular outage on a service group, which I have
configured by specifying the following timeperiod:

define timeperiod{
        timeperiod_name myapp-prod-uptime
        alias           MyApp Production Uptime
        sunday          00:00-24:00 ;
        monday          00:00-17:00,17:45-24:00 ;
        tuesday         00:00-24:00 ;
        wednesday       00:00-24:00 ;
        thursday        00:00-17:00,17:45-24:00 ;
        friday          00:00-24:00 ;
        saturday        00:00-24:00 ;
        }

The services for MyApp all use a template with the following:
...
check_period    myapp-prod-uptime
...

When the outage window occurs (Thursday 5:00pm to 5:45pm) the service checks
for MyApp do not run. Shortly after, all service checks stop running. The
last service check to run was on 1/17/2008 at 5:29:51 pm. In the web
interface the service checks show that they were scheduled to run, but they
never did.

Restarting Nagios fixes it until the next configured window in the
timeperiod.

I have also tried to implement this outage window by using a separate
timeperiod to define the window between 17:00 and 17:45 on Monday and Thursday,
and referencing it with the "exclude" directive in the myapp-prod-uptime
timeperiod definition. I get the same result: all checks stop shortly after the
window starts.

Please help.

-- 
Rob Anderson
[EMAIL PROTECTED]

Re: [Nagios-users] Service checks stop when using epn

2008-01-17 Thread Marc Powell


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Yost, Karl
> Sent: Thursday, January 17, 2008 8:29 PM
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] Service checks stop when using epn
> 
> My current nagios installation has outgrown its current server, so I
> built a new server from scratch and due to the number of perl plugins
> I am using thought it would be in my best interest to embed perl when I
> compiled my binary. The compile and install went fine, I fired nagios
> up and it was running fine for a few minutes (5-15) then the service
> checks stopped happening, and there isn't any additional information in
> the log.
> 
>
I don't recall seeing these symptoms before but generally speaking, not
all perl plugins are compatible with ePN. You might try testing them
with the mini-ePN distributed with Nagios (contrib dir) as well as
running nagios in debug mode to see what it's doing when it hangs.

--
Marc





[Nagios-users] Service checks stop when using epn

2008-01-17 Thread Yost, Karl
My current Nagios installation has outgrown its current server, so I
built a new server from scratch and, due to the number of Perl plugins I
am using, thought it would be in my best interest to embed Perl when I
compiled my binary. The compile and install went fine. I fired Nagios up
and it ran fine for a few minutes (5-15), then the service checks
stopped happening, and there isn't any additional information in the
log.

I compiled a new binary, this time without Perl, and started things back
up. It ran perfectly and did so for some time, although at a much
higher CPU load. So I built another binary, this time without the Perl
cache option, replaced the binary, and it seemed to work, but then the
checks stopped a few minutes later.

I have now reverted to my basic Nagios binary without Perl and
things are working again. It would appear I am missing something needed to
get ePN to work, but I can't seem to find it. I thought I had seen something
about this on the mailing list before but can't seem to put my finger on
it.

 

Any advice is greatly appreciated.

 

Thanks,

Karl Yost

IQOR

[EMAIL PROTECTED]

 


Re: [Nagios-users] Service checks

2007-11-05 Thread Hari Sekhon
You are correct: you can actually do this with a regex. I completely
forgot about this option. Thanks for reminding me.

If you are skilled with regex you can do this, although I think that the 
OP was looking for a way around having one service definition for
all 4 web sites, and for this, I would think that separate service 
definitions are a little nicer than trying to put a square peg in a 
round hole.


-h

Hari Sekhon



Jeff Chapin wrote:
> I was under the impression that check_http has a regex option, meaning
> you can pass it a regex that ought to be able to handle this.
>
> Jeff
>
>  
>  
>  
> JEFF CHAPIN 
> SYSTEM ADMINISTRATOR 
> T8DESIGN.COM | P 319.266.7574 - x267 | 877.T8IDEAS | F 888.290.4675
>  
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Hari
> Sekhon
> Sent: Friday, November 02, 2007 11:01 AM
> To: Jerad Riggin
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Service checks
>
> I've just tried this, using the standard check_http plugin you are 
> using. It appears not.
>
> As soon as one string is not found it goes critical.
>
> You could always write a custom plugin to test your websites, or much 
> more easily a shell wrapper plugin to call the first, if it fails, call 
> the second, and only if that fails go critical... not perfect though, a 
> better/custom plugin would serve you better.
>
> -h
>
> Hari Sekhon
>
>
>
> Jerad Riggin wrote:
>> In that case, is there a way to specify more than one possible string,
>> so that for example if it can't find "Home", but it can find
>> "Checkout",  it considers the host up?
>>
>> On 11/2/07, *Hari Sekhon* <[EMAIL PROTECTED]> wrote:
>>
>> yes you have to use a separate one for each site, how else would
>> you be able to use a separate check? Unless you expect the word "Home" on
>> each site and that is your string check. Another option is using Macros
>> but I suspect this may not do what you want...
>>
>> -h
>>
>> Hari Sekhon
>>
>>
>>
>> Jerad Riggin wrote:
>> > I've read the help docs, like I said I have a working installation
>> > checking about 15 servers.  I have a PING service that pings a host
>> > group.  I can't do that with this because I'm checking a different
>> > string on each site, so i'm guessing it has to be separated out.  Does
>> > this make sense?
>> >
>> > On 11/2/07, *Hari Sekhon* <[EMAIL PROTECTED]> wrote:
>> >
>> > You need to reread the docs, this is the most basic of questions.
>> >
>> > http://nagios.sourceforge.net/docs/2_0/xodtemplate.html#service
>> >
>> > Hint: You are missing a service_description in the last block for one.
>> > You should not have name in that last block either, you need
>> > host_name.
>> >
>> > Read docs pls.
>> >
>> > -h
>> >
>> > Hari Sekhon
>> >
>> >
>> >
>> > Jerad Riggin wrote:
>> > > I have a functioning nagios setup but I have a quick question.  I am
>> > > going through and adding website string checks so we can keep
>> > > track of availability on one of our webservers.
>> > >
>> > > So in services.cfg I have
>> > >
>> > > define service{
>> > > name    generic-service ; Generic
>

Re: [Nagios-users] Service checks

2007-11-02 Thread Jeff Chapin
I was under the impression that check_http has a regex option, meaning
you can pass it a regex that ought to be able to handle this.
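
To make the regex idea concrete, here is a tiny sketch of the matching logic only, outside of Nagios: an alternation pattern accepts a page that contains either string. The pattern and the sample page bodies are made up for illustration, and the exact flag for passing a regex to check_http should be confirmed with the plugin's own --help output.

```python
import re

# Hypothetical pattern: the page counts as healthy if either word appears.
PATTERN = re.compile(r"Home|Checkout")


def page_ok(body):
    """True when the page body contains at least one of the expected strings."""
    return PATTERN.search(body) is not None


print(page_ok("<a href='/'>Home</a>"))
print(page_ok("<h1>Checkout</h1>"))
print(page_ok("<h1>503 Service Unavailable</h1>"))
```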

Jeff

 
 
 
JEFF CHAPIN 
SYSTEM ADMINISTRATOR 
T8DESIGN.COM | P 319.266.7574 - x267 | 877.T8IDEAS | F 888.290.4675
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Hari
Sekhon
Sent: Friday, November 02, 2007 11:01 AM
To: Jerad Riggin
Cc: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] Service checks

I've just tried this, using the standard check_http plugin you are 
using. It appears not.

As soon as one string is not found it goes critical.

You could always write a custom plugin to test your websites, or much 
more easily a shell wrapper plugin to call the first, if it fails, call 
the second, and only if that fails go critical... not perfect though, a 
better/custom plugin would serve you better.

-h

Hari Sekhon



Jerad Riggin wrote:
> In that case, is there a way to specify more than one possible string,
> so that for example if it can't find "Home", but it can find
> "Checkout",  it considers the host up?

Re: [Nagios-users] Service checks

2007-11-02 Thread Hari Sekhon
Yes I know what you've been doing, double or triple templating, and what 
I am saying is that you do not need to template for every service.

Only template the common bits and put the unique bits in the service 
definition block with the hostname.

You will need fewer blocks and have less redundancy in your configuration.

Templates are only supposed to save you typing in the same lines again 
multiple times, not to become your religion for every check. You are 
writing more than you need for such basic stuff.
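
A rough sketch of that layout, assuming the basic-service template from
earlier in the thread stays as the only shared template. The second host
name and its search string are hypothetical, added only to show the
pattern of one real service block per site.

```
# Shared settings live in basic-service (register 0).
# Each site gets one real service carrying only its unique bits.
define service{
    use                     basic-service
    host_name               site.com
    service_description     HTTP
    contact_groups          mis
    notification_options    w,u,c,r
    check_command           check_http!site.com!20!"Home"
}

define service{
    use                     basic-service
    host_name               site2.example    ; hypothetical second host
    service_description     HTTP
    contact_groups          mis
    notification_options    w,u,c,r
    check_command           check_http!site2.example!20!"Checkout"
}
```

No per-site template with register 0 is needed; the service block itself
is the real definition.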

-h

Hari Sekhon



Jerad Riggin wrote:
> I think we're losing something in translation of me trying to tell you 
> what i'm doing.  Here is what i've been doing.
>
> define service{
>     use                     basic-service
>     name                    check-site4
>     notification_options    w,u,c,r
>     check_command           check_http!site.com!20!"Home"
>     register                0
> }
>
> and then later on in the config file
>
>
> define service{
>     use                     check-site4
>     service_description     HTTP
>     contact_groups          mis
>     host_name               site.com
> }

Re: [Nagios-users] Service checks

2007-11-02 Thread Jerad Riggin
I wish I could, but I'm not a programmer and have limited experience with
Linux.  I usually just manage Windows servers.

Thanks for your help though.  I condensed the string check into one instead
of double/triple templating.  Thanks.

On 11/2/07, Hari Sekhon <[EMAIL PROTECTED]> wrote:
>
> I've just tried this, using the standard check_http plugin you are
> using. It appears not.
>
> As soon as one string is not found it goes critical.
>
> You could always write a custom plugin to test your websites, or much
> more easily a shell wrapper plugin to call the first, if it fails, call
> the second, and only if that fails go critical... not perfect though, a
> better/custom plugin would serve you better.
>
> -h
>
> Hari Sekhon
>
>
>
> Jerad Riggin wrote:
> > In that case, is there a way to specify more than one possible string,
> > so that for example if it can't find "Home", but it can find
> > "Checkout",  it considers the host up?

Re: [Nagios-users] Service checks

2007-11-02 Thread Jerad Riggin
I think we're losing something in translation as I try to explain what
I'm doing.  Here is what I've been doing.

define service{
    use                     basic-service
    name                    check-site4
    notification_options    w,u,c,r
    check_command           check_http!site.com!20!"Home"
    register                0
}

and then later on in the config file


define service{
    use                     check-site4
    service_description     HTTP
    contact_groups          mis
    host_name               site.com
}


On 11/2/07, Hari Sekhon <[EMAIL PROTECTED]> wrote:
>
> If you had read the docs like I said, I can't see how you could miss the
> fact that you have a service definition without a host_name or
> hostgroup_name!
>
> Where is this service check going to run against if you haven't told it
> which host you want to test?
>
> I even gave you the anchored link to the exact place where it shows you
> the definitions that are needed for that block...
>
> -h
>
> Hari Sekhon
>
>
>
> Jerad Riggin wrote:
> > Ok, so for example
> >
> > define service{
> > use basic-service
> > notification_options   w,u,c,r
> > check_command check_http!site.com!20!"Home"
> > Service description  CheckString
> > }
> >
> > Sorry if this seems like a newbie question.  I'm just trying to
> > backtrack and optimize the config before I get too far down the road.

Re: [Nagios-users] Service checks

2007-11-02 Thread Hari Sekhon
If you had read the docs like I said, I can't see how you could miss the 
fact that you have a service definition without a host_name or 
hostgroup_name!

What is this service check going to run against if you haven't told it 
which host you want to test?

I even gave you the anchored link to the exact place where it shows you 
the definitions that are needed for that block...
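
For reference, a sketch of what that block could look like once it names
its host. The host_name and contact_groups values are borrowed from
Jerad's later message and are assumptions here. Note that the directive
is spelled service_description, with an underscore.

```
define service{
    use                     basic-service
    host_name               site.com         ; assumed from the later message
    service_description     CheckString
    contact_groups          mis              ; assumed from the later message
    notification_options    w,u,c,r
    check_command           check_http!site.com!20!"Home"
}
```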

-h

Hari Sekhon



Jerad Riggin wrote:
> Ok, so for example
>
> define service{
> use basic-service
> notification_options   w,u,c,r
> check_command check_http!site.com!20!"Home"
> Service description  CheckString
> }
>
> Sorry if this seems like a newbie question.  I'm just trying to 
> backtrack and optimize the config before I get too far down the road.

Re: [Nagios-users] Service checks

2007-11-02 Thread Hari Sekhon
I've just tried this, using the standard check_http plugin you are 
using. It appears not.

As soon as one string is not found it goes critical.

You could always write a custom plugin to test your websites or, much 
more easily, a shell wrapper plugin that calls the first check and, if it 
fails, calls the second, going critical only if that fails too. It's not 
perfect, though; a proper custom plugin would serve you better.

-h

Hari Sekhon



Jerad Riggin wrote:
> In that case, is there a way to specify more than one possible string, 
> so that for example if it can't find "Home", but it can find 
> "Checkout",  it considers the host up?

Re: [Nagios-users] Service checks

2007-11-02 Thread Hari Sekhon
Also, you are making it more trouble than it needs to be. Instead of 
trying to do register 0 and making the last block a template block, just 
put the host name in there and the service description, as I hinted; 
that is all you need. You will then have the first 2 blocks plus one for 
each service on each host with different string checks...

-h

Hari Sekhon



Jerad Riggin wrote:
> I've read the help docs, like I said I have a working installation 
> checking about 15 servers.  I have a PING service that pings a host 
> group.  I can't do that with this because I'm checking a different 
> string on each site, so i'm guessing it has to be separated out.  Does 
> this make sense?
>
> On 11/2/07, *Hari Sekhon* <[EMAIL PROTECTED] 
> > wrote:
>
> You need to reread the docs, this is the most basic of questions.
>
> http://nagios.sourceforge.net/docs/2_0/xodtemplate.html#service
>
> Hint: You are missing a service_description in the last block for one.
> You should not have name in that last block either, you need
> host_name.
>
> Read docs pls.
>
> -h
>
> Hari Sekhon
>
>
>
> Jerad Riggin wrote:
> > I have a functioning nagios setup but I have a quick question.  I am
> > going through and adding website string checks so we can keep
> track of
> > availability on one of our webservers.
> >
> > So in services.cfg I have
> >
> > define service{
> >     name                            generic-service ; Generic service name
> >     active_checks_enabled           1   ; Active service checks are enabled
> >     passive_checks_enabled          1   ; Passive service checks are enabled/accepted
> >     parallelize_check               1   ; Active service checks should be parallelized (Don't disable)
> >     obsess_over_service             1   ; We should obsess over this service (if necessary)
> >     check_freshness                 0   ; Default is to NOT check service 'freshness'
> >     notifications_enabled           1   ; Service notifications are enabled
> >     event_handler_enabled           1   ; Service event handler is enabled
> >     flap_detection_enabled          1   ; Flap detection is enabled
> >     process_perf_data               1   ; Process performance data
> >     retain_status_information       1   ; Retain status information across program restarts
> >     retain_nonstatus_information    1   ; Retain non-status information across program restarts
> >     register                        0   ; DONT REGISTER THIS DEFINITION - NOT A REAL SERVICE, JUST A TEMPLATE!
> > }
> >
> > define service{
> >     use                     generic-service
> >     name                    basic-service
> >     is_volatile             0
> >     check_period            24x7
> >     max_check_attempts      5
> >     normal_check_interval   3
> >     retry_check_interval    1
> >     notification_interval   15
> >     notification_period     24x7
> >     register                0
> > }
> >
> > I then have as just one example:
> >
> > define service{
> >     use                     basic-service
> >     name                    check-site4
> >     notification_options    w,u,c,r
> >     check_command           check_http!site.com!20!"Home"
> >     register                0
> > }
> >
> >
> > My question is, you notice that I have the name as check-site4, and
> > then later on in the services.cfg I call up that check-site4.  Is this
> > the correct way?  Do I need to define a service for each host and then
> > later on call it by name to execute the service check?  Is this a bad
> > way of going about it?
> >
> > Thanks,
> >
> > Jerad

Re: [Nagios-users] Service checks

2007-11-02 Thread Jerad Riggin
In that case, is there a way to specify more than one possible string, so
that for example if it can't find "Home", but it can find "Checkout",  it
considers the host up?

On 11/2/07, Hari Sekhon <[EMAIL PROTECTED]> wrote:
>
> yes you have to use a separate one for each site, how else would you be
> able to use a separate check? Unless you expect the word "Home" on each
> site and that is your string check. Another option is using Macros but I
> suspect this may not do what you want...
>
> -h
>
> Hari Sekhon

Re: [Nagios-users] Service checks

2007-11-02 Thread Hari Sekhon
Yes, you have to use a separate one for each site; how else would you be 
able to use a separate check? The exception is if you expect the word 
"Home" on every site and that is your string check. Another option is 
using macros, but I suspect this may not do what you want...
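
To illustrate the one case where a single definition would cover every
site, here is a sketch that assumes the same string ("Home") appears on
all of them. The host group name web-sites and the command name
check_http_string are hypothetical; $HOSTADDRESS$ is the standard macro
that expands to each host's address.

```
# Hypothetical command: the host comes from the macro, not from an argument.
define command{
    command_name    check_http_string
    command_line    $USER1$/check_http -H $HOSTADDRESS$ -t $ARG1$ -s '$ARG2$'
}

define service{
    use                     basic-service
    hostgroup_name          web-sites        ; hypothetical host group
    service_description     HTTP
    contact_groups          mis
    notification_options    w,u,c,r
    check_command           check_http_string!20!Home
}
```

As soon as the expected string differs per site, this no longer fits and
you are back to one service block per host.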

-h

Hari Sekhon



Jerad Riggin wrote:
> I've read the help docs, like I said I have a working installation 
> checking about 15 servers.  I have a PING service that pings a host 
> group.  I can't do that with this because I'm checking a different 
> string on each site, so i'm guessing it has to be separated out.  Does 
> this make sense?

Re: [Nagios-users] Service checks

2007-11-02 Thread Jerad Riggin
I've read the help docs; like I said, I have a working installation checking
about 15 servers.  I have a PING service that pings a host group.  I can't
do that with this because I'm checking a different string on each site, so
I'm guessing it has to be separated out.  Does this make sense?


Re: [Nagios-users] Service checks

2007-11-02 Thread Hari Sekhon
You need to reread the docs, this is the most basic of questions.

http://nagios.sourceforge.net/docs/2_0/xodtemplate.html#service

Hint: You are missing a service_description in the last block for one. 
You should not have name in that last block either, you need host_name.
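
As a rough sketch of what that hint amounts to (not a tested config): the
last block from the original post, turned from a template into a real
service. The host name webserver1 and the service_description text are
made up for illustration, and depending on what the templates provide,
directives such as contact_groups may still need to be added.

```cfg
# Hypothetical concrete service built on the poster's basic-service template.
# "name" and "register 0" are dropped because this is no longer a template.
define service{
        use                     basic-service
        host_name               webserver1      ; assumed host, must match a defined host
        service_description     HTTP string check site.com      ; hypothetical label
        notification_options    w,u,c,r
        check_command           check_http!site.com!20!"Home"
        }
```

With one block like this per site, each can carry its own string in
check_command while sharing everything else through the template.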

Read docs pls.

-h

Hari Sekhon





[Nagios-users] Service checks

2007-11-02 Thread Jerad Riggin
I have a functioning Nagios setup, but I have a quick question.  I am going
through and adding website string checks so we can keep track of
availability on one of our webservers.

So in services.cfg I have

define service{
        name                            generic-service ; Generic service name
        active_checks_enabled           1       ; Active service checks are enabled
        passive_checks_enabled          1       ; Passive service checks are enabled/accepted
        parallelize_check               1       ; Active service checks should be parallelized (Don't disable)
        obsess_over_service             1       ; We should obsess over this service (if necessary)
        check_freshness                 0       ; Default is to NOT check service 'freshness'
        notifications_enabled           1       ; Service notifications are enabled
        event_handler_enabled           1       ; Service event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        register                        0       ; DONT REGISTER THIS DEFINITION - NOT A REAL SERVICE, JUST A TEMPLATE!
        }

define service{
        use                             generic-service
        name                            basic-service
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              5
        normal_check_interval           3
        retry_check_interval            1
        notification_interval           15
        notification_period             24x7
        register                        0
        }

I then have as just one example:

define service{
        use                             basic-service
        name                            check-site4
        notification_options            w,u,c,r
        check_command                   check_http!site.com!20!"Home"
        register                        0
        }


My question is: you'll notice that I have the name as check-site4, and then
later on in services.cfg I call up that check-site4.  Is this the
correct way?  Do I need to define a service for each host and then later on
call it by name to execute the service check?  Is this a bad way of going
about it?

Thanks,

Jerad

Re: [Nagios-users] Service checks in hosts.cfg?

2007-02-24 Thread Hugo van der Kooij
On Sat, 24 Feb 2007, chiel wrote:

> Thank you for your comments. I understand that this is the way Nagios
> must be configured.
>
> define service{
>         use                     mydyndns-template
>         host_name               ns2.mydyndns.org,ns3.mydyndns.org,ns4.mydyndns.org,ns5.mydyndns.org
>         service_description     PING
>         check_command           check_ping!300,20%!1000,60%
>         contact_groups          mydyndns-org
> }
>
>
> In the above example you define all the hosts that you want to check with
> ping on one line. Let's say I want to check over 100 hosts with this
> service; must they all go on one line (?!).
> I can't put the hostgroup here because not all the hosts in that hostgroup
> respond to ping.
> (Of course I split up the config files as much as I can by creating
> cfg_dirs.)

Can't recall if you can have multiple host_name lines. I have not needed 
it yet. But it will take you only minutes to find out. Won't it?

Please, don't top post:

A: Yes.
>Q: Are you sure?
>>A: Because it reverses the logical flow of conversation.
>>>Q: Why is top posting frowned upon?

Hugo.

-- 
[EMAIL PROTECTED]   http://hvdkooij.xs4all.nl/
This message is using 100% recycled electrons.



Re: [Nagios-users] Service checks in hosts.cfg?

2007-02-24 Thread chiel
Hi Hugo,

Thank you for your comments. I understand that this is the way Nagios must
be configured.

define service{
        use                     mydyndns-template
        host_name               ns2.mydyndns.org,ns3.mydyndns.org,ns4.mydyndns.org,ns5.mydyndns.org
        service_description     PING
        check_command           check_ping!300,20%!1000,60%
        contact_groups          mydyndns-org
}


In the above example you define all the hosts that you want to check with
ping on one line. Let's say I want to check over 100 hosts with this
service; must they all go on one line (?!).
I can't put the hostgroup here because not all the hosts in that hostgroup
respond to ping.
(Of course I split up the config files as much as I can by creating
cfg_dirs.)
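
One possible way around the long host_name line, sketched here as an
assumption rather than anything confirmed in this thread: put only the
hosts that answer ping into a second hostgroup and point the service at
that group. The hostgroup name ping-hosts and its member list are
hypothetical.

```cfg
# Hypothetical hostgroup listing only the hosts that respond to ping.
define hostgroup{
        hostgroup_name  ping-hosts
        alias           Hosts that respond to ping
        members         ns2.mydyndns.org,ns3.mydyndns.org,ns4.mydyndns.org
        }

# The PING service from above, attached to the group instead of a host list.
define service{
        use                     mydyndns-template
        hostgroup_name          ping-hosts
        service_description     PING
        check_command           check_ping!300,20%!1000,60%
        contact_groups          mydyndns-org
        }
```

A host can belong to several hostgroups, so the existing group can stay
as it is.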


Michiel



Re: [Nagios-users] Service checks in hosts.cfg?

2007-02-24 Thread Thomas Guyot-Sionnest
On 24/02/07 08:21 AM, chiel wrote:
>  Hello,
> 
> I have been working with Nagios for a couple of days now and I'm just
> beginning to understand the principle of the config (.cfg) files.
> I understand that you create a service (let say Ping) and put all your
> hosts (or hostgroups) in there that you want to check with ping.
> 
> But is it also possible to define these checks in the hosts.cfg file?
> So it would become something like this:
>  
> define host{
> host_name  server1
> alias   Linux server 1
> address 10.0.0.1
> contact_groups  network_team
> checks  check_ping, check_load, check_snmp <--- This line
> }
>  
>  
> This would make more sense to me, because if you have over 100 hosts to
> check, and you don't want to do it with a hostgroup, then all these hosts
> must be in services.cfg on one line (do I have this right?).

Well, first keep in mind that any object configuration file is only a
configuration file. The name is irrelevant. If you're building a config
for a huge number of hosts you should use cfg_dir, so that you can add
config files simply by dropping one into the cfg_dir directory or one of
its subdirectories.
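
For illustration, a minimal sketch of the cfg_dir approach in the main
config file. The directory paths are hypothetical; Nagios reads every
file ending in .cfg under each listed directory, including
subdirectories.

```cfg
# nagios.cfg (main configuration file); paths below are assumptions
cfg_dir=/usr/local/nagios/etc/hosts
cfg_dir=/usr/local/nagios/etc/services
```

Adding a new host is then a matter of dropping another .cfg file into
one of those directories and reloading Nagios.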

The best way to achieve your result is using templates. So let's say you
have, among your hostgroups, web1, web2 and web3 that all need HTTP checks.
You can create a template (I'm not looking at the doc so syntax may be
wrong...):

define service{
  name           web_service
  hostgroup_name web1,web2,web3
  [optionally other options]
  use            generic_service
# ^^^ Only if you have a generic_service template with default options.
  register       0
}

Then to define the HTTP check all you have to do is:

define service {
  service_description   HTTP Check
  host_name 
  check_command check_http!args
  use   web_service
}

If you build everything on well-organized templates your config will
be much smaller. Here's a comparison of the number of lines in the
object config files and in the fully parsed objects cache on one of my hosts:

# find /path/to/nagios/cfg -name \*.cfg|xargs cat|egrep -v '^.*?#' \
  |egrep -v '^[[:space:]]*$'|wc -l
1755
# cat /path/to/var/objects.cache|egrep -v '^.*?#' \
  |egrep -v '^[[:space:]]*$'|wc -l
29970

Thomas



Re: [Nagios-users] Service checks in hosts.cfg?

2007-02-24 Thread chiel
Hi Hugo,

Thank you for your comments. I understand this way.
But what I'm trying to say is that






Re: [Nagios-users] Service checks in hosts.cfg?

2007-02-24 Thread MANUEL CANSECO GARCIA
*** Automatic Message ***
This user is not currently available; for any matter, please get in
touch with Leandro Gayango [EMAIL PROTECTED]

***


Re: [Nagios-users] Service checks in hosts.cfg?

2007-02-24 Thread Hugo van der Kooij
On Sat, 24 Feb 2007, chiel wrote:

> I have been working with Nagios for a couple of days now and I'm just 
> beginning to understand the principle of the config (.cfg) files.
> I understand that you create a service (let's say Ping) and put all the 
> hosts (or hostgroups) in there that you want to check with ping.

First off. Print the manual in full or whatever allows you to read it 
cover to cover. But go over the full manual at least once from cover to 
cover.

> But is it also possible to define these checks in the hosts.cfg file?
> So it would become something like this:
>
> define host{
>host_name  server1
>alias   Linux server 1
>address 10.0.0.1
>contact_groups  network_team
>checks  check_ping, check_load, check_snmp <--- This line
> }

This is rather completely wrong. I think you need to reverse your way of 
thinking.

Think of hosts as machines. You can add a check on the machine but it is 
just to see if the hardware is there. For this most people use the default 
check but in some cases you need to deviate. That is where the checks on a 
host end.

You care about the services provided. You only check the hosts to make sure 
they remain active because you need them to run the services. In this regard 
you can think of CPU and memory usage as services on which other services 
rely. (If your CPU is 0% idle your MySQL service is likely to be running 
poorly.)

So you need to add 100 hosts if you have 100 machines. But read the 
part on templates very carefully, as it will save a lot of time and 
work.

Then you define services, which may run on some hosts, most hosts or all 
hosts.

So define a service named POSTFIX for example and use the check_smtp check 
to test it.
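
A rough sketch of such a service, with assumptions labeled: the host
name mail1 and the contact group mail-admins are hypothetical, and it
presumes a check_smtp command is already defined in your command
config. The interval values are placeholders only.

```cfg
# Hypothetical POSTFIX service tested with check_smtp.
define service{
        host_name               mail1           ; assumed host name
        service_description     POSTFIX
        check_command           check_smtp      ; assumes this command is defined
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    1
        check_period            24x7
        notification_interval   60
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          mail-admins     ; hypothetical contact group
        }
```

Most of these directives would normally live in a service template so
each real service only needs the first three lines plus "use".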

If in doubt, go over the manual a few times and study the mailing list 
archives.

For example my 4 backup DNS servers:

#   4 backup servers:
define hostgroup {
 hostgroup_name  mydyndns-org
 alias   mydyndns.org servers
 members ns2.mydyndns.org
 members ns3.mydyndns.org
 members ns4.mydyndns.org
 members ns5.mydyndns.org
}

#   Template for these hosts
define host {
 name                            mydyndns-template
 register                        0
 check_command                   check-host-alive
 max_check_attempts              3
 active_checks_enabled           1
 passive_checks_enabled          0
 check_period                    24x7
 retain_status_information       1
 retain_nonstatus_information    1
 notification_interval           60
 notification_period             24x7
 notification_options            d,u,r,f
 notifications_enabled           1
}

#   The 4 machines:
define host{
 use mydyndns-template
 host_name   ns2.mydyndns.org
 alias   DNS server 2
 address 204.13.249.82
 parents transip-switch
 hostgroups  mydyndns-org
 contact_groups  mydyndns-org
 }

define host{
 use mydyndns-template
 host_name   ns3.mydyndns.org
 alias   DNS server 3
 address 204.13.250.82
#   address 63.209.15.211
 parents transip-switch
 hostgroups  mydyndns-org
 contact_groups  mydyndns-org
 }

define host{
 use mydyndns-template
 host_name   ns4.mydyndns.org
 alias   DNS server 4
 address 213.155.150.206
 parents transip-switch
 hostgroups  mydyndns-org
 contact_groups  mydyndns-org
 }

define host{
 use mydyndns-template
 host_name   ns5.mydyndns.org
 alias   DNS server 5
 address 63.208.196.93
 parents transip-switch
 hostgroups  mydyndns-org
 contact_groups  mydyndns-org
 }

#   Now group the services on the 4 hosts:
define servicegroup {
 servicegroup_name   mydyndns-org
 alias   mydyndns.org servers
 members ns2.mydyndns.org, *
 members ns3.mydyndns.org, *
 members  

[Nagios-users] Service checks in hosts.cfg?

2007-02-24 Thread chiel
 Hello,

I have been working with Nagios for a couple of days now and I'm just beginning 
to understand the principle of the config (.cfg) files.
I understand that you create a service (let's say Ping) and put all the hosts 
(or hostgroups) in there that you want to check with ping. 

But is it also possible to define these checks in the hosts.cfg file? 
So it would become something like this:

define host{
host_name  server1
alias   Linux server 1
address 10.0.0.1
contact_groups  network_team
checks  check_ping, check_load, check_snmp <--- This line
}


This would make more sense to me, because if you have over 100 hosts to check, 
and you don't want to do it with a hostgroup, then all these hosts must be in 
services.cfg on one line (do I have this right?).

Please comment on this.

Re: [Nagios-users] Service checks when host is down

2006-10-16 Thread Andreas Ericsson
Az wrote:
> Hari Sekhon wrote:
>> Indeed, that would be ok too, since then you can disable the host 
>> check and rely entirely on service checks so you get all the service 
>> checks.
>>
>> Almost as good. Of course it won't say host down, but it will say ping 
>> failed so you can tell from that.
>>
>> Another downside is that it will pollute your web interface with 
>> loads of trivial services, whereas it is nice at the moment when all 
>> you see is the intended service.
>>
>> It would kind of spoil the web interface in my opinion.
>>
>> I think I will just wait for this to be done at the developer side, so 
>> that it is part of the normal configuration options.
> You may also find that it will blow out your service check execution 
> times if you use a ping (service) check of, say, 5 packets @ 1 second 
> apart. It doesn't sound like much but if you have a large number of hosts 
> it *might* cause you problems.
> 

I can most definitely say that it won't. A typical ICMP packet is 64 
bytes, all headers included. If you're thinking about load issues, 99% 
of the time that check_icmp, /bin/ping and other ping-like programs use 
is spent waiting for data from the host being pinged. 0.8% of the time is 
spent by the system setting up the stack and creating memory pages for 
the program, and 0.2% is spent sending packets, calculating 
response times and things like that. I can imagine that this will cause 
problems if you're monitoring tens of thousands of hosts, but if you do 
that without a robust clustering solution or some insanely beefy 
hardware and network line you'll be toast anyways.

check_http, or any other plugin that generally fetches >1KiB of data, 
possibly decrypts it, and parses a bit of text to be able to return a 
result, is easily a hundred times heavier on both system load and network 
traffic than ping checks.
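
A back-of-the-envelope sketch of the traffic side of this argument, written
in Python. It is an estimate under stated assumptions, not a measurement:
the 64-byte packet size and the 5-packet check come from this thread, while
the 300-second check interval, the "one reply per request" model and the
function name are assumptions made up for illustration.

```python
# Rough estimate of the network traffic generated by ping-style service checks.
# Assumptions (not from the thread): every host is pinged once per check
# interval, and each echo request gets exactly one reply of the same size.

PACKET_BYTES = 64       # ICMP packet size quoted in the message above
PACKETS_PER_CHECK = 5   # the "5 packets" example from the quoted message


def ping_traffic_bytes_per_second(hosts: int, check_interval_s: float) -> float:
    """Average bytes per second on the wire for one ping check per host per interval."""
    bytes_per_check = PACKET_BYTES * PACKETS_PER_CHECK * 2  # requests + replies
    return hosts * bytes_per_check / check_interval_s


if __name__ == "__main__":
    # Hypothetical installation sizes, checked every 5 minutes.
    for hosts in (100, 1000, 10000):
        rate = ping_traffic_bytes_per_second(hosts, check_interval_s=300)
        print(f"{hosts:>6} hosts: {rate:10.1f} bytes/s")
```

Under these assumptions the average traffic stays in the low kilobytes per
second even for thousands of hosts, which is consistent with the claim that
waiting, not sending, dominates a ping check.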

-- 
Andreas Ericsson   [EMAIL PROTECTED]
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231



Re: [Nagios-users] Service checks when host is down

2006-10-13 Thread Az




Hari Sekhon wrote:
> Indeed, that would be ok too, since then you can disable the host check
> and rely entirely on service checks so you get all the service checks.
>
> Almost as good. Of course it won't say host down, but it will say ping
> failed so you can tell from that.
>
> Another down side is that it will pollute your web interface with loads
> of trivial services, whereas it is nice at the moment when all you see is
> the intended service.
>
> It would kind of spoil the web interface in my opinion.
>
> I think I will just wait for this to be done at the developer side, so
> that it is part of the normal configuration options.

You may also find that it will blow out your service check execution
times if you use a ping (service) check of, say, 5 packets @ 1 second
apart. It doesn't sound like much but if you have a large number of
hosts it *might* cause you problems.



Re: [Nagios-users] Service checks when host is down

2006-10-13 Thread Hari Sekhon




Indeed, that would be ok too, since then you can disable the host check
and rely entirely on service checks so you get all the service checks.

Almost as good. Of course it won't say host down, but it will say ping
failed so you can tell from that.

Another down side is that it will pollute your web interface with loads
of trivial services, whereas it is nice at the moment when all you see is
the intended service.

It would kind of spoil the web interface in my opinion.

I think I will just wait for this to be done at the developer side, so
that it is part of the normal configuration options.

-h
Hari Sekhon


Morris, Patrick wrote:
>> well, think about it, if you have service down alerts you may
>> think that the service is broken or crashed or switched off.
>> If you see both service and host alerts then you know which
>> services are broken and you can also immediately see that it
>> is due to the machine being down, so you don't waste time
>> trying to remotely connect to it. It's more informative.
>
> There's nothing to stop you from having a service check, such as a ping
> check, that tells you whether the host is up or not.

  




Re: [Nagios-users] Service checks when host is down

2006-10-13 Thread Morris, Patrick
> well, think about it, if you have service down alerts you may 
> think that the service is broken or crashed or switched off. 
> If you see both service and host alerts then you know which 
> services are broken and you can also immediately see that it 
> is due to the machine being down, so you don't waste time 
> trying to remotely connect to it. It's more informative.

There's nothing to stop you from having a service check, such as a ping
check, that tells you whether the host is up or not.



Re: [Nagios-users] Service checks when host is down

2006-10-13 Thread Hari Sekhon




well, think about it, if you have service down alerts you may think
that the service is broken or crashed or switched off. If you see both
service and host alerts then you know which services are broken and you
can also immediately see that it is due to the machine being down, so
you don't waste time trying to remotely connect to it. It's more
informative.

The problem with host alerts is it is easy to miss what services are
broken, since you can have many services on one host. If you have, for
example, 9 things being monitored on a server, and the server goes
down, you may only be thinking of 5 or 6 services that it hosts, so you
would not be aware that another 3-4 services are also down.

It's about awareness (and wanting to spam your inbox with more emails  ;-)  ).

-h
Hari Sekhon


Morris, Patrick wrote:
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED]] On Behalf
>> Of Hari Sekhon
>> Sent: Friday, October 13, 2006 8:36 AM
>> To: vex
>> Cc: nagios-users@lists.sourceforge.net
>> Subject: Re: [Nagios-users] Service checks when host is down
>>
>> I agree that it is very good to have both, so if anyone has
>> any good ideas on this then please send them this way
>
> Why would you need both?  The sole purpose of host checks is to keep you
> from being alerted on services when the host is down.  If you want
> service alerts on a down host, you don't want a host check.  It's really
> pretty simple.

  




Re: [Nagios-users] Service checks when host is down

2006-10-13 Thread Morris, Patrick
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf 
> Of Hari Sekhon
> Sent: Friday, October 13, 2006 8:36 AM
> To: vex
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Service checks when host is down
> 
> I agree that it is very good to have both, so if anyone has 
> any good ideas on this then please send them this way

Why would you need both?  The sole purpose of host checks is to keep you
from being alerted on services when the host is down.  If you want
service alerts on a down host, you don't want a host check.  It's really
pretty simple.



Re: [Nagios-users] Service checks when host is down

2006-10-13 Thread Hari Sekhon
I agree that it is very good to have both, so if anyone has any good 
ideas on this then please send them this way

-h

Hari Sekhon



vex wrote:
> On 10/13/06, Hari Sekhon <[EMAIL PROTECTED]> wrote:
>>
>> good question, it's sometimes more important to know what services are
>> down, since you may not remember what that hostname actually does (or it
>> could have loads of services on it that you may not be able to remember
>> everything that isn't working).
>>
>> I would also like to consider doing this actually. How should I go
>> about this?
>>
>> If I don't put in a check command in the host definition, will I
>> continue to receive service down notifications and not host down
>> notifications?
>
> If you don't define a check command in the host definition you have
> found a good workaround to manage your (my) problem. But for several
> (boring) reasons I would like to have both the notifications for hosts
> and services ... and I'm very afraid that's not possible, is it?
>
> A.
>
>



Re: [Nagios-users] Service checks when host is down

2006-10-13 Thread Hari Sekhon




good question, it's sometimes more important to know what services are
down, since you may not remember what that hostname actually does (or
it could have loads of services on it that you may not be able to
remember everything that isn't working).

I would also like to consider doing this actually. How should I go
about this?

If I don't put in a check command in the host definition, will I
continue to receive service down notifications and not host down
notifications?

-h
Hari Sekhon


Morris, Patrick wrote:
>> I need Nagios to check the service status and send
>> service notifications even if the host status is DOWN. Is it
>> possible? Does Nagios already do that?
>
> No. If you need this, you probably don't want a host check.


Re: [Nagios-users] Service checks when host is down

2006-10-13 Thread Morris, Patrick
> I need Nagios to check the service status and send 
> service notifications even if the host status is DOWN. Is it 
> possible? Does Nagios already do that?

No. If you need this, you probably don't want a host check.



[Nagios-users] Service checks when host is down

2006-10-13 Thread vex
Hi
I need Nagios to check the service status and send service notifications
even if the host status is DOWN. Is it possible? Does Nagios already do that?

Thank you very much in advance.

-- 
GPG Key ID: 0x54F0EC40 - Available on: http://pgp.mit.edu
FPrint: BE76 A8B3 D5DF 6A50 63DC 3AA5 89C1 D0E8 54F0 EC40

Everybody knows what's best for you.



Re: [Nagios-users] Service Checks in Distributed mode

2006-08-21 Thread Steve Shipway
> My 2 current nagios servers are running on Dell 1750's. Each 
> has a 2.4GHz Xeon Processor and 2 Gigs of Ram. What type of 
> specs would be needed if I were to add a central server that 
> only deals with passive checks?  I am pretty sure I could 
> come up with a smaller server; would it be better to use that 
> for the centralized server, or move one of the 1750's into that role?

We're running a single twin 2.4GHz Xeon server with 2GB of memory and
it's happily handling 3450 active checks and 133 passive checks.  I
suspect we could get up to 6000 at least before it started to struggle.

Steve

--
Steve Shipway
ITSS, University of Auckland 
(09) 3737 599 x 86487
[EMAIL PROTECTED]




Re: [Nagios-users] Service Checks in Distributed mode

2006-08-21 Thread Demetri Mouratis

On Mon, 21 Aug 2006, Ian Marks wrote:

> My 2 current nagios servers are running on Dell 1750's. Each has a
> 2.4GHz Xeon Processor and 2 Gigs of Ram. What type of specs would be
> needed if I were to add a central server that only deals with passive
> checks?  I am pretty sure I could come up with a smaller server; would
> it be better to use that for the centralized server, or move one of the
> 1750's into that role?
>
> Thanks,
> Ian


Ian,

I'm familiar with the 1750s having used them in my past job.  I had a 1750 
as my Central getting updates from about 20 different Distributeds and 
accounting for about 1000 total services.  The 1750 was definitely up to 
that task and probably had RAM/CPU cycles to spare.  You could probably 
have an even smaller box, maybe a Dell 850 server as your Central.

Hope that helps!

-D



Re: [Nagios-users] Service Checks in Distributed mode

2006-08-21 Thread Ian Marks
My 2 current nagios servers are running on Dell 1750's. Each has a 
2.4GHz Xeon Processor and 2 Gigs of Ram. What type of specs would be 
needed if I were to add a central server that only deals with passive 
checks?  I am pretty sure I could come up with a smaller server; would 
it be better to use that for the centralized server, or move one of the 
1750's into that role?

Thanks,
Ian

Demetri Mouratis wrote:
> On Mon, 21 Aug 2006, Ian Marks wrote:
>
>   
>> I have 2 nagios 2.5 servers running; both are doing active checks, but
>> one acts as the "central" server and receives passive checks from the
>> other.  I have it set up this way so our analysts will only have to
>> monitor "one" server.  I am seeing major delays with service checks on
>> the central server.  I am assuming my server is getting backed up trying
>> to process all the passive checks, so the active checks are only being
>> executed every 30-45 minutes.  Is this possible?  What would be the best
>> way to solve this problem, removing the passive checks?
>> 
>
> My suggestion in a Distributed environment is to have your Central server 
> perform only passive checks.  This cleans up the configuration a bit and 
> allows for the "one page to view" feature you are after.  If, for network 
> topology issues, you need to have two Distributed active Nagios boxes, I 
> would add a third, Central and have each Distributed send results over to 
> this third box.  It helps to rsync or otherwise copy the files from 
> Distributed to Central to avoid having multiple configuration points.
>
> If your total number of services monitored is above 1,000, I would suggest 
> a slight modification from what the docs suggest.  That is to use global 
> service event handlers to notify of changes versus ocsp.  This will cut 
> down on the number of updates to your Central server dramatically and 
> reduce the lag from a change on Distributed to that change being reflected 
> on Central.
>
> Any questions, please ask.
>
> -D


Re: [Nagios-users] Service Checks in Distributed mode

2006-08-21 Thread Demetri Mouratis
On Mon, 21 Aug 2006, Ian Marks wrote:

> I have 2 nagios 2.5 servers running; both are doing active checks, but
> one acts as the "central" server and receives passive checks from the
> other.  I have it set up this way so our analysts will only have to
> monitor "one" server.  I am seeing major delays with service checks on
> the central server.  I am assuming my server is getting backed up trying
> to process all the passive checks, so the active checks are only being
> executed every 30-45 minutes.  Is this possible?  What would be the best
> way to solve this problem, removing the passive checks?

My suggestion in a Distributed environment is to have your Central server 
perform only passive checks.  This cleans up the configuration a bit and 
allows for the "one page to view" feature you are after.  If, for network 
topology issues, you need to have two Distributed active Nagios boxes, I 
would add a third, Central, and have each Distributed send results over to 
this third box.  It helps to rsync or otherwise copy the files from 
Distributed to Central to avoid having multiple configuration points.

If your total number of services monitored is above 1,000, I would suggest 
a slight modification from what the docs suggest: use a global service 
event handler to notify Central of changes, rather than ocsp_command.  This 
will cut down on the number of updates to your Central server dramatically 
and reduce the lag between a change on Distributed and that change being 
reflected on Central.
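
To make the difference in update volume concrete, here is a rough Python
sketch. It is a simplified model and not how Nagios is implemented: it
assumes results are shipped to Central with send_nsca (the thread does not
say which transport is used), and it treats "event handler" as "send only
when the return code differs from the previous one", whereas real event
handlers also fire on soft-state retries. The function names and the sample
history are made up for illustration.

```python
from typing import Iterable, Optional


def nsca_line(host: str, service: str, return_code: int, output: str) -> str:
    """Format one passive service result as the tab-separated line that
    send_nsca reads on standard input."""
    return f"{host}\t{service}\t{return_code}\t{output}\n"


def updates_every_check(return_codes: Iterable[int]) -> int:
    """ocsp-style forwarding: one update to Central per check result."""
    return sum(1 for _ in return_codes)


def updates_on_change(return_codes: Iterable[int]) -> int:
    """Simplified event-handler-style forwarding: one update per state change.
    The very first result counts as a change so Central learns the state."""
    count = 0
    previous: Optional[int] = None
    for code in return_codes:
        if code != previous:
            count += 1
        previous = code
    return count


if __name__ == "__main__":
    # Hypothetical history for one service: OK, OK, OK, CRITICAL, CRITICAL, OK, OK, OK
    history = [0, 0, 0, 2, 2, 0, 0, 0]
    print(updates_every_check(history))  # 8 updates sent to Central
    print(updates_on_change(history))    # 3 updates sent to Central
    print(nsca_line("web1", "HTTP", 2, "CRITICAL - timeout"), end="")
```

For services that rarely change state, the second strategy sends far fewer
results, which is the reduction described above.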

Any questions, please ask.

-D



[Nagios-users] Service Checks in Distributed mode

2006-08-21 Thread Ian Marks
I have 2 nagios 2.5 servers running; both are doing active checks, but 
one acts as the "central" server and receives passive checks from the 
other.  I have it set up this way so our analysts will only have to 
monitor "one" server.  I am seeing major delays with service checks on 
the central server.  I am assuming my server is getting backed up trying 
to process all the passive checks, so the active checks are only being 
executed every 30-45 minutes.  Is this possible?  What would be the best 
way to solve this problem, removing the passive checks?

Here are the stats:

##Central server##

Program Running Time:                  0d 4h 31m 26s

Total Services:                        812
Services Checked:                      812
Services Scheduled:                    439
Active Service Checks:                 439
Passive Service Checks:                373
Total Service State Change:            0.000 / 34.210 / 0.121 %
Active Service Latency:                449.349 / 1996.148 / 1329.618 %
Active Service Execution Time:         0.049 / 60.556 / 7.779 sec
Active Service State Change:           0.000 / 11.580 / 0.065 %
Active Services Last 1/5/15/60 min:    0 / 0 / 427 / 439
Passive Service State Change:          0.000 / 34.210 / 0.188 %
Passive Services Last 1/5/15/60 min:   0 / 0 / 94 / 367
Services Ok/Warn/Unk/Crit:             720 / 8 / 18 / 66
Services Flapping:                     0
Services In Downtime:                  0

Total Hosts:                           243
Hosts Checked:                         243
Hosts Scheduled:                       0
Active Host Checks:                    178
Passive Host Checks:                   65
Total Host State Change:               0.000 / 27.760 / 0.149 %
Active Host Latency:                   0.000 / 2751.689 / 25.196 %
Active Host Execution Time:            0.016 / 10.015 / 1.171 sec
Active Host State Change:              0.000 / 0.000 / 0.000 %
Active Hosts Last 1/5/15/60 min:       2 / 5 / 17 / 19
Passive Host State Change:             0.000 / 27.760 / 0.557 %
Passive Hosts Last 1/5/15/60 min:      1 / 32 / 62 / 63
Hosts Up/Down/Unreach:                 212 / 31 / 0
Hosts Flapping:                        0
Hosts In Downtime:                     0

##Server Submitting Passive Checks##

Program Running Time:                  3d 20h 24m 19s

Total Services:                        367
Services Checked:                      367
Services Scheduled:                    367
Active Service Checks:                 367
Passive Service Checks:                0
Total Service State Change:            0.000 / 6.250 / 0.017 %
Active Service Latency:                233.540 / 349.122 / 284.721 %
Active Service Execution Time:         0.023 / 34.478 / 4.352 sec
Active Service State Change:           0.000 / 6.250 / 0.017 %
Active Services Last 1/5/15/60 min:    0 / 151 / 367 / 367
Passive Service State Change:          0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min:   0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit:             329 / 6 / 18 / 14
Services Flapping:                     0
Services In Downtime:                  0

Total Hosts:                           63
Hosts Checked:                         63
Hosts Scheduled:                       0
Active Host Checks:                    63
Passive Host Checks:                   0
Total Host State Change:               0.000 / 0.000 / 0.000 %
Active Host Latency:                   0.000 / 381.349 / 309.423 %
Active Host Execution Time:            0.110 / 18.916 / 3.671 sec
Active Host State Change:              0.000 / 0.000 / 0.000 %
Active Hosts Last 1/5/15/60 min:       0 / 35 / 63 / 63
Passive Host State Change:             0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min:      0 / 0 / 0 / 0
Hosts Up/Down/Unreach:                 57 / 6 / 0
Hosts Flapping:                        0
Hosts In Downtime:                     0



Thanks,
Ian

