Re: [Nagios-users] NRPE/NSCA replacement thoughts?

2010-02-19 Thread Jonathan Call

Here is my $0.02:

I have a distributed Nagios2 system with 24,000+ service checks and 4000+ 
hosts. I rely heavily on NSCA to get the results from the slaves to the master. 
My issue seems to be with Nagios, since I can't get a Nagios slave to process a 
mere thousand service checks using the documented method specified for NSCA 
before it starts overwhelming the server. I've had to resort to using the 
OCP_daemon method instead. No complaints about what NSCA does, just with how 
poorly it seems to work within Nagios itself.
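
For readers comparing the two approaches: the documented method runs an ocsp_command after every check, which typically means one send_nsca process per result, while an OCP_daemon-style setup collects results and forwards many of them at once. The following is a minimal Python sketch of the batching idea only. It assumes the tab-delimited line format that send_nsca reads on standard input; the function names and the batch size are hypothetical, and this is not a description of how OCP_daemon itself is written.

```python
def format_passive_result(host, service, return_code, output):
    """Build one record in the tab-delimited form send_nsca reads on stdin."""
    # Tabs or newlines inside the plugin output would corrupt the record,
    # so flatten them to spaces first.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean}\n"


def batch_payloads(results, batch_size):
    """Group results so that one sender invocation can carry many checks.

    `results` is an iterable of (host, service, return_code, output) tuples.
    Returns a list of strings, each holding up to `batch_size` records.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    lines = [format_passive_result(*result) for result in results]
    return [
        "".join(lines[start:start + batch_size])
        for start in range(0, len(lines), batch_size)
    ]
```

Each returned string could then be written to the standard input of a single send_nsca process, so the fork cost is paid once per batch instead of once per check.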


> -Original Message-
> From: Michael Medin [mailto:mich...@medin.name]
> Sent: Thursday, February 18, 2010 11:26 AM
> To: nagios-users
> Subject: [Nagios-users] NRPE/NSCA replacement thoughts?
> 
> Hello
> 
> Since I am pondering a replacement for the NSCA and NRPE protocol I
> thought I would get some thoughts from the community?
> So this is pretty much an "open floor" kind of thing to get some sense
> of what people actually need and would want (if anything at all).


This email message is intended for the use of the person to whom it has been 
sent, and may contain information that is confidential or legally protected. If 
you are not the intended recipient or have received this message in error, you 
are not authorized to copy, distribute, or otherwise use this message or its 
attachments. Please notify the sender immediately by return e-mail and 
permanently delete this message and any attachments. Verio, Inc. makes no 
warranty that this email is error or virus free.  Thank you.

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] When to HUP and when to restart?

2009-12-24 Thread Jonathan Call

If you’re using the embedded Perl interpreter, a restart is probably better 
since the interpreter leaks memory.

If you have a very large installation (thousands of service checks), a restart 
will take a considerable amount of time, so a HUP would probably be wise in 
that situation.
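
A HUP can be sent from a script as well as with kill. Below is a minimal Python sketch, assuming the daemon writes its PID to the file named by the lock_file directive in nagios.cfg; the helper names are hypothetical, and the path has to be supplied by the caller because it differs between installs.

```python
import os
import signal


def read_pid(lock_file_path):
    """Return the PID stored in a Nagios-style lock file."""
    with open(lock_file_path) as handle:
        return int(handle.read().strip())


def reload_nagios(lock_file_path):
    """Send SIGHUP so the running daemon re-reads its configuration."""
    pid = read_pid(lock_file_path)
    # os.kill only delivers the signal; it does not wait for the reload.
    os.kill(pid, signal.SIGHUP)
    return pid
```

The script needs permission to signal the Nagios process, so it would normally run as root or as the Nagios user.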

Jonathan

> -Original Message-
> From: Jim Avery [mailto:avery...@gmail.com]
> Sent: Thursday, December 24, 2009 6:32 AM
> To: nagios List
> Subject: [Nagios-users] When to HUP and when to restart?
> 
> Thanks to Patrick mentioning you can send a HUP to get Nagios to
> reload its config (how on earth did I not know that??), it got me
> wondering...
> 
> When, if at all, do I need to do a full restart of the Nagios daemon?
> 
> Cheers and Happy Christmas everyone.
> 
> Jim
> 
> (p.s. I'm sorry if this is the second time you've seen this. I've been
> getting bounce notifications when posting to the nagios-users list so
> am trying again from my gmail address).
> 

Re: [Nagios-users] Nagios2 process overwhelmed by NSCA daemon?

2009-12-14 Thread Jonathan Call

See responses inline:

> -Original Message-
> From: Thomas Guyot-Sionnest [mailto:derm...@aei.ca]
> Sent: Sunday, December 13, 2009 9:23 PM
> To: Jonathan Call
> Cc: nagios-user Mailinglist
> Subject: Re: [Nagios-users] Nagios2 process overwhelmed by NSCA daemon?
> 
> On 09/12/09 06:06 PM, Jonathan Call wrote:
> > I recently added two new slaves to a distributed Nagios system. The
> > central server now passively processes 17,000+ service checks on 3000+
> > servers.
> >
> > It's been over an hour and a half since I brought those new slaves
> > online and I have about 150 hosts still stuck in 'Pending' and about
> > 1300 services in the same state. In addition to that it seems that the
> > service check results from the other slaves that were working normally
> > are now arbitrarily disappearing. For example, on one host three of the
> > service checks have been updated relatively recently (i.e. 5-30 minutes
> > ago) but three other service checks haven't been updated for almost an
> > hour. The slaves all appear operational and the hosts are being checked
> > on time. Is it possible I've overwhelmed Nagios' ability to process data
> > from the NSCA daemon or struck some internal Nagios bottleneck? Any
> > suggestions would be appreciated.
> 
> Hu Very interesting. Which Nagios version are you using?

Nagios 2.12 (May 19, 2008) on FreeBSD 6.3

> 
> This sounds a lot like a problem I encountered a few years ago with
> passive checks. I had about 50-60 servers returning cron-scheduled check
> results to the Nagios server. 120 results ain't that much, but it seemed
> that with all the servers fully time-synced (using NTP), out of these
> ~120 results I was often missing some of them, which would eventually
> cause false alarms due to stale services.
> 
> I could easily reproduce the problem by feeding lots of results to
> Nagios right when I was expecting a batch of passive results - this
> would cause random results to be dropped. I spent some time trying to
> debug this but I couldn't figure out where commands were dropped. My
> primary target was the ring buffer used by the command reaper. As far as
> I can remember I tested with versions of Nagios ranging from 2.3 to 2.5;
> I never tried with a recent version.
> 
> If you're running a recent version of Nagios, what do you get for
> "Used/High/Total Command Buffers" in the "nagiostats" command output?
> (You can also get these numbers from the web interface, "Performance
> Info" in the left bar.) If it seems to be maxed out, you may try
> setting "command_check_interval" to "-1" and raising the
> "external_command_buffer_slots" option in nagios.cfg.
>

Buffer report from Nagiostats:
Used/High/Total Command Buffers:  25 / 4096 / 4096
Used/High/Total Check Result Buffers: 0 / 4096 / 4096

Nagios config:
command_check_interval=-1
external_command_buffer_slots=4096
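
The buffer check can also be scripted. Here is a minimal Python sketch that parses the two nagiostats lines quoted above and flags any buffer whose "High" value has reached its "Total". It assumes "High" is a high-water mark, i.e. the most slots ever in use at once; the function names are hypothetical.

```python
import re

# Matches lines such as
#   "Used/High/Total Command Buffers:  25 / 4096 / 4096"
BUFFER_LINE = re.compile(
    r"Used/High/Total (?P<name>[\w ]+?) Buffers:"
    r"\s*(?P<used>\d+)\s*/\s*(?P<high>\d+)\s*/\s*(?P<total>\d+)"
)


def parse_buffer_stats(text):
    """Map each buffer name to a (used, high, total) tuple of integers."""
    stats = {}
    for match in BUFFER_LINE.finditer(text):
        stats[match.group("name")] = tuple(
            int(match.group(key)) for key in ("used", "high", "total")
        )
    return stats


def ever_full(stats):
    """Return the names of buffers whose high-water mark reached the total."""
    return [name for name, (_used, high, total) in stats.items() if high >= total]
```

Under that assumption, the report above would flag both buffers, since each shows a High of 4096 against a Total of 4096, which would mean both filled up completely at some point even though current usage is low.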

 
> 
> If you're still having this problem with Nagios v3 and up I might try to
> reproduce this as well, and maybe I'll be able to figure out what's
> wrong this time.

Upgrading to Nagios v3 is being considered but isn't possible at this
time.

As I mentioned to someone else on this thread, having a large number of
queries (status.cgi) run against the web interface seems to provoke poor
performance from the central server. This is even after we switched the
main objects.cache and status.dat files to a memory disk.

Jonathan




Re: [Nagios-users] Nagios2 process overwhelmed by NSCA daemon?

2009-12-10 Thread Jonathan Call

Yes, full Nagios is running on the slaves. They use OCP_daemon to pass on data 
to the central server since the NSCA client can't hack the load. They seem to 
be sending data properly to the NSCA daemon. 

I've tracked part of the issue down to status.cgi. The central server 
appears to be underpowered when it comes to both having Nagios process data AND 
having several people pounding out host/service status queries from the web 
interface. I will be adding another CPU to see if this helps. However, I'm 
dismayed that Nagios on the central server doesn't seem to be reporting any 
errors or indicating that there is any problem processing passive results. 
Nagios just starts to lose the data at a certain point.

Jonathan 

> -Original Message-
> From: Greg Pangrazio [mailto:pangr...@gmail.com]
> Sent: Thursday, December 10, 2009 7:26 AM
> To: Jonathan Call
> Cc: nagios-user Mailinglist
> Subject: Re: [Nagios-users] Nagios2 process overwhelmed by NSCA daemon?
> 
> Are you running the full nagios on the "slaves"?  Do the checks seem
> to be working on those hosts?
> 
> Greg Pangrazio
> pangr...@gmail.com
> 
> 
> 
> 
> 
> On Wed, Dec 9, 2009 at 5:06 PM, Jonathan Call wrote:
> > I recently added two new slaves to a distributed Nagios system. The
> > central server now passively processes 17,000+ service checks on 3000+
> > servers.
> >
> > It's been over an hour and a half since I brought those new slaves
> > online and I have about 150 hosts still stuck in 'Pending' and about
> > 1300 services in the same state. In addition to that it seems that the
> > service check results from the other slaves that were working normally
> > are now arbitrarily disappearing. For example, on one host three of the
> > service checks have been updated relatively recently (i.e. 5-30 minutes
> > ago) but three other service checks haven't been updated for almost an
> > hour. The slaves all appear operational and the hosts are being checked
> > on time. Is it possible I've overwhelmed Nagios' ability to process data
> > from the NSCA daemon or struck some internal Nagios bottleneck? Any
> > suggestions would be appreciated.
> >
> > Jonathan

[Nagios-users] Nagios2 process overwhelmed by NSCA daemon?

2009-12-09 Thread Jonathan Call

I recently added two new slaves to a distributed Nagios system. The
central server now passively processes 17,000+ service checks on 3000+
servers. 

It's been over an hour and a half since I brought those new slaves
online and I have about 150 hosts still stuck in 'Pending' and about
1300 services in the same state. In addition to that it seems that the
service check results from the other slaves that were working normally
are now arbitrarily disappearing. For example, on one host three of the
service checks have been updated relatively recently (i.e. 5-30 minutes
ago) but three other service checks haven't been updated for almost an
hour. The slaves all appear operational and the hosts are being checked
on time. Is it possible I've overwhelmed Nagios' ability to process data
from the NSCA daemon or struck some internal Nagios bottleneck? Any
suggestions would be appreciated.

Jonathan



Re: [Nagios-users] Lilac 1.0.3 is Released! (A Nagios ConfigurationTool)

2009-10-12 Thread Jonathan Call

Any timetable for a STABLE release (i.e. not beta)?

Any timetable for supporting distributed deployments?

Jonathan

> -Original Message-
> From: Taylor Dondich [mailto:tdond...@gmail.com]
> Sent: Monday, October 12, 2009 1:10 PM
> To: nagios-user Mailinglist
> Subject: [Nagios-users] Lilac 1.0.3 is Released! (A Nagios
> ConfigurationTool)
> 
> Lilac, the most popular Nagios configuration tool, has just released
> version 1.0.3.  This is a bugfix release which fixes over 30 bugs with
> improvements made by its users.  As always, Lilac Configuration has
> the following features:
> 
> * Advanced Nagios 3.x Timeperiod Support
> * Advanced Host and Service Templates (Even cooler than what Nagios
> supports by default!)
> * Flexible importer to import existing Nagios 2.x and 3.x
> configurations
> * Auto-discovery system powered by NMAP to quickly bring in new hosts.
> 
> You can download the latest version of Lilac at
> http://www.lilacplatform.com/downloads
> 
> Interesting note:  Lilac 1.0.2 which was released on 4/17/2009 was
> downloaded 5720 times!  That's an average of 32 times a day.  Thanks
> to the vibrant Lilac community!
> 
> --
> Taylor Dondich
> Check out Lilac, a configuration tool for Nagios 3 at
> http://www.lilacplatform.com
> 
> Check out my Shortcut with O'Reilly Press:
> Network Monitoring with Nagios:
> http://oreilly.com/catalog/9780596528195/index.html
> 

Re: [Nagios-users] NSCA speed problem

2009-09-10 Thread Jonathan Call

Have you considered OCP_daemon?

http://wiki.nagios.org/index.php/OCP_Daemon

> -Original Message-
> From: d...@chatham.org [mailto:d...@chatham.org]
> Sent: Tuesday, September 08, 2009 1:00 PM
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] NSCA speed problem
> 
> I have a Nagios setup that is monitoring ~ 1000 hosts and ~ 13,000
> services.  The active checks are run on a Sun box with 128 CPUs/cores.
> Since it appeared that status.cgi could only be single threaded, it meant
> that the Sun box was slow in putting a page together, so all checks were
> forwarded to a fast Intel machine which puts together the page in about 2
> seconds instead of about 16 on the SPARC.
> 
> However, NSCA is now slowing the process, either on the sending or the
> receiving end.  There are only two NSCA processes running, so I suspect
> that this is the problem.
> 
> I can think of a number of alternatives.  One would be to load up
> ndoutils, which looks like a fine solution, but I'm a bit under the gun
> here and I'd really like to find something that works quickly.
> 
> An alternative might be to use syslog to get the data from one machine to
> another.
> 
> Any ideas, suggestions?
> 

[Nagios-users] Quick and easy way to monitor Nagios itself?

2009-09-04 Thread Jonathan Call

Since I have a large distributed Nagios system, the possibility of a
Nagios process going AWOL on one of my many servers is a serious
concern. Has anyone come up with a sure way (e.g. a cron job) to confirm
that Nagios is processing checks properly?

For example, I had one OCP_daemon process die; as a result the Nagios
process hung for quite some time before it was discovered. Freshness
checking is not an option because many hosts are behind firewalls or on
private networks and so the central server has active checks disabled
globally.
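
One possible shape for such a cron job, offered as a sketch rather than a proven answer: check how recently the status file was rewritten. It assumes the daemon updates its status file at a regular interval (the status_update_interval setting), so a stale modification time suggests a hung or dead process. The function names and the threshold are hypothetical, and the status file path has to come from the local nagios.cfg.

```python
import os
import time


def seconds_since_update(status_file, now=None):
    """Return how long ago the status file was last modified, in seconds."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(status_file)


def nagios_looks_alive(status_file, max_age_seconds, now=None):
    """True if the status file exists and was rewritten recently enough."""
    try:
        return seconds_since_update(status_file, now) <= max_age_seconds
    except OSError:
        # A missing or unreadable status file is treated as a failure.
        return False
```

A cron wrapper could call nagios_looks_alive() every few minutes and send mail or restart the daemon when it returns False. This only shows that the status file is still being written; it does not prove that every passive result is being processed.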

Jonathan



Re: [Nagios-users] nagios 3.0.3 on FreeBSD defunct process

2009-04-02 Thread Jonathan Call

That sounds very similar to the locking/contention issue FreeBSD 7.x has with 
Nagios 2.x. It has to do with how Nagios and FreeBSD handle threading. 
Unfortunately I don’t have any answers on how to fix it. I’ve had to leave my 
Nagios deployment on FreeBSD 6/Nagios 2 for the same reason: anything newer 
would lock up due to defunct Nagios processes.

 

Jonathan

 

From: Gian Paolo Buono [mailto:gpbu...@gmail.com] 
Sent: Tuesday, March 31, 2009 5:45 AM
To: nagios-users@lists.sourceforge.net; lei chen
Subject: Re: [Nagios-users] nagios 3.0.3 on FreeBSD defunct process

 

Hi, 

I don't have NDOUtils, and enable_embedded_perl is disabled (enable_embedded_perl=0) :(
Any ideas?

bye... 

2009/3/31 lei chen 

Are you using NDOUtils here?
Or the enable_embedded_perl option?

2009/3/27 Gian Paolo Buono :

> Hi, I have a server running FreeBSD 7.1-RELEASE-p2 and Nagios 3.0.3 with
> 950 hosts and 4900 services.
>
> Sometimes Nagios doesn't update the status, and when I try to stop Nagios it
> doesn't die. I try to kill -9 the process but it still doesn't die; there are
> many defunct Nagios processes, so I have to reboot the server. I don't have
> any logs.
>
> Any ideas? Thank you for the support. Bye.
>
>




--
Thanks,

Chenlei & 石头++

MSN Messenger: c...@163.com

 




Re: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool, has released 1.0 Release Candidate 1.

2009-03-10 Thread Jonathan Call

Our implementation is pretty much right out of the Nagios documentation. The 
only thing that might be 'special' is that one of our slave servers is actually 
running two instances of Nagios; each instance is considered a 'slave' to the 
master.


> -Original Message-
> From: Taylor Dondich [mailto:tdond...@gmail.com]
> Sent: Monday, March 09, 2009 10:28 AM
> To: Jonathan Call
> Cc: nagios-user Mailinglist
> Subject: Re: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool,has
> released 1.0 Release Candidate 1.
> 
> That's on the map for 1.2.  The first thing to determine is, what is
> the best way to handle distributed environments properly.  Do we have
> a single master with multiple slave monitoring servers?  How do most
> people do distributed monitoring?
> 
> Taylor
> 
> On Mon, Mar 9, 2009 at 7:46 AM, Jonathan Call  wrote:
> > I don't see it mentioned anywhere so I thought I would ask,
> >
> > Does Lilac support distributed Nagios deployments?
> >
> > Jonathan
> >
> >> -Original Message-
> >> From: Taylor Dondich [mailto:tdond...@gmail.com]
> >> Sent: Sunday, March 08, 2009 10:12 PM
> >> To: nagios-user Mailinglist
> >> Subject: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool,has
> >> released 1.0 Release Candidate 1.
> >>
> >> Lilac, the Nagios configuration tool with the MOST coverage of 3.x
> >> features, has released 1.0 release candidate 1.  This version
> >> features:
> >>  - Multiple Template Inheritance
> >>  - Advanced Timeperiod Definitions
> >>  - Enhanced Templates (Attach services, dependencies, escalations to
> >> host templates, something you CAN'T do with regular Nagios config
> >> files)
> >>  - Robust Auto-Discovery system
> >>  - Import existing Nagios 2.x and Nagios 3.x configurations
> >>  - Import configurations from existing Fruity installations
> >>  - Export to Nagios 3.x, perform pre-flight checks and restart Nagios at
> >> will
> >>  - Background Import/Export/Auto-Discovery processes (no need to wait
> >> at the browser for your exports/imports/discovery processes to take
> >> place)
> >>
> >> Take a look, join the community, and help build the most powerful
> >> configuration tool for Nagios out there!
> >>
> >> Downloads and Documentation is available at www.lilacplatform.com
> >>
> >> --
> >> Taylor Dondich
> >> Check out Lilac, a configuration tool for Nagios 3 at
> >> http://www.lilacplatform.com
> >>
> >> Check out my Shortcut with O'Reilly Press:
> >> Network Monitoring with Nagios:
> >> http://oreilly.com/catalog/9780596528195/index.html
> >>
> 
> 
> 
> --
> Taylor Dondich
> Check out Lilac, a configuration tool for Nagios 3 at
> http://www.lilacplatform.com
> 
> Check out my Shortcut with O'Reilly Press:
> Network Monitoring with Nagios:
> http://oreilly.com/catalog/9780596528195/index.html



Re: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool, has released 1.0 Release Candidate 1.

2009-03-09 Thread Jonathan Call

I don't see it mentioned anywhere so I thought I would ask:

Does Lilac support distributed Nagios deployments?

Jonathan

> -Original Message-
> From: Taylor Dondich [mailto:tdond...@gmail.com]
> Sent: Sunday, March 08, 2009 10:12 PM
> To: nagios-user Mailinglist
> Subject: [Nagios-users] Lilac, a Nagios 3.x Configuration Tool,has
> released 1.0 Release Candidate 1.
> 
> Lilac, the Nagios configuration tool with the MOST coverage of 3.x
> features, has released 1.0 release candidate 1.  This version
> features:
>  - Multiple Template Inheritance
>  - Advanced Timeperiod Definitions
>  - Enhanced Templates (Attach services, dependencies, escalations to
> host templates, something you CAN'T do with regular Nagios config
> files)
>  - Robust Auto-Discovery system
>  - Import existing Nagios 2.x and Nagios 3.x configurations
>  - Import configurations from existing Fruity installations
>  - Export to Nagios 3.x, perform pre-flight checks and restart Nagios at
> will
>  - Background Import/Export/Auto-Discovery processes (no need to wait
> at the browser for your exports/imports/discovery processes to take
> place)
> 
> Take a look, join the community, and help build the most powerful
> configuration tool for Nagios out there!
> 
> Downloads and Documentation is available at www.lilacplatform.com
> 
> --
> Taylor Dondich
> Check out Lilac, a configuration tool for Nagios 3 at
> http://www.lilacplatform.com
> 
> Check out my Shortcut with O'Reilly Press:
> Network Monitoring with Nagios:
> http://oreilly.com/catalog/9780596528195/index.html


___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] Nagios 3.0.6 on 10.5.6 Server

2009-01-16 Thread Jonathan Call

> -Original Message-
> From: Randall R. Saeks [mailto:rsa...@district30.k12.il.us]
> Sent: Friday, January 16, 2009 1:55 PM
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] Nagios 3.0.6 on 10.5.6 Server
> 
> Ever since I upgraded my server running Nagios 3.0.6 to 10.5.6, I
> can't get Nagios to launch.  When I try to start it via the CLI
> command, the following gets returned in the nagios.log:
> 
> [1232136591] Nagios 3.0.6 starting... (PID=27107)
> [1232136591] Local time is Fri Jan 16 14:09:51 CST 2009
> [1232136591] LOG VERSION: 2.0
> [1232136591] Finished daemonizing... (New PID=27109)
> [1232136591] Error: Could not create external command file '/opt/local/
> var/nagios/rw/nagios.cmd' as named pipe: (22) -> Invalid argument.  If
> this file already exists and you are sure that another copy of Nagios
> is not running, you should delete this file.
> [1232136591] Bailing out due to errors encountered while trying to
> initialize the external command file... (PID=27109)
> 
> I've deleted said file, but no such luck in getting it to run.
> 
> While the web-interface is there and running, if I go to tactical
> overview or any of the other menus say that Nagios isn't running
> (which makes sense since the app bails).
> 
> Does anyone have any ideas on this and something to check out / try?
> I have installed this through Macports.
> 
> Thanks
> 
> 
> Randy Saeks, ACSA
> Network & Server Administrator
> Northbrook / Glenview School District 30
> 

Is it me or do you have a space in that file path?

'/opt/local/ var/nagios/rw/nagios.cmd'

Looks like a typo in your nagios.cfg file.
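
One way to check for this is to pull the command_file value out of the config and flag any embedded whitespace. This is only a minimal sketch: the function name check_command_file is made up for illustration, and the config path is whatever your installation uses.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): report whether the command_file
# value in a nagios.cfg-style file contains whitespace.
check_command_file() {
    cfg="$1"
    # Keep the last uncommented command_file= line, trimmed at both ends.
    value=$(sed -n 's/^[[:space:]]*command_file[[:space:]]*=[[:space:]]*//p' "$cfg" |
        sed 's/[[:space:]]*$//' | tail -n 1)
    if [ -z "$value" ]; then
        echo "command_file not set in $cfg"
        return 2
    fi
    if printf '%s' "$value" | grep -q '[[:space:]]'; then
        echo "whitespace in command_file: '$value'"
        return 1
    fi
    echo "command_file looks fine: '$value'"
    return 0
}
```

Run it as check_command_file /path/to/nagios.cfg. A return status of 1 means the path contains a space or tab and is worth a second look.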





Re: [Nagios-users] Hosts report 'DOWN, HARD' after first attempt.

2009-01-16 Thread Jonathan Call

> -Original Message-
> From: Marc Powell [mailto:m...@ena.com]
> Sent: Friday, January 16, 2009 1:20 PM
> To: nagios-users Mailinglist
> Subject: Re: [Nagios-users] Hosts report 'DOWN, HARD' after first attempt.
> 
> 
> On Jan 16, 2009, at 12:40 PM, Patrick Morris wrote:
> 
> > The max_check_attempts only applies to active checks, not the passive
> > ones you're sending the central server (at least I assume when you
> > said max_retry_interval you meant max_check_attempts)  -- and you may
> > note that SOFT and HARD are only relative to the server doing the
> > checking; they probably aren't passed as part of the passive check
> > submission process.
> 
> Correct, all passive host checks are assumed to be HARD states. Note
> that this is addressed in nagios-3 --
> 
> http://nagios.sourceforge.net/docs/3_0/configmain.html#passive_host_checks_are_soft
> 
> --
> Marc
> 

If they're all assumed to be SOFT, then a host failure would never
trigger a notification?

Another potential option, if you're not using NSCA (like those using the
OCP_daemon) is to have the slave servers send out the notification
emails instead of the central one. The slaves would be active monitors
and would honor the host's max_check_attempts variable. This of course
introduces other problems if the slave is behind a restrictive firewall.

Jonathan


Re: [Nagios-users] Hosts report 'DOWN, HARD' after first attempt.

2009-01-16 Thread Jonathan Call



> -Original Message-
> From: Patrick Morris [mailto:patrick.mor...@hp.com]
> Sent: Friday, January 16, 2009 11:40 AM
> To: Jonathan Call
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Hosts report 'DOWN, HARD' after first attempt.
> 

...

> 
> I'm not sure exactly how you're passing check results to the central
> server, but you may want to consider modifying the process to only send
> host check results when they are in a hard state.

That sounds like an excellent recommendation. Here is my host check
command:
$USER1$/custom/submit_host_check_result.sh $HOSTNAME$ $HOSTSTATEID$ '$HOSTOUTPUT$'

I'll need to modify it to be like this:

$USER1$/custom/submit_host_check_result.sh $HOSTNAME$ $HOSTSTATEID$ '$HOSTOUTPUT$' '$HOSTSTATETYPE$'

And then my NSCA host script would then become:

--
#!/bin/sh

# Arguments and corresponding NAGIOS API variable
#  $1 = $HOSTNAME$
#  $2 = $HOSTSTATEID$
#  $3 = $HOSTOUTPUT$
#  $4 = $HOSTSTATETYPE$
#
# The variables must be piped in as tab delimited variables
# with a newline termination

if [ "$4" = "HARD" ]; then
   /usr/bin/printf "%s\t%s\t%s\n" "$1" "$2" "$3" |
      /usr/local/sbin/send_nsca XXX.XXX.XXX.XXX -c /usr/local/etc/send_nsca.cfg
fi

# Do nothing for SOFT

--

Thank you,

Jonathan
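
Before deploying a filter like this, it may help to exercise it without contacting the central server. The sketch below wraps the same HARD-only logic in a function and makes the sender overridable. SEND_NSCA_CMD and submit_host_result are hypothetical names introduced here, not NSCA or Nagios settings; the default sender is the command line from the script above.

```shell
#!/bin/sh
# Sketch only: HARD-only filter with an overridable sender for local testing.
# SEND_NSCA_CMD is a made-up variable; its default mirrors the script above.
: "${SEND_NSCA_CMD:=/usr/local/sbin/send_nsca XXX.XXX.XXX.XXX -c /usr/local/etc/send_nsca.cfg}"

submit_host_result() {
    # $1 = host name, $2 = state ID, $3 = plugin output, $4 = state type
    if [ "$4" = "HARD" ]; then
        # Tab-delimited and newline-terminated, written to the sender's stdin.
        # $SEND_NSCA_CMD is left unquoted on purpose so it splits into words.
        printf '%s\t%s\t%s\n' "$1" "$2" "$3" | $SEND_NSCA_CMD
    fi
    # SOFT results are dropped on purpose.
    return 0
}
```

Setting SEND_NSCA_CMD=cat and calling submit_host_result by hand prints the line that would have been sent for a HARD state and prints nothing for a SOFT one.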



[Nagios-users] Hosts report 'DOWN, HARD' after first attempt.

2009-01-16 Thread Jonathan Call

I am running a distributed monitoring system using Nagios 2.11 on
FreeBSD 6.3. I use NSCA to send host and services events to the central
server from the slave servers and have always had the following problem:

A distributed server notices a host service is "non-OK" and fires off
check-host-alive. I have it set up to run check_icmp, so it sends five
ICMP packets. Since the network isn't always perfect, those five packets
sometimes get dropped. However, I have my max_retry_interval set to 3, so
it fires off another check_icmp, which completes just fine. As a result I
see the following events take place on the slave server:

[01-16-2009 15:18:46] HOST ALERT: s3200.blah.net;UP;SOFT;2;OK - 10.XX.XX.XX: rta 100.294ms, lost 0%
[01-16-2009 15:18:46] HOST ALERT: s3200.blah.net;DOWN;SOFT;1;CRITICAL - 10.XX.XX.XX: rta nan, lost 100%

However on the central server I see the following:

[01-16-2009 15:19:02] HOST NOTIFICATION: NOC-email;s3200.blah.net;UP;host-notify-by-email;OK - 10.XX.XX.XX: rta 100.294ms, lost 0%
[01-16-2009 15:19:01] HOST ALERT: s3200.blah.net;UP;HARD;1;OK - 10.XX.XX.XX: rta 100.294ms, lost 0%
[01-16-2009 15:19:01] HOST NOTIFICATION: NOC-email;s3200.blah.net;DOWN;host-notify-by-email;CRITICAL - 10.XX.XX.XX: rta nan, lost 100%
[01-16-2009 15:19:01] HOST ALERT: s3200.blah.net;DOWN;HARD;1;CRITICAL - 10.XX.XX.XX: rta nan, lost 100%

The central server is immediately flagging the host as DOWN, HARD in
spite of having the same max_retry_interval = 3 setting. On some hosts
this is generating a ton of false "HOST DOWN" notifications. Is there
any way to fix it?

Jonathan Call





Re: [Nagios-users] Lilac 1.0 beta 1 released! The most robust Nagios3.x configuration tool available!

2008-12-24 Thread Jonathan Call

Does Lilac support distributed configuration? I looked over the site
briefly and did not see any such capability.

Jonathan

> -Original Message-
> From: Taylor Dondich [mailto:tdond...@gmail.com]
> Sent: Wednesday, December 10, 2008 12:45 PM
> To: nagios-user Mailinglist
> Subject: [Nagios-users] Lilac 1.0 beta 1 released! The most robust
> Nagios3.x configuration tool available!
> 
> Lilac 1.0 beta 1 is NOW released.  Featuring a robust importer for
> importing existing Nagios 2.x and 3.x configurations, an exporter to
> export to Nagios 3.x and a robust Auto-Discovery system.  Downloads
> and documentation at http://www.lilacplatform.com
> 
> Thanks to everyone for their extensive testing of Alpha and filing
> bugs and suggestions.  Beta 1 is *very* awesome.
> 
> --
> Taylor Dondich
> Check out Lilac, a configuration tool for Nagios 3 at
> http://www.lilacplatform.com
> 
> Check out my Shortcut with O'Reilly Press:
> Network Monitoring with Nagios:
> http://oreilly.com/catalog/9780596528195/index.html



Re: [Nagios-users] NSCA and Latency

2008-10-23 Thread Jonathan Call

NSCA just doesn't scale well within Nagios.

You will need to try something like the OCP Daemon mentioned here:
http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon

I believe Andreas Ericsson has also written a broker module for NSCA. It
is apparently still in its testing/alpha stages, so you would have to
contact that person directly.

Jonathan



From: Maxwell,Brady [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 23, 2008 8:42 AM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] NSCA and Latency

 

My Environment:

3 x Dell 2950 Dual DualCore and 8 GB of RAM

One system runs checks against our Linux servers

One runs checks against our Windows servers

We are running SLES10 update 3

Both systems use nsca to send their check results to a third server that
displays the service checks for our operators.

All three systems are on the same VLAN but separate Cisco switches.

I am running nsca in daemon mode on the central server with this command:

/usr/local/nagios/bin/nsca -c /usr/local/nagios/etc/nsca.cfg --daemon

Nsca.cfg is as follows:

pid_file=/var/run/nsca.pid
server_port=5667
#server_address=192.168.1.1
nsca_user=nagios
nsca_group=nagios
#nsca_chroot=/var/run/nagios/rw
debug=1
command_file=/usr/local/nagios/var/rw/nagios.cmd
alternate_dump_file=/usr/local/nagios/var/rw/nsca.dump
aggregate_writes=1
append_to_file=1
max_packet_age=300
password=xx
decryption_method=14

 

I just set the aggregate and append options to try to fix the problem.
They were not set before; either way the results are the same.

OK, so on the 2 servers doing the checks everything runs fine, even
with OCSP running my send_service_check_results script. My script is
pretty much straight out of the book.

#!/bin/sh
# Arguments:
# $1 = Hostname of the host (using the $HOSTNAME$ macro)
# $2 = Service description of the service (using the $SERVICEDESC$ macro)
# $3 = Service status id of the service (using the $SERVICESTATUSID$ macro)
# $4 = Output of the Service Check (using the $SERVICEOUTPUT$ macro)
/bin/echo "$1","$2","$3","N3 - $4" |
    /usr/local/nagios/libexec/send_nsca -H 10.10.129.37 \
    -c /usr/local/nagios/etc/send_nsca.cfg -d ","

Like I said, everything is fine on the 2 servers even with OCSP on.
Between the 2 servers we are running about 10k service checks; latency
is very low, just a few seconds. However, if I turn on the NSCA daemon on
the central server, my latency creeps up to about 1500+ seconds within
an hour and just gets worse from there on both remotes. The checks that
should run every 5 minutes on the 2 remote servers end up running every
few hours or less. The central server is doing 0 active checks.

I set debug mode, and that provided very little insight into the
problem.

CPU and memory stats are both very low on all three servers. The same
can be said for the network: utilization is less than 2% and
there are no errors on the interfaces. Overall hardware utilization is
10% or less on these three systems.

So my question is: has anyone had this kind of problem with NSCA? What am
I missing? Should I be batching my service checks on the remote servers?
Should I be using xinetd for NSCA instead of daemon mode?

Thanks

Brady
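
On the batching question, one possible shape is to have the OCSP command only append each result to a local spool file and let a separate periodic job push the whole spool through a single sender invocation. The sketch below is an illustration under assumptions, not the OCP_daemon implementation: SPOOL_FILE, SEND_CMD, spool_result and flush_spool are made-up names, the default sender reuses the path and host from the script above, and it uses tab-delimited lines instead of the -d "," delimiter.

```shell
#!/bin/sh
# Sketch of batching passive results. SPOOL_FILE and SEND_CMD are
# hypothetical names; the default sender reuses the path and host quoted
# in the message above.
: "${SPOOL_FILE:=/tmp/nsca_spool.txt}"
: "${SEND_CMD:=/usr/local/nagios/libexec/send_nsca -H 10.10.129.37 -c /usr/local/nagios/etc/send_nsca.cfg}"

# Meant to be called from the OCSP command: append one tab-delimited result
# and return, so no network round trip happens per check.
spool_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" >> "$SPOOL_FILE"
}

# Meant to be called periodically (cron or a loop): send everything spooled
# so far through one sender invocation.
flush_spool() {
    [ -s "$SPOOL_FILE" ] || return 0
    batch="$SPOOL_FILE.$$"
    # Rename first so results that arrive during the send start a new spool.
    mv "$SPOOL_FILE" "$batch" || return 1
    if $SEND_CMD < "$batch"; then
        rm -f "$batch"
    else
        echo "send failed; batch kept at $batch" >&2
        return 1
    fi
}
```

With SEND_CMD=cat the flush simply prints the batch, which makes it easy to check the line format before pointing it at a real central server.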



Re: [Nagios-users] intermittent CGI failure

2008-09-25 Thread Jonathan Call

-Original Message-
From: Jon Angliss [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 24, 2008 5:52 PM
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] intermittent CGI failure

On Mon, 15 Sep 2008 12:29:55 -0400 (EDT), [EMAIL PROTECTED] wrote:

>Installed 3.0.3 from source on OpenBSD 4.3 (sparc64).  Everything works,
>but every so often the CGI's will fail.
>
>e.g. If I refresh, say, status.cgi?host=all 10 times in a row, it'll fail
>at least once or twice.
>
>I can reproduce using both Apache and nginx.
>
>Here's Apache error log snippet: "Premature end of script headers:
>/var/www/nagios/cgi-bin/status.cgi"
>
>Here's nginx error log snippet: "upstream closed prematurely FastCGI
>stdout while reading response header from upstream"
>
>Familiar issue to anyone?  Next steps to debug?

I keep noticing this every now and again.  I usually have the tactical
overview page open on a third monitor, and Firefox often throws a
"cannot connect" message.  I'm assuming this is probably the same
thing, just on a different page.  I've yet to go through my logs to
confirm though.

-- 
Jon Angliss

I'm running into the same issue:
[Tue Sep 23 04:49:28 2008] [error] [client xxx.xxx.xxx.xxx] Premature end of script headers: status.cgi

My server is FreeBSD 6.3 running Nagios 2.12 and Apache 2.2.8. The error
is very intermittent, maybe two or three times a day on a 24/7
monitoring screen. Sometimes it is a 'cannot connect' message; sometimes
the screen is just blank.

Jonathan


[Nagios-users] Nagios 3 distributed monitoring and NSCA

2008-09-10 Thread Jonathan Call

In Nagios 2.x the Obsessive Compulsive Service Processor (OCSP) is
not very robust. Even with a few hundred service checks, OCSP processing
on the distributed servers bogs down and stops sending results.
This forced people like me to use tools like OCP_daemon.

Has the OCSP infrastructure improved in Nagios 3? I need it to be robust
enough to handle ~2500 service checks.

Jonathan




Re: [Nagios-users] Anyone tried Nagios 3.0.3 on FreeBSD yet?

2008-09-09 Thread Jonathan Call

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Dave
Horsfall
Sent: Monday, September 08, 2008 4:46 PM
To: Nagios Users
Subject: Re: [Nagios-users] Anyone tried Nagios 3.0.3 on FreeBSD yet?

On Mon, 8 Sep 2008, Sean McAfee wrote:

> This has long been in the ports tree as nagios-devel - it was just 
> pending the repocopy of 2.x to nagios2.  I've been running it since b2
> with no issues outside of the libthr one 
> (http://www.freebsd.org/cgi/getmsg.cgi?fetch=662046+0+/usr/local/www/db/text/2008/cvs-ports/20080120.cvs-ports).

Interesting; on FreeBSD 7 (and I think 6) libpthread is symlinked to 
libthr:

lrwxr-xr-x  1 root  wheel   8 Jul 20 16:07 libpthread.a -> libthr.a
lrwxr-xr-x  1 root  wheel   9 Jul 20 16:07 libpthread.so -> libthr.so
lrwxr-xr-x  1 root  wheel  10 Jul 20 16:07 libpthread_p.a -> libthr_p.a

That is the case in FreeBSD 7. But not FreeBSD 6.

> I honestly haven't noticed many changes (outside of cfg_dir recursion 
> working correctly).  On the completely anecdotal side, it does seem to
> be more efficient overall but I think that's related to general 
> improvements on the 3.x branch.

Well, it hasn't hung yet...

Did it ever hang while running Nagios 2? My current FreeBSD 7.0 (amd64)
box has not been able to run Nagios 2.12_1 as smoothly as my FreeBSD 6.3
(i386) box can, and the FreeBSD 7.0 server has significantly fewer
services too. I'm trying to figure out if upgrading might help.

Jonathan


[Nagios-users] Anyone tried Nagios 3.0.3 on FreeBSD yet?

2008-09-08 Thread Jonathan Call

I noticed the port change a few days ago. Anyone tried it?

Does it behave better than Nagios 2 on FreeBSD 7?

Jonathan Call
Network Engineer - NTT/Verio
(801) 437-7476




Re: [Nagios-users] Nagios gets stuck

2008-09-04 Thread Jonathan Call

I am running the default scheduler (SCHED_4BSD) with SMP. 

I have one box running FreeBSD 7.0-amd64 and three others running
FreeBSD 6.3-i386 in a distributed model. The FreeBSD 6.3 boxes had
issues in the past with service checks hanging but once Nagios was
"libmapped" to libthr instead of libpthread those issues went away.

I've been tempted to try that on the amd64 system but I'm waiting for
Nagios to hang/fail again.

Jonathan
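
For anyone wanting to try the same libthr mapping, the sketch below prints a libmap.conf(5) stanza scoped to a single binary. The function name libmap_stanza, the binary path /usr/local/bin/nagios and the ".2" library version are all assumptions for illustration; check what your Nagios binary actually links against with ldd before using the output.

```shell
#!/bin/sh
# Print a libmap.conf(5) stanza that maps libpthread to libthr for one
# binary. The default path and the default ".2" version are assumptions;
# verify both on your own system first.
libmap_stanza() {
    binary="${1:-/usr/local/bin/nagios}"
    sover="${2:-2}"
    # A bracketed path limits the mappings that follow to that executable.
    printf '[%s]\n' "$binary"
    printf 'libpthread.so.%s\tlibthr.so.%s\n' "$sover" "$sover"
    printf 'libpthread.so\tlibthr.so\n'
}
```

Calling libmap_stanza with no arguments prints the stanza for the assumed defaults. Review the output before adding it to /etc/libmap.conf, and restart Nagios afterwards so the mapping takes effect.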

-Original Message-
From: Sean McAfee [mailto:[EMAIL PROTECTED] 
Sent: Thursday, September 04, 2008 8:16 AM
To: Jonathan Call
Cc: Dave Horsfall; Nagios Users
Subject: Re: [Nagios-users] Nagios gets stuck

Jonathan Call wrote:
> Yes I have. And it is very annoying. A service check goes  and
> the thread hangs, which makes Nagios hang. The  service check,
> its thread parent remain as unkillable zombies until the server is
> rebooted.
>
> No one has offered any sort of solution other than "Have you tried
> Nagios 3?" (Which I have not)
>
> Jonathan
Jonathan, are you running BSD as well? 

If so, what scheduler are you using? How about you Dave?

Sean McAfee
System Engineer

Collaborative Fusion, Inc.
 [EMAIL PROTECTED]
 412-422-3463 x 4025

5849 Forbes Avenue
Pittsburgh, PA 15217


Re: [Nagios-users] Nagios gets stuck

2008-09-04 Thread Jonathan Call

Yes I have. And it is very annoying. A service check goes  and
the thread hangs, which makes Nagios hang. The  service check,
its thread parent remain as unkillable zombies until the server is
rebooted.

No one has offered any sort of solution other than "Have you tried
Nagios 3?" (Which I have not)

Jonathan

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Dave
Horsfall
Sent: Wednesday, September 03, 2008 5:40 PM
To: Nagios Users
Subject: [Nagios-users] Nagios gets stuck

Nagios 2.12 with 1.4.11 plugins, on FreeBSD 7.0.

Sometimes Nagios hangs, and does not accept external commands etc; it's 
necessary to "kill -9" the process.  Has anyone else seen this?  Next time 
I'll try and get a coredump of the process.

-- 
Dave Horsfall DTM VK2KFU  Ph: +61 2 9552-5509 (direct) +61 2 9552-5500
(switch)
Corinthian Eng'ng P/L, Ste 54 Jones Bay Whf, 26-32 Pirrama Rd, Pyrmont
2009, AU



Re: [Nagios-users] Freshness checking and a distributed Nagios system.

2008-08-15 Thread Jonathan Call

I did read through that section thoroughly. I set up the check on the
central server and then disabled the check on the distributed server. I
waited for the time limit to expire. I watched the central server set the
next active check time to ~10 minutes after the last check time. But
that time came and went and nothing happened.

Perhaps this is because I have active checks disabled globally (via
nagios.cfg) on my central server instead of on a per host basis?

-----Original Message-----
From: Tom Ammon [mailto:[EMAIL PROTECTED] 
Sent: Friday, August 15, 2008 12:03 PM
To: Jonathan Call
Cc: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] Freshness checking and a distributed Nagios
system.

Nagios forces active checks to be run when used in conjunction with 
freshness checking, even when active checks for that service are 
disabled. The docs describe it pretty well at 
http://nagios.sourceforge.net/docs/2_0/distributed.html under the 
"Freshness Checking" section. You have to read it carefully, though.
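
To make the mechanism concrete, here is a minimal sketch in Python. It assumes the service on the central server has check_freshness and freshness_threshold set, and that its check_command points at a small script that does nothing except report a problem. The function names and the message text are made up for illustration, and is_stale() is only a simplified model of the comparison Nagios performs, not its actual code.

```python
import time

CRITICAL = 2  # standard Nagios plugin exit code for CRITICAL


def is_stale(last_result_time, freshness_threshold, now=None):
    """Simplified model: True when the newest passive result is older
    than freshness_threshold seconds."""
    if now is None:
        now = time.time()
    return (now - last_result_time) > freshness_threshold


def stale_result(service_description):
    """Return the (exit_code, output) pair a hypothetical
    'service-is-stale' check command would report."""
    message = "CRITICAL: no passive result received for %s" % service_description
    return CRITICAL, message
```

A wrapper script would print the message and exit with the code, so the central server shows the service as CRITICAL whenever the distributed server stops reporting.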

Tom

Jonathan Call wrote:
> Correct me if I'm wrong:
> In order to run a distributed system, the central server should have
> active service checks disabled. But freshness checking executes the
> check command when it doesn't receive a passive response in a timely
> manner. This means the freshness check never runs.
>
> How do you get around that?

-- 
-
Tom Ammon
Network Engineer

Business Card at http://tomsbox.net/bizcard_TomAmmon.jpg

Center for High Performance Computing
University of Utah
http://www.chpc.utah.edu


[Nagios-users] Freshness checking and a distributed Nagios system.

2008-08-15 Thread Jonathan Call

Correct me if I'm wrong:
In order to run a distributed system, the central server should have
active service checks disabled. But freshness checking executes the
check command when it doesn't receive a passive response in a timely
manner. This means the freshness check never runs.

How do you get around that?



[Nagios-users] FreeBSD 7 and Nagios 2.12

2008-08-15 Thread Jonathan Call

I'm running FreeBSD 7 (amd64 at that) and Nagios 2.12.

It ran great for about a month. And then today I found that Nagios had
stopped processing checks and there are a few unkillable processes
lingering.

I remember at least one other person posting something similar to this.
Has anyone found a solution?

Jonathan



[Nagios-users] Questions about Nagios Looking Glass

2008-06-24 Thread Jonathan Call

I just installed Nagios Looking Glass to see what it was like. So I
don't know much about it. I have two issues that I need some help with:

Problem: NLG looks horrible in Firefox 2 and Firefox 3.
Instead of the hostnames embedded in a colored 'bubble' it just has the
two ends of the bubble on separate lines followed by the hostname on a
third line.
Metrics have the same issue. They look like this:
[
]

Metric name

Is there a way to fix this so that both IE and Firefox 2&3 work?

Question: Ignoring acknowledged hosts or services.
Can I make NLG ignore hosts or services that have been acknowledged or
that are in Scheduled Downtime? I can do it with Nagios' status.cgi
using the serviceprops=10 and hostprops=10 variables.
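
For reference, the value 10 is a bitmask. The sketch below builds the equivalent status.cgi URL in Python. The bit values are my reading of the Nagios CGI headers (cgiutils.h) and should be checked against your version; the base URL is a placeholder. This covers status.cgi only and says nothing about how NLG filters.

```python
from urllib.parse import urlencode

# Property bits as I understand them from cgiutils.h (assumption, verify locally).
NO_SCHEDULED_DOWNTIME = 2   # exclude objects in scheduled downtime
STATE_UNACKNOWLEDGED = 8    # exclude acknowledged problems


def status_url(base_url, host="all"):
    """Build a status.cgi URL that hides acknowledged and downtimed items."""
    props = NO_SCHEDULED_DOWNTIME | STATE_UNACKNOWLEDGED  # 2 + 8 = 10
    query = urlencode([
        ("host", host),
        ("hostprops", props),
        ("serviceprops", props),
    ])
    return "%s?%s" % (base_url, query)
```

Calling status_url() with the path to your own status.cgi yields a query string ending in host=all&hostprops=10&serviceprops=10.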

Thank you,

Jonathan Call




Re: [Nagios-users] FreeBSD Nagios 2.12

2008-06-23 Thread Jonathan Call

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Marc Powell
> Sent: Sunday, June 22, 2008 10:31 AM
> To: nagios-user Mailinglist
> Subject: Re: [Nagios-users] FreeBSD Nagios 2.12
> 
> 
> On Jun 20, 2008, at 12:16 PM, Andrew D wrote:
> 
> >
> 
> > get_raw_command_line() start
> > Input: check_dns
> > clear_argv_macros() start
> > clear_argv_macros() end
> > find_command() start
> > Output: $USER1$/check_dns -H www.yahoo.com -s $HOSTADDRESS$
> > get_raw_command_line() end
> > process_macros() start
> > process_macros() end
> >
> >
> >
> > And that's where it stops and locks.
> 
> Thanks. After this, nagios makes a call to the event broker,
> determines if the plugin is a perl plugin (to use ePN if enabled),
> then forks the check_command.
> 
> Did you compile with the event broker? Is it enabled? Maybe try a
> compilation without the event broker/embedded perl interpreter.
> 
> There is (was?) a known issue with FreeBSD related to pthreads that
> may be in play (Known Issues -
> http://nagios.sourceforge.net/docs/2_0/whatsnew.html
> , Google for 'nagios freebsd pthreads'). It does specifically relate
> to the forking of check processes and a hang. I do not recall what the
> current status of that issue is but remember chatter about it either
> here or on nagios-devel. Perhaps one of the other FreeBSD users can
> chime in on that. My feeling is that it was fixed or there's a
> workaround but I don't remember specifics.
> 
> --
> Marc
> 

There was a change made to the Nagios port in February where it opted
for libthr instead of libpthread when available. That was supposed to
make permanent/official a workaround that used /etc/libmap.conf to link
them instead:

[nagios]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

You could try using this libmap config or possibly reversing it. It may
be that FreeBSD 5 behaves better with libpthread? I've never used
FreeBSD 5 in any production environment so I'm just guessing.

Jonathan


[Nagios-users] FreeBSD 7.0-RELEASE and nagios 2.10

2008-03-14 Thread Jonathan Call

I did an upgrade of my two test Nagios boxes to FreeBSD 7.0. I rebuilt
all of the ports. I'm also still* using the libmap.conf options:

# Resolve fork/vfork issues with Nagios
[nagios]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

Both of my test boxes now have zombie processes that do not respond to
any kill command:
nagios 46134  0.0  0.0     0     0  ??  Z    11:43PM   0:00.06
nagios 46133  0.0  0.0     0     8  ??  DE   11:43PM   0:00.02 /usr/local/bin/nagios -d /usr/local/etc/nagios/nagios.cfg

Has anyone else run into this issue? 
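
For anyone wanting to watch for this condition, here is a rough Python sketch that picks zombie (Z) and uninterruptible (D) processes out of ps output. It assumes the text comes from `ps -axo pid,stat,command`, header line included; the function name is made up.

```python
def stuck_processes(ps_output):
    """Return (pid, state, command) for processes whose state starts with Z or D.

    Expects the text produced by `ps -axo pid,stat,command`, header included.
    """
    stuck = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)         # pid, state, rest of the line
        if len(fields) < 2:
            continue
        pid, state = fields[0], fields[1]
        command = fields[2] if len(fields) > 2 else ""
        if state[0] in ("Z", "D"):
            stuck.append((int(pid), state, command))
    return stuck
```

Only the first letter of the state column is examined, so a state such as DE is reported as well.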

Jonathan Call

* The port Makefile was updated with the following:

USE_AUTOTOOLS=  autoconf:261 libltdl:15 -- Link with libthr when
available.  This should fix the CPU consumption problem.

So I don't know if the libmap.conf entry is necessary anymore.



[Nagios-users] Is Embedded Perl + Nagios::Plugin worth it?

2008-02-05 Thread Jonathan Call

I've got one server that is getting tanked right now (load average in
the 50s). Is it worth it to rewrite the many Perl scripts I have to use
Embedded Perl and the Nagios::Plugin CPAN module?

I'm speaking in terms of performance and also in terms of future Nagios
releases/compatibility.

Jonathan



[Nagios-users] FreeBSD 6.3 and Nagios

2008-01-29 Thread Jonathan Call

I can't find it right now but someone sent out an email saying they
could not get Nagios 2.10 to run under FreeBSD 6.3. I've upgraded two
systems to FreeBSD 6.3, both are running Nagios 2.10 without any
problems. I didn't even have to recompile the port. I do have the
/etc/libmap.conf entries for libthr though. 

Jonathan




Re: [Nagios-users] nagios-process with 100% CPU after update to Nagios-2.10

2008-01-21 Thread Jonathan Call

I just installed Nagios 2.10 on a FreeBSD 6.3 server. It just has the
localhost config on it right now but it runs without any problems. 

> -----Original Message-----
> From: Bernd Kuhlen [mailto:[EMAIL PROTECTED]
> Sent: Friday, January 18, 2008 9:24 AM
> To: Jonathan Call
> Cc: Michael W. Lucas; nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] nagios-process with 100% CPU after
> updatetoNagios-2.10
> 
> Sorry I didn't, I just wanted to make sure it's something to do with
> 6.3.
> 
> Bernd
> 
> Jonathan Call schrieb:
> > Did you try my libmap suggestion? I'd be surprised to learn that
> > something in FreeBSD 6.3 breaks Nagios. There just aren't that many
> > changes. I'm building 6.3 right now to find out though.
> >
> > Jonathan
> >
> >
> >
> >> -----Original Message-----
> >> From: [EMAIL PROTECTED] [mailto:nagios-users-
> >> [EMAIL PROTECTED] On Behalf Of Michael W. Lucas
> >> Sent: Thursday, January 17, 2008 3:01 PM
> >> To: Bernd Kuhlen
> >> Cc: nagios-users@lists.sourceforge.net
> >> Subject: Re: [Nagios-users] nagios-process with 100% CPU after
> >> updatetoNagios-2.10
> >>
> >> On Thu, Jan 17, 2008 at 10:52:19PM +0100, Bernd Kuhlen wrote:
> >>
> >>> Hi Jonathan
> >>>
> >>> I fixed it by rolling back to FreeBSD6.2, now Nagios is stable again.
> >>>
> >>> HELLO OUT THERE, PLEASE DO NOT TRY TO UPGRADE TO FREEBSD6.3 IF YOU'RE
> >>> RUNNING NAGIOS! AT LEAST NOT AT THE MOMENT.
> >>>
> >>> Seems to be a serious bug.
> >>>
> >> I'd definitely bring this up on the freebsd-stable mailing list, then.
> >>
> >> I'm running 2.10 on 6-stable and 8-current, no troubles.
> >>
> >> ==ml
> >>
> >> --
> >> Michael W. Lucas   [EMAIL PROTECTED], [EMAIL PROTECTED]
> >>   http://www.BlackHelicopters.org/~mwlucas/
> >>   Now Shipping: "Absolute FreeBSD" -- http://www.AbsoluteFreeBSD.com
> >> On 5/4/2007, the TSA kept 3 pairs of my soiled undies "for security
> >> reasons."
> >>
> >>
> >>
> >


Re: [Nagios-users] nagios-process with 100% CPU after update to Nagios-2.10

2008-01-18 Thread Jonathan Call

Did you try my libmap suggestion? I'd be surprised to learn that
something in FreeBSD 6.3 breaks Nagios. There just aren't that many
changes. I'm building 6.3 right now to find out though.

Jonathan

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Michael W. Lucas
> Sent: Thursday, January 17, 2008 3:01 PM
> To: Bernd Kuhlen
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] nagios-process with 100% CPU after
> updatetoNagios-2.10
> 
> On Thu, Jan 17, 2008 at 10:52:19PM +0100, Bernd Kuhlen wrote:
> > Hi Jonathan
> >
> > I fixed it by rolling back to FreeBSD6.2, now Nagios is stable again.
> >
> > HELLO OUT THERE, PLEASE DO NOT TRY TO UPGRADE TO FREEBSD6.3 IF YOU'RE
> > RUNNING NAGIOS! AT LEAST NOT AT THE MOMENT.
> >
> > Seems to be a serious bug.
> 
> I'd definitely bring this up on the freebsd-stable mailing list, then.
> 
> I'm running 2.10 on 6-stable and 8-current, no troubles.
> 
> ==ml
> 
> --
> Michael W. Lucas  [EMAIL PROTECTED], [EMAIL PROTECTED]
>   http://www.BlackHelicopters.org/~mwlucas/
>   Now Shipping: "Absolute FreeBSD" -- http://www.AbsoluteFreeBSD.com
> On 5/4/2007, the TSA kept 3 pairs of my soiled undies "for security
> reasons."
> 

Re: [Nagios-users] nagios-process with 100% CPU after update to Nagios-2.10

2008-01-17 Thread Jonathan Call

Sounds like the fork/vfork issue with FreeBSD's libpthread and Nagios.

The only solution I know of is to add the following to /etc/libmap.conf
and then do a stop/start of Nagios:

[nagios]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

This forces Nagios to use an alternative POSIX threads library instead
of FreeBSD's default thread library.
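
If you manage several boxes, the stanza can be generated rather than typed. The Python sketch below renders it and appends it to existing libmap.conf text only when the section header is not already present. The function names are made up, and the so_version parameter exists because another poster on this list reported needing .so.1 rather than .so.2 on FreeBSD 5.4.

```python
def libmap_stanza(target="nagios", so_version=2):
    """Render a libmap.conf stanza mapping libpthread to libthr for `target`."""
    lines = [
        "[%s]" % target,
        "libpthread.so.%d\tlibthr.so.%d" % (so_version, so_version),
        "libpthread.so\tlibthr.so",
    ]
    return "\n".join(lines) + "\n"


def ensure_stanza(libmap_text, target="nagios", so_version=2):
    """Return libmap_text with the stanza appended, unless its header exists."""
    if "[%s]" % target in libmap_text:
        return libmap_text  # already configured, leave the file alone
    needs_newline = bool(libmap_text) and not libmap_text.endswith("\n")
    separator = "\n" if needs_newline else ""
    return libmap_text + separator + libmap_stanza(target, so_version)
```

Writing the result back to /etc/libmap.conf and restarting Nagios are left to the caller; the mapping only applies to processes started after the change.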

Jonathan

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Bernd Kuhlen
> Sent: Thursday, January 17, 2008 5:34 AM
> To: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] nagios-process with 100% CPU after update
> toNagios-2.10
> 
> Hi Bernd
> 
> oops, I sort of forgot the main thing. The actual problem:
> 
> Whenever this process occurs it's like a denial-of-service attack. No
> checks are performed whatsoever. The only workaround (that I know) is to
> have a cronjob running once per minute finding and killing these jobs to
> make Nagios run properly again.
> 
> 
> - Bernd Kuhlen (bkuhlen)
> 
> ---
> This thread is located in the archive at this URL:
> http://www.nagiosexchange.org/nagios-
> users.34.0.html?&tx_maillisttofaq_pi1[showUid]=8408
> 
> 

Re: [Nagios-users] Processes hanging - Nagios 3rc1 on FreeBSD

2007-12-28 Thread Jonathan Call

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Alex French
> Sent: Friday, December 28, 2007 9:00 AM
> To: Chris Haulmark; nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Processes hanging - Nagios 3rc1 on FreeBSD
> 
> On 28/12/2007, Chris Haulmark <[EMAIL PROTECTED]> wrote:
> 
> >
> > I ran into this same problem with the beta versions.  I created a file
> > as the workaround.  This bug has been reported but apparently is lost
> > in the email archives with no official bug support.
> >
> > try this:
> >
> > add these lines to /etc/libmap.conf
> > [/usr/local/bin/nagios]
> > libpthread.so.2   libthr.so.2
> > libpthread.so     libthr.so
> 
> Yes, this fixed the problem for me (although in FreeBSD 5.4 I needed
> to use .so.1 rather than .so.2).
> 
> Thanks very much for your assistance, I probably wouldn't have figured
> that one out :-)
> 
> Is there any mechanism to report this as a bug?
> 
> Alex
> 

The FreeBSD port maintainer is already aware of the problems of Nagios2
vs. libpthread. The last time I talked with him (which was a long time
ago) he was debating if he wanted to add an option to link the Nagios
build directly to libthr or just put in a warning in the post install
text. I think the pending FreeBSD 7.0 release sort of left this decision
in limbo.

Jonathan


Re: [Nagios-users] How to monitor temp on Cisco 7200's?

2007-10-31 Thread Jonathan Call

Hello Patrick 

The Cisco 3560 and 3550 are very different from the Cisco 7200. You
cannot get an actual temperature value from them, just a temperature
state.

You'll need to use this SNMP Table:
ciscoEnvMonTemperatureState   .1.3.6.1.4.1.9.9.13.1.3.1.6
Possible states are: 
1:normal 2:warning 3:critical 4:shutdown 5:notPresent 6:notFunctioning

The 3560 and 3550 only have one thermal sensor, so the OID to poll is
.1.3.6.1.4.1.9.9.13.1.3.1.6.1
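
A plugin built on this only has to translate the state integer into a Nagios exit code. Here is a minimal Python sketch of that translation. The state names come from the list above; the mapping to exit codes is my own assumption (shutdown treated as CRITICAL, notPresent and notFunctioning as UNKNOWN), and fetching the value over SNMP is left out.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes
LABELS = ("OK", "WARNING", "CRITICAL", "UNKNOWN")

# ciscoEnvMonTemperatureState values as listed above.
STATE_NAMES = {
    1: "normal", 2: "warning", 3: "critical",
    4: "shutdown", 5: "notPresent", 6: "notFunctioning",
}

# Assumed mapping from sensor state to Nagios exit code; adjust to taste.
STATE_TO_EXIT = {1: OK, 2: WARNING, 3: CRITICAL, 4: CRITICAL, 5: UNKNOWN, 6: UNKNOWN}


def temperature_status(state):
    """Translate a temperature state integer into (exit_code, plugin_output)."""
    name = STATE_NAMES.get(state)
    if name is None:
        return UNKNOWN, "UNKNOWN: unexpected temperature state %r" % (state,)
    code = STATE_TO_EXIT[state]
    return code, "%s: temperature state is %s (%d)" % (LABELS[code], name, state)
```

The integer would come from an SNMP GET on the OID above, for example via snmpget or whichever SNMP library you already use.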

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Patrick M.
> Sent: Wednesday, October 31, 2007 10:21 AM
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] How to monitor temp on Cisco 7200's?
> 
> Hi all,
> 
> I was wondering if anyone had any recommendations on what plugins to use
> to check temperature for a Cisco 7200 router.  If possible I'd like to
> monitor temp on our other devices like our 3560 Catalyst switches as
> well as our 3550's.
> 
> I tried looking on NagiosExchange but the plugins I found didn't monitor
> temp for these devices.  Has anyone else had any luck?
> 
> Thanks in advance.
> 
> Patrick
> 



Re: [Nagios-users] Another Nagios Problem.

2007-10-17 Thread Jonathan Call

Are you aware of the fork/vfork issue between Nagios and the FreeBSD
pthread library? This may be causing your problem.

Try using these /etc/libmap.conf entries:
[nagios]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

You will need to restart Nagios for the settings to take effect.


> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Fulton, David
> Sent: Wednesday, October 17, 2007 12:43 PM
> To: Hugo van der Kooij
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Another Nagios Problem.
> 
> I never said I wouldn't supply the coders with what they need. I would
> expect that those who coded it could point me in the right direction. I
> have thus far tried changing how I get my data into perfparse and
> changing my timeperiods so that there are no overlapping times (i.e.
> from 00:00 - 24:00 to 00:00-23:59). Since it always happens overnight and
> the problem crops up after midnight, I have to wait until then to get
> more data. Other than that I am using FreeBSD 6.2 with the latest
> plugins (1.4.10), nrpe, nsca and perfparse (the performance data is sent
> to files via a command definition that calls a perl script that writes
> it to a file. Perfparse picks it up via a cron job that runs every 5
> minutes).
> 
> The purpose of the nagios-users list is to obtain help when one gets
> stuck, not to have someone tell them how they should be able to do their
> own support. I have set up a complex piece of software and have been
> running it since version 3.0b1. To my knowledge, there are only so many
> sources of information that I could provide. Nagios doesn't stop,
> doesn't run a particular command. It simply starts orphaning check
> results after midnight every day. Turning on debugging does not give any
> indication as to why. If I truss (strace) the process it immediately
> spawns a new copy of itself that consumes all CPU time on whatever CPU
> it is running on without returning anything.
> 

(snipped for brevity)


Re: [Nagios-users] Distributed monitoring Freshness checking failing then recovering

2007-10-16 Thread Jonathan Call

Sean;

I have a very large deployment so I use this tool:

http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon

This daemon runs on each of the distributed servers while a normal nsca
daemon listens on the central server.
 
Jonathan
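
To illustrate the batching idea, here is a rough sketch (not the OCP_Daemon code itself) that formats service results the way send_nsca reads them on standard input, one tab-separated line per result, and groups them into batches. The host names, service names and batch size are made up for the example.

```python
def format_service_result(host, service, return_code, output):
    """Build one send_nsca input line: host, service, code, output, tab-separated."""
    # Tabs and newlines inside the plugin output would break the framing.
    clean = output.replace("\t", " ").replace("\n", " ")
    return "%s\t%s\t%d\t%s\n" % (host, service, return_code, clean)


def batches(lines, size):
    """Yield successive chunks of at most `size` lines, each as one string."""
    for start in range(0, len(lines), size):
        yield "".join(lines[start:start + size])


# Hypothetical check results: (host, service, return code, plugin output).
results = [
    ("web1", "HTTP", 0, "HTTP OK"),
    ("web1", "Load", 1, "WARNING - load 5.2"),
    ("db1", "MySQL", 2, "CRITICAL - no response"),
]

lines = [format_service_result(*r) for r in results]
payloads = list(batches(lines, 2))
print(len(payloads))
```

Each payload could then be written to the standard input of a single send_nsca process (for example send_nsca -H <central-server> -c <config-file>), so one process carries many results instead of one process per check.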

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Sean McAvoy
> Sent: Monday, October 15, 2007 12:09 PM
> To: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Distributed monitoring Freshness
> checkingfailing then recovering
> 
> On further investigations it looks as though the problem is with the
> time taken to submit the results back to nagios via send_nsca.
> I have read about a couple different options for getting results back
> quickly. One being a bulk system of transfer, a file containing the
> results is sent via a send_nsca bulk transfer executed via cron. The
> other being a system that makes use of the performance data output
> option on the remote nagios systems and submits the results using a
> custom daemon on both ends.
> Does anybody know of any other options? Also, is there any guides to
> setting up either of these options, most of what I have read is email
> threads..
> Thanks.
> 
> On 12-Oct-07, at 12:40 PM, Sean McAvoy wrote:
> 
> > Hello,
> > I have 1 central nagios system with 5 distributed servers. I have
> > enabled freshness checking on both central and remote systems. I am
> > constantly seeing services go to unknown status for 1-3 minutes and
> > then recover.
> > on the remotes I have:
> > check_service_freshness=1
> > service_freshness_check_interval=10
> > check_host_freshness=1
> > host_freshness_check_interval=60
> > service_inter_check_delay_method=s
> > max_service_check_spread=10
> > service_interleave_factor=1
> > host_inter_check_delay_method=s
> > max_host_check_spread=30
> > max_concurrent_checks=0
> >
> > It does appear as though checks are being run in parallel. I'm
> > wondering how I can best determine where the problem is, with the
> > execution of checks, submittal to the central system or other.
> > Thanks.
> >
> >
> > _sean
> 
> Sean McAvoy
> NOC Acting Team Lead
> Afilias Canada
> 
> P. 416.673.4194


[Nagios-users] Config management - Reinventing the wheel

2007-06-22 Thread Jonathan Call

I currently use nagiosweb (http://sourceforge.net/projects/nagiosweb/)
to maintain a Nagios configuration for a central server in mysql. Based
on certain host groups, I want to generate configuration files for that
central server's distributed Nagios servers.

Has anyone written code (for example, perl) to generate distributed
Nagios 2.x configuration files based on a central Nagios server's
configuration that is stored in a mysql database? It doesn't have to be
nagiosweb. I believe that any db style would be easy enough to change to
make it work.
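
To make the request concrete, here is a minimal sketch of the kind of generator I have in mind. It assumes the host rows have already been fetched from the database into plain dictionaries; the field names, the "generic-host" template and the host groups are hypothetical and are not the nagiosweb schema.

```python
def render_host(host):
    """Render one Nagios host object definition from a dictionary."""
    lines = ["define host{"]
    for key in ("use", "host_name", "alias", "address"):
        if key in host:
            lines.append("    %-20s %s" % (key, host[key]))
    lines.append("    }")
    return "\n".join(lines)


def hosts_for_group(hosts, group):
    """Select the hosts that belong to the given host group."""
    return [h for h in hosts if group in h.get("hostgroups", ())]


def render_config(hosts, group):
    """Build the host section of a distributed server's config file."""
    return "\n\n".join(render_host(h) for h in hosts_for_group(hosts, group)) + "\n"


# Hypothetical rows, as they might come back from a database query.
HOSTS = [
    {"use": "generic-host", "host_name": "web1", "alias": "Web 1",
     "address": "192.0.2.10", "hostgroups": ["site-a"]},
    {"use": "generic-host", "host_name": "db1", "alias": "DB 1",
     "address": "192.0.2.20", "hostgroups": ["site-b"]},
]

print(render_config(HOSTS, "site-a"))
```

Services, contacts and commands would need the same treatment; the point is only that each distributed server gets the subset selected by its host group.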

I thought I would ask to see if I could avoid reinventing the wheel.

Jonathan


Re: [Nagios-users] Problems with FreeBSD and Nagios

2007-06-20 Thread Jonathan Call

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Douglas K. Rand
> Sent: Tuesday, June 19, 2007 3:16 PM
> To: Kyle Sexton
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Problems with FreeBSD and Nagios
> 
> Doug> The following entry in /etc/libmap.conf has, for us, solved the
> Doug> issue of run away Nagios processes.
> 
> Doug> [nagios]
> Doug> libpthread.so.2 libthr.so.2
> Doug> libpthread.so   libthr.so
> 
> Doug> This is on FreeBSD 6.2.
> 
> Kyle> Was there a recompile or anything necessary?
> 
> No. You do have to stop and restart the nagios process after the
> edit. A restart via the web interface is not sufficient. libmap.conf
> is a runtime configuration.

It's been about 24 hours since I implemented this dependency mapping on
one of my more heavily used FreeBSD 6.2/Nagios 2.9 servers. I have not
had any problems with child processes and my load average actually
dropped from around 7.5 to 4.

I'll give it a week or two before I declare it a complete success, but
it has been great so far!

Jonathan


Re: [Nagios-users] Problems with FreeBSD and Nagios

2007-06-19 Thread Jonathan Call



> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Michael W. Lucas
> Sent: Tuesday, June 19, 2007 5:16 AM
> To: Kyle Sexton
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Problems with FreeBSD and Nagios
> 
> On Mon, Jun 18, 2007 at 06:42:18PM -0500, Kyle Sexton wrote:
> > On 12/14/06, Andreas Ericsson <[EMAIL PROTECTED]> wrote:
> > > Jonathan Call wrote:
> > > >
> > > > Given your ideas and some google work I seem to have found my
> > > > problem:
> > > >
> > > > http://lists.freebsd.org/pipermail/freebsd-hackers/2005-August/013247.html
> > > >
> > > > Not a pretty discussion. :(
> > > >
> > >
> > > Nope. Definitely not.
> > >
> > > The problem for Nagios is that threading was added after the fact so
> > > nagios actually breaks some of the *strong* recommendations on what
> > > to do and what not to do in a threaded application after a fork().
> > >
> > > The problem for *BSD and their thread implementation of the thread
> > > library is that Nagios actually works everywhere but on *BSD, and it
> > > *often* works there too, but not always. This "often-but-not-always"
> > > is usually a sign of a broken implementation, although exactly
> > > "often-but-not-always" is a sign of the errors you'll run into when
> > > you do what Nagios does post-fork().
> > >
> > > I don't know of any other program that has the same problem on *BSD,
> > > but it would be interesting to see if there's a common pattern so one
> > > can pinpoint the exact pattern that causes the lock contention and
> > > races. It would, from a practical point of view, be best to patch it
> > > in the library, as that is a fix that would work for all possible
> > > future problems as well, although it's technically more correct to
> > > fix it in Nagios.
> > >
> > > Ugly discussion indeed.
> > >
> > >
> > > > I'll try using a non SMP kernel to see if it might help. If it
> > > > doesn't this pretty much renders Nagios useless on FreeBSD. (Which
> > > > makes me wonder why they even bother maintaining it in ports?)
> > > >
> > >
> > > Out of curiosity, do you use passive checks, active checks or a mix
> > > of both in your setup?
> > Was there ever a solution found to this problem?

No. 
I was forced to implement a distributed model and limit the service
checks to less than 1000 on a server. Even then I still have to run a
cron job that checks for nagios children that are spinning on the CPU as
a result of this fork issue.
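
The detection half of that cron job can be sketched roughly as follows. This is an illustration, not my actual script: it assumes ps-style text with PID, accumulated CPU time (minutes:seconds) and command columns, and the sample rows and the 10-minute threshold are made up, loosely modelled on the top output from the original report.

```python
def cpu_minutes(field):
    """Convert a 'minutes:seconds' CPU time field to minutes."""
    minutes, seconds = field.split(":")
    return int(minutes) + float(seconds) / 60.0


def find_spinning(ps_text, main_pid, threshold_minutes, command="nagios"):
    """Return PIDs of child processes whose CPU time exceeds the threshold."""
    spinning = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        pid, cpu_time, comm = int(fields[0]), fields[1], fields[2]
        if comm != command or pid == main_pid:
            continue  # ignore other programs and the main daemon
        if cpu_minutes(cpu_time) >= threshold_minutes:
            spinning.append(pid)
    return spinning


# Hypothetical process listing; 75338 is the main nagios process.
SAMPLE_PS = """\
  PID    TIME COMMAND
94068  727:37 nagios
94082  734:28 nagios
94104  845:21 nagios
75338   91:33 nagios
  812    0:05 sshd
"""

print(find_spinning(SAMPLE_PS, main_pid=75338, threshold_minutes=10))
```

A healthy check child lives for seconds, so any nagios child with minutes of CPU time is a candidate for being killed; what to do with the returned PIDs is left to the caller.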

I've found that somewhere above 1500 service checks there will be a
random weekly event that causes almost a hundred nagios checks to hit
this fork issue all at the same time and promptly tank the FreeBSD
server.

> 
> Skimming the (long) discussion thread, my first thought is to try
> libthr instead of libkse.  The discussion seems to be on 5.x, I'd
> definitely try libthr on 6.x.  Check libmap.conf for details.

Are you referring to this type of mapping within /etc/libmap.conf?

[/usr/local/bin/nagios]
 libpthread.so.2 libthr.so.2
 libpthread.so   libthr.so

If so I'd be willing to try it on my FreeBSD 6.2 server.

Jonathan


[Nagios-users] Host/Ping check and Nagios performance.

2007-06-07 Thread Jonathan Call

Is there any reason why Nagios stops running all service checks while it
executes check-host-alive/ping on hosts? Can I change that? I cannot
find a setting to do it.

With the large number of service checks I'm running (1300+) whenever a
host goes down (or in some cases just stops answering ICMP) it kills
performance on the Nagios server.


Jonathan Call



Re: [Nagios-users] Passive monitoring is running slow?

2007-05-02 Thread Jonathan Call



> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Marc Powell
> Sent: Wednesday, May 02, 2007 3:39 PM
> To: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Passive monitoring is running slow?
> 
> 
> 
> > -Original Message-
> > From: [EMAIL PROTECTED]
[mailto:nagios-users-
> > [EMAIL PROTECTED] On Behalf Of Jonathan Call
> > Sent: Wednesday, May 02, 2007 10:07 AM
> > To: nagios-users@lists.sourceforge.net
> > Subject: Re: [Nagios-users] Passive monitoring is running slow?
> >
> >
> >
> > > -Original Message-
> > > From: Thomas Guyot-Sionnest [mailto:[EMAIL PROTECTED]
> > > Sent: Tuesday, May 01, 2007 4:29 PM
> > > To: Jonathan Call
> > > Cc: nagios-users@lists.sourceforge.net
> > > Subject: Re: [Nagios-users] Passive monitoring is running slow?
> > >
> > > On 01/05/07 05:15 PM, Jonathan Call wrote:
> > > > I have set up a distributed monitoring system per the Nagios
> > > > documentation.
> > > >
> > > > I initially tested it out by having the distributed server monitor
> > > > only 24 or so services on about 8 hosts. There didn't seem to be any
> > > > problems.
> > > >
> > > > I then cranked it up to 427 services on 81 hosts. I'm watching the
> > > > distributed server right now and there is hardly any system load but
> > > > the Service Check Latency seems extremely high:
> > > >
> > > > Metric                  Min.        Max.         Average
> > > > Check Execution Time:   0.05 sec    1.67 sec     0.701 sec
> > > > Check Latency:          60.40 sec   287.36 sec   184.514 sec
> > > > Percent State Change:   0.00%       0.00%        0.00%
> > > >
> > > > This is resulting in 50% or less of the service checks completing in
> > > > the 5 minutes or less timeframe.
> > > >
> 
> 
> > So this is a known design failure in Nagios then? I'm fairly new to
> 
> Absolutely not.
> 
> > Nagios and I am completely dumbfounded at this. If you can't service
> > even a quarter (and probably even a tenth) of the amount of hosts and
> > services on a distributed server that you can on a regular active
> > server then what is the point of having a distributed model at all?
> 
> I have 5 data collector machines running nagios
> -and- cricket for thousands of services each with nagios reporting all
> results back to two central hosts as documented. Average latency is
> 0.689 seconds and Max of 3.65 seconds right now. The distributed server
> should be performing exactly like a regular active server as far as
> latency stats are concerned. You're either starving nagios for
> resources needed to run its active checks (run ~nagios/bin/nagios -s
> ~nagios/etc/nagios.cfg to see recommended settings) or, less likely,
> something is wrong with your submit-check-result. If you submit a
> result from the command line, does it complete in a timely manner? If
> you disable OCSP does the latency go away? Basic troubleshooting
> dictates you should try methodically enabling features on your
> distributed machine to turn it from an active-only server to active
> submitting check results via OCSP.
> 
> Disable OCSP program-wide (nagios.cfg)
> Test

With OCSP disabled service check latency is under half a second.

> Enable OCSP but have your OCSP script do everything except call
> send_nsca
> Test

With the send_nsca line commented out (basically calling an empty shell
script) service check latency is under half a second as well.

> Enable send_nsca in your OCSP script.
> Test

Service Latency times spike again.
Watching top for a few minutes reveals a LOT of send_nsca processes
being spawned but few checks actually running. Of course the SNMP checks
themselves run very quickly but there always seems to be a send_nsca
client running.  Not the same one either, always a different PID.

I timed the script itself (copied right off the Nagios documentation
website) and it executes in a timely manner as well:
0.000u 0.009s 0:00.71 0.0%  0+0k 0+0io 0pf+0w
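
A back-of-the-envelope calculation suggests why 0.71 seconds per run is not as timely as it looks. This sketch assumes each result submission runs one after another and holds up further check processing while send_nsca runs, and that checks are on a 5-minute interval; both are assumptions on my part, the service count and elapsed time are the figures above.

```python
def serial_submit_seconds(service_count, seconds_per_submit):
    """Total wall-clock time if every result is submitted one after another."""
    return service_count * seconds_per_submit


SERVICES = 427             # services on the distributed server
SECONDS_PER_SUBMIT = 0.71  # elapsed time of one script run, from the timing line
CHECK_INTERVAL = 300       # assumed 5-minute check interval, in seconds

total = serial_submit_seconds(SERVICES, SECONDS_PER_SUBMIT)
utilisation = total / CHECK_INTERVAL

print("serial submission time: %.2f s" % total)
print("share of the check interval: %.0f%%" % (utilisation * 100))
```

Under those assumptions the submissions alone take about 303 seconds, slightly more than the whole 300-second interval, which would be consistent with checks falling further and further behind schedule.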

> 
> 
> Do you have regular host checks enabled? Post the output of nagios -v
> and nagios -s.

Scheduled host checks are not enabled.

Nagios -v output:
Nagios 2.9
Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org)
Last Modified: 04-10-2007
License: GPL

Reading configuration data...

Running pre-flight check on configuration data...

Checking services...
Checked 427 services.
Checking hosts..

Re: [Nagios-users] Passive monitoring is running slow?

2007-05-02 Thread Jonathan Call

> -Original Message-
> From: Thomas Guyot-Sionnest [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, May 01, 2007 4:29 PM
> To: Jonathan Call
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Passive monitoring is running slow?
> 
> On 01/05/07 05:15 PM, Jonathan Call wrote:
> > I have set up a distributed monitoring system per the Nagios
> > documentation.
> >
> > I initially tested it out by having the distributed server monitor
> > only 24 or so services on about 8 hosts. There didn't seem to be any
> > problems.
> >
> > I then cranked it up to 427 services on 81 hosts. I'm watching the
> > distributed server right now and there is hardly any system load but
> > the Service Check Latency seems extremely high:
> >
> > Metric                  Min.        Max.         Average
> > Check Execution Time:   0.05 sec    1.67 sec     0.701 sec
> > Check Latency:          60.40 sec   287.36 sec   184.514 sec
> > Percent State Change:   0.00%       0.00%        0.00%
> >
> > This is resulting in 50% or less of the service checks completing in
> > the 5 minutes or less timeframe.
> >
> > The Central server has had no significant change in performance at
> > all and seems to be receiving and processing everything without
> > difficulty.
> >
> > The nsca server on the central server is running with the following
> > arguments:
> > /usr/local/sbin/nsca --daemon -c /usr/local/etc/nsca.cfg
> >
> > The submit_check_result script on the distributed server is right out
> > of the documentation.
> 
> There are many ways to do that; my favorite (obviously since I wrote it
> :) ) is using the host and service performance data files as named
> pipes, and having a daemon reaping them and batch-sending data to
> send_nsca..
> 
> The howto is here (and I'll be more than happy to answer your questions
> or get your feedback):
> 
> http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon
> 
> It will require Libevent and the Perl module Event::Lib.
> 
> Thomas

So this is a known design failure in Nagios then? I'm fairly new to
Nagios and I am completely dumbfounded at this. If you can't service
even a quarter (and probably even a tenth) of the amount of hosts and
services on a distributed server that you can on a regular active server
then what is the point of having a distributed model at all?

I will take a look at your batch sending method.


[Nagios-users] Passive monitoring is running slow?

2007-05-01 Thread Jonathan Call

I have set up a distributed monitoring system per the Nagios documentation.

I initially tested it out by having the distributed server monitor only 24 or 
so services on about 8 hosts. There didn't seem to be any problems.

I then cranked it up to 427 services on 81 hosts. I'm watching the distributed 
server right now and there is hardly any system load but the Service Check 
Latency seems extremely high:

Metric                  Min.        Max.         Average
Check Execution Time:   0.05 sec    1.67 sec     0.701 sec
Check Latency:          60.40 sec   287.36 sec   184.514 sec
Percent State Change:   0.00%       0.00%        0.00%

This is resulting in 50% or less of the service checks completing in the 5 
minutes or less timeframe.

The Central server has had no significant change in performance at all and 
seems to be receiving and processing everything without difficulty.

The nsca server on the central server is running with the following arguments:
/usr/local/sbin/nsca --daemon -c /usr/local/etc/nsca.cfg

The submit_check_result script on the distributed server is right out of the 
documentation.

Encryption within nsca has been reduced to simple XOR with a password.

Is there any way to optimize send_nsca, or is a Service Check Latency 
that high not a big deal? 

Jonathan


[Nagios-users] Breakdown of status.cgi?

2007-03-23 Thread Jonathan Call

Is there some documentation somewhere that breaks down the possible
variables and options available to status.cgi?

For example, what are all the possible binary operands for
servicestatustypes or style?

I'm trying to create a Current Network Status view that will be more
appropriate for NOC people than the Tactical Overview page.
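
For context, this is the sort of link I am trying to build. The sketch below treats servicestatustypes as a bitmask, using the flag values as I understand them from the Nagios CGI source (pending 1, OK 2, warning 4, unknown 8, critical 16); treat those values as an assumption and check them against your own version. The base URL is hypothetical.

```python
from urllib.parse import urlencode

# Assumed flag values for servicestatustypes; verify against your Nagios version.
SERVICE_PENDING = 1
SERVICE_OK = 2
SERVICE_WARNING = 4
SERVICE_UNKNOWN = 8
SERVICE_CRITICAL = 16


def combine(*flags):
    """OR the given status flags together into one bitmask."""
    mask = 0
    for flag in flags:
        mask |= flag
    return mask


def status_url(base, host, service_mask):
    """Build a status.cgi link that filters services by status bitmask."""
    query = urlencode({"host": host, "servicestatustypes": service_mask})
    return base + "?" + query


# Hypothetical location of the CGI on the monitoring server.
BASE = "http://nagios.example.com/nagios/cgi-bin/status.cgi"

problems = combine(SERVICE_WARNING, SERVICE_UNKNOWN, SERVICE_CRITICAL)
url = status_url(BASE, "all", problems)
print(url)
```

A NOC view would then be a handful of such links, one per combination of states worth watching.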


Jonathan


[Nagios-users] Preferred configuration utility?

2007-01-19 Thread Jonathan Call

Looking over nagiosexchange I see several web based configuration
management utilities. 

I've been using NagiosWeb; it's simple, somewhat immature, but
effective. It lacks the ability to import configurations, which has
become more important now that I'm moving my Nagios deployment to a
distributed model.

I looked at Fruity, but the forums for it have pointed out a showstopper
problem with importing and exporting configurations (especially
contacts).

Monarch has also been mentioned but their website appears to be
non-functional on sourceforge right now.

Has anyone found one that stands out among the others?

Jonathan Call
Network Engineer - NTT/Verio



Re: [Nagios-users] Problems with FreeBSD and Nagios

2006-12-14 Thread Jonathan Call

nagios# gdb --pid=$74056
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
"/var/spool/nagios/rw/" is not a core dump: File format not recognized
(gdb) bt
No stack.
(gdb)

Given your ideas and some google work I seem to have found my problem: 

http://lists.freebsd.org/pipermail/freebsd-hackers/2005-August/013247.html

Not a pretty discussion. :(

I'll try using a non SMP kernel to see if it might help. If it doesn't this
pretty much renders Nagios useless on FreeBSD. (Which makes me wonder
why they even bother maintaining it in ports?)


> -Original Message-
> From: Andreas Ericsson [mailto:[EMAIL PROTECTED]
> Sent: Thursday, December 14, 2006 2:26 AM
> To: Jonathan Call
> Cc: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Problems with FreeBSD and Nagios
> 
> Jonathan Call wrote:
> > I scanned the mailing list trying to find a solution for this. I found
> > a brief discussion where someone had the same problem but there was
> > nothing really discussed what was potentially wrong.
> >
> > My system:
> > Dual 2.8GHz P4 processors
> > 4GB of RAM
> > FreeBSD 6.1-RELEASE-p10
> >
> > Running processes:
> > Nagios 2.6 (installed from ports without embedded perl or nanosleep)
> > One mysqld process for the nagiosweb utility
> > A few NSCA daemon processes for passive checking
> > A backup tool daemon
> > Apache+modssl (latest from ports)
> > Basic FreeBSD services (sshd, sendmail, etc.)
> >
> > Problem:
> > Random service and host check control processes will lock up and
> > 'spin' on the CPU. This is really bad when a host check does it
> > because it brings all checks to a halt. It doesn't seem to even
> > notice that all checks have gone stale.
> >
> > It will look like this in top:
> >
> >   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> > 94068 nagios      1 116    0  7500K  6748K CPU2   0 727:37 30.15% nagios
> > 94082 nagios      1 116    0  7500K  6748K CPU2   0 734:28 32.55% nagios
> > 94104 nagios      1 116    0  7500K  6748K CPU2   0 845:21 37.42% nagios
> > 75338 nagios      5  20    0  7500K  6776K kserel 0  91:33  0.00% nagios
> >
> > In this example the main nagios pid is 75338. The hung service and/or
> > host processes are the other ones.
> >
> > The service checks are almost entirely custom scripts, but the host
> > check is a standard check_ping that comes with the nagios program.
> >
> > Any ideas on how to figure out which service or host check is hung?
> > Or how to deal with this problem at all?
> >
> 
> Host and service checks going into infinite loops wouldn't show up as
> Nagios processes in CPU spinlock, as the nagios check execution
> children just sit around and wait for the child to finish (or 60
> seconds to pass in default config, before it kills it off).
> 
> You've found a bug in Nagios which most likely was either introduced
> in the port of it, or is a result of library differences between
> FreeBSD and Linux.
> 
> I wouldn't be all too surprised if it turns out that the FreeBSD
> pthread implementation disallows something that the Linux version
> allows. Note that this doesn't necessarily have to be a bug; Nagios
> doesn't use the pthread ABI in a way that is explicitly stated as safe,
> but the pthread implementation on Linux and most other unices are
> forgiving enough to make it work anyway.
> 
> It's also possible that this bug only triggers on dual-CPU systems
> with a particular library installed, as some kinds of timing and
> race-conditions just don't happen on single-CPU systems.
> 
> What happens if you do
> 
> $ gdb --pid=$(pidof spinning-nagios-process)
> (gdb) bt
> 
> ?
> 
> --
> Andreas Ericsson   [EMAIL PROTECTED]
> OP5 AB www.op5.se
> Tel: +46 8-230225  Fax: +46 8-230231


[Nagios-users] Problems with FreeBSD and Nagios

2006-12-13 Thread Jonathan Call

I scanned the mailing list trying to find a solution for this. I found a
brief discussion where someone had the same problem, but it never really
covered what was potentially wrong.

My system: 
Dual 2.8GHz P4 processors
4GB of RAM
FreeBSD 6.1-RELEASE-p10

Running processes:
Nagios 2.6 (installed from ports without embedded perl or nanosleep)
One mysqld process for the nagiosweb utility
A few NSCA daemon processes for passive checking
A backup tool daemon
Apache+modssl (latest from ports)
Basic FreeBSD services (sshd, sendmail, etc.)

Problem:
Random service and host check control processes will lock up and 'spin'
on the CPU. This is really bad when a host check does it because it
brings all checks to a halt. It doesn't seem to even notice that all
checks have gone stale.

It will look like this in top:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
94068 nagios      1 116    0  7500K  6748K CPU2   0 727:37 30.15% nagios
94082 nagios      1 116    0  7500K  6748K CPU2   0 734:28 32.55% nagios
94104 nagios      1 116    0  7500K  6748K CPU2   0 845:21 37.42% nagios
75338 nagios      5  20    0  7500K  6776K kserel 0  91:33  0.00% nagios

In this example the main nagios pid is 75338. The hung service and/or
host processes are the other ones.

The service checks are almost entirely custom scripts, but the host
check is a standard check_ping that comes with the nagios program.

Any ideas on how to figure out which service or host check is hung? Or
how to deal with this problem at all?

Jonathan Call
Network Engineer - NTT/Verio
801.437.7476



