Re: [Nagios-users] graphing trends across hosts or services instead of a timeseries

2009-06-26 Thread Rahul Nabar
On Wed, Mar 4, 2009 at 3:33 PM, Marco Tirado marco.tir...@gmail.com wrote:
 PNP4Nagios has a feature called pages that allows you to show multiple
 services for the same host or multiple hosts for the same service. It should
 be easy to use since it supports regular expressions. Check the following
 link

 http://www.pnp4nagios.org/pnp/pages

I can't seem to get the pages feature of pnp4nagios working
correctly to get an assembly of charts.
I just tried creating a page by modifying the canned file
web_traffic.cfg-sample in the directory /nagios/etc/pnp/pages/
like so:

filename: cpu_temperatures.cfg #
define page {
    use_regex 1
    page_name Compute Nodes Current CPU Core Temperatures
}
define graph {
    host_name star177,star178,star179
    service_desc CpuCoreTemperature
}


I still do not see the page. Where should I be looking for it in the
web interface?

I already have a corresponding entry in the services.cfg file:
define service{
    use rpn_intermediate_service
    hostgroup_name 64bit-compute-nodes
    service_description CpuCoreTemperature
    check_command check_nrpe!check_cpu_temp
    use srv-pnp-rpn-intermediate
}

The permissions and ownership of cpu_temperatures.cfg also seem
correct. What else could I be messing up? Any advice?
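
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: with use_regex 1 the host_name value is presumably read as a single regular expression, in which case the comma-separated list above would have to match a host name literally, commas included. A minimal Python sketch of that behaviour follows. Python's re module only stands in for whatever regex engine PNP uses, and the patterns and the extra host star180 are illustrations, not taken from the PNP documentation.

```python
import re

# Host names from the page definition, plus one hypothetical extra host.
hosts = ["star177", "star178", "star179", "star180"]

def matching_hosts(pattern, names):
    """Return the names that the regular expression matches anywhere."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]

# The comma-separated list, read as ONE regex, matches no single host name.
print(matching_hosts("star177,star178,star179", hosts))

# A character class or an alternation selects the three intended hosts.
print(matching_hosts("^star17[7-9]$", hosts))
print(matching_hosts("^(star177|star178|star179)$", hosts))
```

If that assumption holds, a pattern of the second or third kind in host_name would be the thing to try.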

-- 
Rahul

--
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null


Re: [Nagios-users] how can I access data stored by pnp4nagios

2009-06-25 Thread Rahul Nabar
On Thu, Jun 25, 2009 at 7:31 PM, Rahul Nabar rpna...@gmail.com wrote:



 In which case is there another way for me to access historic_perf_quantity
 at a given point in time for all servers? Or perhaps it is possible to
 generate something like this using custom rrdtool queries?

 Any suggestions?


Just noticed that one of my old emails had received a pointer to the pages
feature in pnp4nagios. Maybe that will work this time around.

-- 
Rahul

[Nagios-users] how can I access data stored by pnp4nagios

2009-06-25 Thread Rahul Nabar
I just updated my nagios installation so that I can get cpu-core
temperatures with lm_sensors. Works fine. I have pnp4nagios, which gives good
time series trends of temperatures as well.

But the problem is if I want a snapshot of CPU temperatures right now for my
whole server room hardware (or at any other past instance of time)
pnp4nagios is not useful. It gives a time series but not a spatial (trend
over servers at a point in time) graph. If I wanted to hack together my own
graphing tool what's the best place to pull out data from?

Digging in I noticed the /nagios/share/perfdata/serverxxx/ directory which
does have the xml and rrd files. Is this a good spot to pull data out from
for each monitored server?

Unfortunately that would only be for the latest time instance, right? But
since pnp4nagios can plot over a time range it must have access to historic
temperatures too. So, where are these stored? I suspect internally as a
binary produced by rrdtool?

In which case is there another way for me to access historic_perf_quantity
at a given point in time for all servers? Or perhaps it is possible to
generate something like this using custom rrdtool queries?

Any suggestions?
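
For what it is worth, the .rrd files themselves hold the history, and the rrdtool fetch command prints it as text. Below is a rough Python sketch of pulling one time slice out of a file. The 300-second window and the function names are assumptions made for illustration; the per-host directory layout is the one mentioned above.

```python
import subprocess

def parse_fetch_output(text):
    """Parse the text printed by `rrdtool fetch` into (ds_names, rows)."""
    lines = [line for line in text.splitlines() if line.strip()]
    ds_names = lines[0].split()          # first line lists the data source names
    rows = []
    for line in lines[1:]:
        stamp, _, values = line.partition(":")
        rows.append((int(stamp), [float(v) for v in values.split()]))
    return ds_names, rows

def values_before(rrd_path, when, window=300):
    """Fetch the AVERAGE rows in the `window` seconds before `when` (epoch seconds)."""
    out = subprocess.run(
        ["rrdtool", "fetch", rrd_path, "AVERAGE",
         "--start", str(when - window), "--end", str(when)],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_fetch_output(out)
```

Calling values_before() once per host directory for the same timestamp would give the across-servers snapshot; missing samples come back as nan.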

-- 
Rahul

Re: [Nagios-users] nagvis requires ndoutils; how stable is ndoutils?

2009-06-24 Thread Rahul Nabar
On Tue, Jun 16, 2009 at 6:47 AM, Kevin Keane subscript...@kkeane.com wrote:

 I just installed ndoutils with mysql. There indeed was one pitfall: the
 database is growing quite large very quickly. Eventually, the DB got
 sluggish and couldn't keep up with the data Nagios threw at it (the DB
 server is quite underpowered). It got so bad that after a week or so,
 Nagios wouldn't even start up.

 It turned out that it wasn't primarily the database itself, but binary
 logging. It is turned on by default (at least on CentOS) but you only
 need it for replication. If you are not using replication, simply turn
 off binary logging and you should be good to go. At least, I hope so; I
 only made that change yesterday, so I won't know for another week or so.


Thanks for all those helpful comments guys! You might have saved me from a
few disasters here. I think I am staying away from Nagvis (and ndoutils)
for now.

Nagvis seems to me one of those tools that simply look great but the
back-end still needs quite some work before I'd be brave enough to unleash
it in a production environment!

-- 
Rahul

Re: [Nagios-users] nagvis requires ndoutils; how stable is ndoutils?

2009-06-24 Thread Rahul Nabar
On Thu, Jun 25, 2009 at 12:20 AM, Kevin Keane subscript...@kkeane.com wrote:

 I think that is a bit overreacting. ndoutils is a database client.


Thanks Kevin. Point taken.


 Databases need management and tuning to get you good performance -
 that's just routine, regardless of the brand you are using: mysql, SQL
 Server, Oracle, Postgres, 


But the way nagios natively stores data seems to be pretty robust though.
Nagios has scaled excellently right out of the box. From all these
discussions it seems that the problems arise when I try to hook up ndoutils
etc. in there. Maybe I am wrong!


No amount of work or polishing will
change that. There's a reason DBAs are highly valued professionals.

I feel that's the crux though. If each native nagios install needed a skilled
DBA to tune it till it worked I doubt it'd have been so successful.

-- 
Rahul

Re: [Nagios-users] nagvis requires ndoutils; how stable is ndoutils?

2009-06-11 Thread Rahul Nabar
On Wed, Jun 10, 2009 at 2:56 AM, Giorgio Zarrelli zarre...@linux.it wrote:
 it works. It's not the best, it has some overhead problems with MySql,
 causing some taxing utilization of the cpu, but it works. Sometimes you
 can fall in some indexing problems, but you can workaround them with this
 sql clause:

Thanks Giorgio and Marc-André. I think those comments confirm what I
thought. It is somewhat unstable and unfit to push into a production
environment. Nagvis did look so cool otherwise though!

Support also seems sort of iffy. I was toying with using the ndo2fs
backend but there is very little documentation. Besides usually Nagios
questions generate a lot of responses on the list but the relative
silence for my Nagvis questions seems to say that not so many users
are trying it yet.

-- 
Rahul



[Nagios-users] cannot make comments under nagios after server crash. nagios.cmd missing

2009-05-16 Thread Rahul Nabar
I recently had a server crash. I recovered and restarted nagios
manually but now I seem to have lost the ability to make comments on
hosts. If I try I get the error message:

Error: Could not stat() command file '/usr/local/nagios/var/rw/nagios.cmd'!
The external command file may be missing, Nagios may not be running,
and/or Nagios may not be checking external commands.
An error occurred while attempting to commit your command for processing.


That file is indeed missing. Running a locate nagios.cmd though
shows the file at that location so it must have been there before the
crash.

Do I need to restart something? What am I missing?

The older comments are intact though. It is just that I cannot make
new comments.
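
A note that may help here: nagios.cmd is a named pipe that Nagios creates when it starts with external commands enabled, and locate answers from a periodically rebuilt database, so it can still list a file that is gone. The small Python check below reports whether the path exists and is a pipe. The function name is made up for this sketch; the path is the one from the error message.

```python
import os
import stat

def command_file_status(path):
    """Describe whether `path` exists and is a named pipe (FIFO)."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        return "missing"
    return "fifo" if stat.S_ISFIFO(mode) else "not a fifo"

print(command_file_status("/usr/local/nagios/var/rw/nagios.cmd"))
```

If this reports "missing" while Nagios is running, the usual suspects would be check_external_commands in nagios.cfg and the ownership of the rw directory, followed by a proper restart of Nagios.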

-- 
Rahul



[Nagios-users] how to generate an availability report for a specific service on a whole hostgroup.

2009-03-09 Thread Rahul Nabar
I cannot figure out what is the best way to generate (if at all
possible) this kind of a report in Nagios:

For a specific service (ssh) list all the nodes in a hostgroup that
had a status other than normal in the last year (say). I can get a
trend for a service on a specific host. But how do I get this report
for an entire hostgroup+service combination?

Any tips?
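
Failing a built-in report, one fallback is to scan the Nagios log files for SERVICE ALERT entries. The Python sketch below assumes the standard semicolon-separated alert line format; the function name is illustrative, and counting only HARD states is a choice made for this sketch. The list of hostgroup members has to be supplied by hand.

```python
import re

# [epoch] SERVICE ALERT: host;service;STATE;SOFT|HARD;attempt;plugin output
ALERT = re.compile(
    r"^\[(?P<time>\d+)\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);(?P<type>[^;]+);"
)

def hosts_with_problems(log_lines, service, hostgroup_members):
    """Return {host: count} of HARD non-OK alerts for `service` on the given hosts."""
    counts = {}
    for line in log_lines:
        m = ALERT.match(line)
        if not m or m["service"] != service or m["host"] not in hostgroup_members:
            continue
        if m["state"] != "OK" and m["type"] == "HARD":
            counts[m["host"]] = counts.get(m["host"], 0) + 1
    return counts
```

To cover a whole year the rotated log archives would need to be fed in as well, not just the current nagios.log.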

-- 
Rahul



Re: [Nagios-users] A group of Nagios users are:

2009-03-09 Thread Rahul Nabar
On Mon, Mar 9, 2009 at 6:15 PM, Martyn mar...@chetnet.co.uk wrote:
 Just on a lighter note, what do we call a bunch of Nagios users;
 Nagiothions?

Nagiosers is my vote. :)



Re: [Nagios-users] Alert History seems empty

2009-03-03 Thread Rahul Nabar
On Mon, Mar 2, 2009 at 1:12 PM, Marc Powell m...@ena.com wrote:

 On Mar 2, 2009, at 11:25 AM, Rahul Nabar wrote:


 Alert history isn't performance data. An 'alert' is logged when the
 service changes state (i.e. OK-CRITICAL for example). Your service
 has not changed state in the current log file.

Thanks Marc. My bad. What I should have been looking for is View
Trends for this service. I got confused between those two.

-- 
Rahul



Re: [Nagios-users] Alert History seems empty

2009-03-03 Thread Rahul Nabar
On Mon, Mar 2, 2009 at 1:57 PM, Jim Avery j...@jimavery.me.uk wrote:
 The history will normally only record anything if the ping check has
 changed state (for example from OK to Warning).  If there's nothing in
 the log for a particular day, it simply means it's been pinging fine
 all day (or if it's been critical all day long).

 Nagios itself doesn't do anything with the performance data, but can
 be configured to pass it on to flat files, a database or to a graph
 for example PNP or nagiosgrapher.  I use (and recommend) PNP as it's
 easy to install and use and seems to get better & better with every
 new release.  http://www.pnp4nagios.org/pnp/start

Thanks Jim. That makes sense now. I was misinterpreting the term
alert history. I already have PNP4nagios working.

-- 
Rahul



[Nagios-users] Alert History seems empty

2009-03-02 Thread Rahul Nabar
If I try "View Alert History for this Service" I get the error
"No history information was found for this service in the current
log file".

The file that it reports File: /usr/local/nagios/var/nagios.log is
indeed correctly present. What else could be wrong? Does Alert
History have to be explicitly enabled in some way?

This is a simple ping service so it does have performance data.

-- 
Rahul



Re: [Nagios-users] graphing trends across hosts or services instead of a timeseries

2009-02-24 Thread Rahul Nabar
On Tue, Feb 24, 2009 at 3:17 AM, Jim Avery j...@jimavery.me.uk wrote:
 I had to install some dependencies, I forget which ones, but I'm
 pretty sure librrds-perl was one of them.

 The web interface for drraw is fairly intuitive, except it took me a
 few minutes to notice that in order to save a graph, you need to
 specify a Graph Title in the Graph Options section!

 hth,

 Jim

Awesome! That will be a lot of help I am sure. Thank you again, Jim. I
appreciate the help!

-- 
Rahul



[Nagios-users] nagios does not produce performance data for check_total_procs

2009-02-13 Thread Rahul Nabar
While getting PNP to graph stuff for my Nagios services I get an error
whenever I try to plot the Total Processes field.

RRD Database /usr/local/nagios/share/perfdata/star23/Total_Processes.rrd not
found.

I checked and that file for performance data indeed seems absent. I do have
the process_perf_data set to 1. Any ideas why Nagios is not producing
performance data for this particular service?

define service{
    hostgroup_name npre-compute-nodes
    service_description Total Processes
    check_command check_nrpe!check_total_procs
    process_perf_data 1
    use srv-pnp-rpn-intermediate
}
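
A quick way to narrow this down is to look at what the plugin itself prints: performance data is whatever follows the pipe character in the plugin output, and if nothing is there PNP has nothing to write to an RRD file. The Python sketch below splits an output line at the pipe. Both sample lines are made up for illustration and are not real output from these hosts.

```python
def split_plugin_output(line):
    """Split one line of plugin output into (status_text, perfdata)."""
    text, _, perfdata = line.partition("|")
    return text.strip(), perfdata.strip()

# Hypothetical output without performance data: the second element is empty.
print(split_plugin_output("PROCS OK: 87 processes"))

# Hypothetical output with performance data after the pipe.
print(split_plugin_output(
    "PING OK - Packet loss = 0%, RTA = 0.42 ms|rta=0.42ms;100;500;0 pl=0%;20;60;0"
))
```

Running the check by hand through check_nrpe and looking for the pipe would show whether process_perf_data has anything to process for this service.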

-- 
Rahul

Re: [Nagios-users] check_lm_sensors and the correct sensor names for checking cpu temperatures

2009-02-12 Thread Rahul Nabar
On Thu, Feb 12, 2009 at 1:35 AM, Matteo Corti matteo.co...@gmail.com wrote:

 Hi Rahul,


 On Feb 11, 2009, at 20:55 , Rahul Nabar wrote:



 On Wed, Feb 11, 2009 at 11:15 AM, Matteo Corti matteo.co...@gmail.com
 wrote:
 Dear Rahul,



 Does the input include the newline between the Core0 Temp: and the
 temperature?


 Yes, it does! Is that messing up the regexes?


 Yes ... :-(

 I hoped that the sensors -u output was more standard and that I could rely
 on that. I'll try to see if I can get the sensors information in a way which
 is consistent on several systems ...


Regexes always have this habit of breaking up when one is sure one covered
all test cases! :) Thanks for helping me figure out what it was! I'll see if
I can quickly hack together something that'll make it work for me!
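
As an illustration of the kind of quick hack meant here, this Python sketch uses a pattern whose whitespace match is allowed to span the line break between label and value. It is not the regex used inside check_lm_sensors; the sample text mirrors the sensors output posted earlier in this thread.

```python
import re

# \s* after the colon also matches a newline, so label and value
# may sit on the same line or on consecutive lines.
SENSOR = re.compile(
    r"^(?P<label>[^:\n]+):\s*(?P<temp>[+-]?\d+(?:\.\d+)?)\s*°C",
    re.MULTILINE,
)

def parse_temperatures(text):
    """Return {label: temperature in degrees C} for every reading found."""
    return {m["label"].strip(): float(m["temp"]) for m in SENSOR.finditer(text)}

sample = (
    "k8temp-pci-00c3\n"
    "Adapter: PCI adapter\n"
    "Core0 Temp:\n"
    " +25°C\n"
    "Core1 Temp:\n"
    " +20°C\n"
)
print(parse_temperatures(sample))  # {'Core0 Temp': 25.0, 'Core1 Temp': 20.0}
```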

-- 
Rahul

[Nagios-users] graphing trends across hosts or services instead of a timeseries

2009-02-12 Thread Rahul Nabar
One other thing that I haven't figured out yet with PNP4Nagios is this: how
does one get trending across services or hosts? It is easy to see time
series graphs of ping times, load averages, disk usage etc., but sometimes
what seems more relevant is a chart across services for a given snapshot in
time, say, to identify a hot node or a node with unusually high load
averages.

Is there a way to do this? Or am I tinkering with the wrong tool?

-- 
Rahul

Re: [Nagios-users] graphing trends across hosts or services instead of a timeseries

2009-02-12 Thread Rahul Nabar
On Thu, Feb 12, 2009 at 11:46 AM, Lee Azzarello l...@dropio.com wrote:

 Nagios itself does have some trending tools in version 3, though they
 are not very comprehensive. Are you looking for something beyond their
 scope?


Thanks Lee. I am not aware of the scope of the inbuilt trending tools. Maybe
that's a good place to start. How does one use those? Say, how can one
obtain a graph of ping times across all hosts in a suitable format?

That might make it easy to identify problem machines.

-- 
Rahul

Re: [Nagios-users] graphing trends across hosts or services instead of a timeseries

2009-02-12 Thread Rahul Nabar
On Thu, Feb 12, 2009 at 2:06 PM, Lee Azzarello l...@dropio.com wrote:

 In Nagios version 3, you click on Reporting-Trends and use the menus
 to generate a picture


Thanks again Lee!


 The limitation is you can only see one picture at a time for a
 particular host or service


That is a drawback. The whole idea is to get a picture across *many* hosts
or services. For a given host my PNP already generates better plots than the
inbuilt Nagios trending suite.


PNP does seem very geared for this. Just not sure how to make it plot a
certain time slice instead of a historic time series!

-- 
Rahul

[Nagios-users] check_iptables and the -S option for iptables; now defunct?

2009-02-12 Thread Rahul Nabar
I was trying to roll out the check_iptables script but ran into a hiccup since my
system iptables refuses to accept the -S option that is included in the
script when it invokes iptables.

iptables v1.3.5: Unknown arg `-S'

Any other users of this script? Have you guys done away with the -S option?
Any workarounds? It seems this option only exists in later iptables
versions. But I am not expert enough with iptables to exactly understand its
relevance.

 --
Rahul

Re: [Nagios-users] check_iptables and the -S option for iptables; now defunct?

2009-02-12 Thread Rahul Nabar
Found this on the list.

I had to make one modification within the script: The -S argument is not known
by the version, 1.3.8, of iptables on the server in question, so I replaced it
with the -L argument.


[
http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg23867.html
]

It does seem to work; although I am not really sure what I am doing! :)
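
For anyone making the same swap: -L prints each chain as a table with a "Chain ..." line and a column header, so the output is not line-for-line comparable with a rule listing, and anything in the script that counts lines may need adjusting. What check_iptables actually does with the output is not verified here. The Python sketch below counts the rule lines per chain in `iptables -L -n` style text; the function name is made up for this example.

```python
def count_rules(listing):
    """Count rule lines per chain in the text printed by `iptables -L -n`."""
    counts = {}
    chain = None
    for line in listing.splitlines():
        if line.startswith("Chain "):
            chain = line.split()[1]      # e.g. "INPUT"
            counts[chain] = 0
        elif line.strip() and not line.startswith("target") and chain is not None:
            counts[chain] += 1           # skip blanks and the column header
    return counts

sample = """Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:22
DROP       all  --  0.0.0.0/0            0.0.0.0/0

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
"""
print(count_rules(sample))
```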

-- 
Rahul

Re: [Nagios-users] check_lm_sensors and the correct sensor names for checking cpu temperatures

2009-02-11 Thread Rahul Nabar
On Wed, Feb 11, 2009 at 1:28 AM, Matteo Corti matteo.co...@gmail.com wrote:



 Could you try to send me both the output of 'sensors' and 'check_lm_sensors
 --list -v -v'.

 Many thanks,


Sure. Here it is

rpna...@star255:~/usr/local/nagios/libexec/check_lm_sensors --list -v -v
warning: hddtemp not found: HDD temperatures not checked
sensors found at /usr/bin/sensors
LM_SENSORS OK - |
rpna...@star255:~sensors
k8temp-pci-00c3
Adapter: PCI adapter
Core0 Temp:
 +25°C
Core1 Temp:
 +20°C

Re: [Nagios-users] check_lm_sensors and the correct sensor names for checking cpu temperatures

2009-02-11 Thread Rahul Nabar
On Wed, Feb 11, 2009 at 11:15 AM, Matteo Corti matteo.co...@gmail.com wrote:

 Dear Rahul,



 Does the input include the newline between the Core0 Temp: and the
 temperature?


Yes, it does! Is that messing up the regexes?

-- 
Rahul

Re: [Nagios-users] check_lm_sensors and the correct sensor names for checking cpu temperatures

2009-02-11 Thread Rahul Nabar
On Wed, Feb 11, 2009 at 11:18 AM, Matteo Corti matteo.co...@gmail.com wrote:



 I forgot: could you also please send me the output of

 'sensors -uA'

 I feel that I could parse the raw output to avoid problems with spaces and
 newlines.

 Cheers and thanks again


 Matteo


k8temp-pci-00c3
Core0 Temp: 24.00 (temp1)
ERROR: Can't get feature `temp2' data!
Core1 Temp: 20.00 (temp3)
ERROR: Can't get feature `temp4' data!

-- 
Rahul

Re: [Nagios-users] check_lm_sensors and the correct sensor names for checking cpu temperatures

2009-02-10 Thread Rahul Nabar
On Tue, Feb 10, 2009 at 2:57 PM, Matteo Corti matteo.co...@gmail.com wrote:

 Dear Rahul,
 Please post the output of /usr/local/nagios/libexec/check_lm_sensors
 --list
 Maybe, the --list option should tell us what check_lm_sensors sees but
 since it parses the output of sensors it should work.


Thanks again Matteo!

/usr/local/nagios/libexec/check_lm_sensors --list
LM_SENSORS OK - |

That's the only output I get.

-- 
Rahul

Re: [Nagios-users] check_lm_sensors and the correct sensor names for checking cpu temperatures

2009-02-10 Thread Rahul Nabar
On Tue, Feb 10, 2009 at 3:03 PM, Rahul Nabar rpna...@gmail.com wrote:



 On Tue, Feb 10, 2009 at 2:57 PM, Matteo Corti matteo.co...@gmail.com wrote:

 Dear Rahul,
 Please post the output of /usr/local/nagios/libexec/check_lm_sensors
 --list
 Maybe, the --list option should tell us what check_lm_sensors sees but
 since it parses the output of sensors it should work.


 Thanks again Matteo!

 /usr/local/nagios/libexec/check_lm_sensors --list
 LM_SENSORS OK - |

 That's the only output I get.


Just a thought:

Could it have something to do with my $LANG variable? It was set to
en_US.UTF-8 earlier and then in the output of sensors the centigrade sign
appeared messed up.

Once I set it to export LANG=en_EN it appears correctly. Could this be
screwing up the regexes inside  check_lm_sensors?

If so, what's the workaround? Maybe I'm totally off the mark and this is
just a red herring.

-- 
Rahul

Re: [Nagios-users] NRPE clutters /var/log/messages

2009-02-09 Thread Rahul Nabar
On Sun, Feb 8, 2009 at 9:18 AM, Hiren Patel hir3npa...@gmail.com wrote:

 Rahul Nabar wrote:

 My /var/log/messages shows hundreds of entries of this sort:

 Feb  6 23:33:00 star256 xinetd[15109]: START: nrpe pid=17610
 from=:::11.0.0.100
 Feb  6 23:33:01 star256 xinetd[15109]: EXIT: nrpe status=0 pid=17610
 duration=1(sec)

 Are they just indicative of normal nrpe operations? If so, how can I
 disable them so as not to clutter my log?

 I do have debug=0  in  my nrpe.conf. Why still these messages?

 --


 looks normal for me. the messages seem like they come from xinetd though,
 you could look at:
 1) xinetd logging options
 2) getting xinetd logging to a separate file using xinetd/syslog
 configuration.



Thanks Hiren. I'll look into those. Otherwise it is too much data to be
logged during normal operations!
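
Since the START/EXIT lines are written by xinetd rather than by nrpe, debug=0 in nrpe.conf would not be expected to silence them. In xinetd, the log_on_success attribute of a service entry controls what is logged for a successful connection and log_type controls where it goes; which file holds the nrpe entry on a given system (often /etc/xinetd.d/nrpe) is an assumption here. Until that is changed, a filter along these lines hides the noise when reading the log. The function name strip_nrpe_xinetd is made up for illustration:

```shell
#!/bin/sh
# strip_nrpe_xinetd (hypothetical helper): drop xinetd START/EXIT lines for
# the nrpe service from syslog-style input and pass everything else through.
# Reads the named files, or stdin when no file is given.
strip_nrpe_xinetd() {
    grep -Ev 'xinetd\[[0-9]+\]: (START|EXIT): nrpe ' "$@"
}

# Usage:
#   strip_nrpe_xinetd /var/log/messages | less
```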

-- 
Rahul

Re: [Nagios-users] NRPE and redundant calls to remote hosts.

2009-02-09 Thread Rahul Nabar
On Sun, Feb 8, 2009 at 9:08 AM, Hiren Patel hir3npa...@gmail.com wrote:

 Marc Powell wrote:

 
  Passive checks with NSCA is pretty close, minus the 'if there is a
  status change' part. You could build that logic into whatever wrapper
  you are using to run the plugins on the remote host though. From the
  perspective of the nagios host, passive checks are much better than
  active checks.
 


Thanks Hiren and Marc!




 and if you're processing performance data for graphing or the like, you
 want the results submitted even if the service is okay.


True. But for some services I'd like to know much quicker if something is
wrong than if it is just sending performance data back for graphs. The
passive approach seems perfect for this.
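
The "report only on a status change" logic Marc mentions could live in a small wrapper on the remote node. A rough sketch, not a tested setup: it assumes send_nsca is installed with its config at /etc/send_nsca.cfg, that the NSCA daemon runs on a host called nagios.example.org, and that the service is defined to accept passive results on the Nagios side. All function names and paths are made up for illustration.

```shell
#!/bin/sh
# Remote-side wrapper sketch: run a plugin locally and report to the Nagios
# host through NSCA only when the plugin's return code changes.
# NAGIOS_HOST and STATE_DIR are assumptions for illustration.
NAGIOS_HOST="${NAGIOS_HOST:-nagios.example.org}"
STATE_DIR="${STATE_DIR:-/var/tmp/passive-state}"

# state_changed SERVICE NEW_RC
# Returns 0 if NEW_RC differs from the stored return code (and stores it),
# returns 1 if nothing changed.
state_changed() {
    mkdir -p "$STATE_DIR" || return 2
    state_file="$STATE_DIR/$1.rc"
    old_rc=$(cat "$state_file" 2>/dev/null)
    [ "$old_rc" = "$2" ] && return 1
    printf '%s\n' "$2" > "$state_file"
}

# nsca_line HOST SERVICE RC OUTPUT
# One service result in the tab-separated form send_nsca reads on stdin.
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# report_if_changed SERVICE PLUGIN [ARGS...]
report_if_changed() {
    service=$1; shift
    output=$("$@"); rc=$?
    if state_changed "$service" "$rc"; then
        nsca_line "$(hostname)" "$service" "$rc" "$output" |
            send_nsca -H "$NAGIOS_HOST" -c /etc/send_nsca.cfg
    fi
}

# Usage, e.g. from cron once a minute:
#   report_if_changed CpuCoreTemperature /usr/lib/nagios/plugins/contrib/check_lm_sensors
```

The host name sent has to match the host_name Nagios knows for the node. As noted above, any service whose performance data is graphed still needs every result submitted, so this change-only filter suits alert-only services.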

-- 
Rahul

[Nagios-users] how is Service check Latency defined in nagios?

2009-02-09 Thread Rahul Nabar
What exactly is the Service check Latency in nagios? My processor load
averages are still ok after I enabled PNP but my latencies have shot through
the roof. Should I be worried or not? I have latencies around 46k millisecs
and execution times of 800 millisecs for my services.

-- 
Rahul

Re: [Nagios-users] how is Service check Latency defined in nagios?

2009-02-09 Thread Rahul Nabar
On Mon, Feb 9, 2009 at 1:31 PM, Marc Powell m...@ena.com wrote:



- not allowing nagios to run sufficient concurrent checks for your
 configuration. running 'bin/nagios -s etc/nagios.cfg' will provide you
 with a recommendation. Make sure max_concurrent_checks is that high or
 higher.



Thanks Marc. I have: max_concurrent_checks=0

I ran the -s option and it produced a bunch of stats. (see below) But I am
not sure which line is the recommendation you refer to.

In addition it reports some savings I could get by using the -x and -u
options. Maybe I ought to enable those. I am not sure whether they would
reduce latency (good) or only execution time (not so relevant for
me).

-- 
Rahul

/usr/local/nagios/bin/nagios -s /usr/local/nagios/etc/nagios.cfg
Nagios 3.0.6
Copyright (c) 1999-2008 Ethan Galstad (http://www.nagios.org)
Last Modified: 12-01-2008
License: GPL

Timing information on object configuration processing is listed
below.  You can use this information to see if precaching your
object configuration would be useful.

Object Config Source: Config files (uncached)

OBJECT CONFIG PROCESSING TIMES  (* = Potential for precache savings with
-u option)
--
Read: 0.012501 sec
Resolve:  0.000822 sec  *
Recomb Contactgroups: 0.81 sec  *
Recomb Hostgroups:0.004820 sec  *
Dup Services: 0.031199 sec  *
Recomb Servicegroups: 0.000847 sec  *
Duplicate:0.03 sec  *
Inherit:  0.003707 sec  *
Recomb Contacts:  0.05 sec  *
Sort: 0.01 sec  *
Register: 0.028375 sec
Free: 0.002792 sec
  
TOTAL:0.085155 sec  * = 0.041487 sec (48.72%) estimated
savings


RETENTION DATA TIMES
--
Read and Process: 0.429982 sec
  
TOTAL:0.429982 sec


Timing information on configuration verification is listed below.

CONFIG VERIFICATION TIMES  (* = Potential for speedup with -x
option)
--
Object Relationships: 0.103647 sec
Circular Paths:   0.002239 sec  *
Misc: 0.003193 sec
  
TOTAL:0.109079 sec  * = 0.002239 sec (2.1%) estimated
savings


EVENT SCHEDULING TIMES
-
Get service info:0.023477 sec
Get host info info:  0.000172 sec
Get service params:  0.35 sec
Schedule service times:  0.041946 sec
Schedule service events: 0.021094 sec
Get host params: 0.07 sec
Schedule host times: 0.003610 sec
Schedule host events:0.008372 sec
 
TOTAL:   0.098713 sec


Projected scheduling information for host and service checks
is listed below.  This information assumes that you are going
to start running Nagios with your current config files.

HOST SCHEDULING INFORMATION
---
Total hosts: 265
Total scheduled hosts:   262
Host inter-check delay method:   SMART
Average host check interval: 300.00 sec
Host inter-check delay:  1.15 sec
Max host check spread:   30 min
First scheduled check:   Mon Feb  9 13:50:01 2009
Last scheduled check:Mon Feb  9 13:54:59 2009


SERVICE SCHEDULING INFORMATION
---
Total services: 2073
Total scheduled services:   2057
Service inter-check delay method:   SMART
Average service check interval: 300.58 sec
Inter-check delay:  0.15 sec
Interleave factor method:   SMART
Average services per host:  7.82
Service interleave factor:  8
Max service check spread:   30 min
First scheduled check:  Mon Feb  9 13:50:38 2009
Last scheduled check:   Mon Feb  9 13:55:40 2009


CHECK PROCESSING INFORMATION

Check result reaper interval:   10 sec
Max concurrent service checks:  Unlimited


PERFORMANCE SUGGESTIONS
---
I have no suggestions - things look okay.
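
The delay figures in the scheduling sections of this output can be cross-checked by hand. With the SMART method the inter-check delay appears to be the average check interval divided by the number of scheduled checks: 300.58 sec / 2057 services is about 0.15 sec, and 300.00 sec / 262 hosts is about 1.15 sec, both matching the values printed. A small sketch of that arithmetic (the function name inter_check_delay is made up for illustration):

```shell
#!/bin/sh
# inter_check_delay AVG_INTERVAL_SEC SCHEDULED_CHECKS
# Spread the scheduled checks evenly over one average check interval.
inter_check_delay() {
    awk -v interval="$1" -v count="$2" 'BEGIN { printf "%.2f\n", interval / count }'
}

inter_check_delay 300.58 2057   # services
inter_check_delay 300.00 262    # hosts
```

Put differently, this configuration asks for roughly 2057 / 300.58, or about 7, service checks to be started every second.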

Re: [Nagios-users] how is Service check Latency defined in nagios?

2009-02-09 Thread Rahul Nabar
On Mon, Feb 9, 2009 at 3:58 PM, Max perld...@webwizarddesign.com wrote:

 Rahul,

 On Mon, Feb 9, 2009 at 2:53 PM, Rahul Nabar rpna...@gmail.com wrote:
  Thanks Marc. I have: max_concurrent_checks=0

 Our experience has been that with max_concurrent_checks set to 0 and
 inter-check delay and nagios sleep set very low we get high reported
 service check latencies as we are basically asking Nagios to try and
 run everything as soon as possible ... 1000s of checks over a few
 seconds in essence ... which it can't do.   As far as 'real life'
 negative impact the high latency in this singular case hasn't meant
 much; it initially really worried me until i realized that the high
 service latency is just happening because we are basically telling
 nagios to pause / sleep / wait for as little time as possible and run
 things as quickly as possible.  We have around a 146 second service
 check latency but from our detailed Nagios metrics we see that check
 runs are completing in right around 4 minutes, under our 5 minute
 hard-ceiling (around 6000 checks).  our PNP performance graphs prove
 our suspicions .. our reporting server receives 6000 metrics in 4
 minutes or less and we have no gaps in our graphs or major under or
 over sampling problems with the data we retrieve from our remote
 agents.

 I only bring that up because if you not only have
 max_concurrent_checks set to 0 but also have tuned way down
 inter-check delay settings and sleep time you might be encountering
 the same situation and the high latency might not be something to
 worry about .. but only IF you have all your delays tuned very low and
 no ceiling on max checks.  for any other situation it is definitely
 something to investigate.



Thanks Max. That is a pretty intricate issue that I had no idea about! I'm
still trying to figure out the exact implications of what you describe.
Maybe I need to visit the Nagios manual again to re-read nagios's scheduling
logic. It's especially important to me now that I also have PnP running
performance stats.

Meanwhile this is a dump of the relevant parameters you speak about. I don't
recall changing any from their defaults.
Maybe I ought to in the light of what you mentioned?

service_inter_check_delay_method=s
host_inter_check_delay_method=s
sleep_time=0.25

#Timeouts:
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
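
For comparing setups, the scheduling-related settings can be pulled straight out of the main config file. A minimal sketch, assuming the /usr/local/nagios/etc/nagios.cfg path used earlier in this thread; the function name show_sched_params and the choice of directives to list are mine:

```shell
#!/bin/sh
# show_sched_params CONFIG_FILE
# Print the active (uncommented) settings that influence check scheduling
# and latency, one "name=value" line each.
show_sched_params() {
    grep -E '^[[:space:]]*(max_concurrent_checks|sleep_time|service_inter_check_delay_method|host_inter_check_delay_method|service_interleave_factor|check_result_reaper_frequency)[[:space:]]*=' "$1"
}

# Usage:
#   show_sched_params /usr/local/nagios/etc/nagios.cfg
```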

-- 
Rahul

Re: [Nagios-users] how is Service check Latency defined in nagios?

2009-02-09 Thread Rahul Nabar
On Mon, Feb 9, 2009 at 4:52 PM, Max perld...@webwizarddesign.com wrote:

 Yes, definitely do that.  I talk about how my team set up nagios and
 PNP to minimize delays in polling on my blog, though be warned that we
 break some of the rules that the documentation says to always follow,
 like doing a fork() in a NEB module and setting inter-check delay
 methods to n .. none.  so while it works for us I know that a number
 of people on this list would probably balk at how we did things and
 call us idiots :).


Thanks again Max! I think sometimes one is forced to disobey the standard
prescriptions! Maybe that is idiotic but whatever works! :)



 http://www.semintelligent.com/blog/


Thanks for the blog. I just found a very useful snippet there:

ps -e -a -x -f -o %u | sort | uniq -c | sort -rn

With it I see that the number of nagios-owned processes fluctuates a lot.
It suddenly goes as high as 54, then for a while nagios owns only 3
processes, then it shoots up again. Very interesting. Maybe that is the
phenomenon you were referring to? I should probably put it in a bash
wrapper and graph the nagios process count at 1 sec resolution to get a
finer-grained idea of what is going on!
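
A wrapper for that 1 sec sampling could look roughly like the following. This is only a sketch: the function names are made up, it uses ps -e -o user= rather than the exact flags from the blog snippet, and it assumes a date command that supports +%s.

```shell
#!/bin/sh
# count_user USER
# Count the lines on stdin whose first field equals USER.
count_user() {
    awk -v user="$1" '$1 == user { n++ } END { print n + 0 }'
}

# sample_user_procs USER INTERVAL_SEC SAMPLES
# Print "epoch_seconds process_count" once per interval, SAMPLES times.
sample_user_procs() {
    i=0
    while [ "$i" -lt "$3" ]; do
        printf '%s %s\n' "$(date +%s)" "$(ps -e -o user= | count_user "$1")"
        i=$((i + 1))
        sleep "$2"
    done
}

# Usage: one sample per second for a minute
#   sample_user_procs nagios 1 60 > nagios-procs.log
```

The two-column output can be fed to any plotting tool. Note that some ps implementations shorten long user names, so the match is only reliable for short names such as nagios.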

-- 
Rahul

Re: [Nagios-users] One Time Check

2009-02-06 Thread Rahul Nabar
On Fri, Feb 6, 2009 at 10:59 AM, Marc Powell m...@ena.com wrote:



 set the correct check_interval in the service definition.
 http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html#service

 check_interval  1440 # 1440 minutes or 1 every 24 hours.


Unless it is important to control *when* the check runs within a 24 hour
period?

-- 
Rahul

[Nagios-users] NRPE and redundant calls to remote hosts.

2009-02-06 Thread Rahul Nabar
I've been adding a bunch of checks via NRPE on remote nodes and this got me
thinking. Isn't it inefficient to keep starting check_nrpe calls from the
monitoring host all the time?

Why can't nrpe on the remote node monitor some of the local services and
only send a message back to nagios if there is a status change?

For warnings based on things like disk usage, cpu usage, total procs,
pbs scheduler daemon status, cpu temperatures etc., couldn't this
approach relieve the central host's cpu of a lot of endless check_nrpe calls?
NRPE is already doing the work; it's just a question of what initiates a
communication channel between NRPE and nagios.

Just curious. Or maybe there is a way of achieving this already that I don't
know of!

-- 
Rahul

Re: [Nagios-users] One Time Check

2009-02-06 Thread Rahul Nabar
On Fri, Feb 6, 2009 at 11:57 AM, Marc Powell m...@ena.com wrote:


 On Feb 6, 2009, at 11:08 AM, Rahul Nabar wrote:

 
  check_interval  1440 # 1440 minutes or 1 every 24 hours.
 
  Unless it is important to control *when* the check runs within a 24
  hour period?

 The OP didn't state any such requirement.


You are right. I over-assumed.

-- 
Rahul

Re: [Nagios-users] Nagios remote GUI interface

2009-02-06 Thread Rahul Nabar
On Fri, Feb 6, 2009 at 11:47 AM, michael.washing...@fitchratings.com wrote:


 Just completed an upgrade from Nagios 1.x to 3.x, on separate desktop
 devices however.  On the replacement device, I can load the GUI locally using
 http pointed at either localhost or the actual static ip, but cannot load it
 remotely as I was able to on the 1.x device.  1.x was loaded on REL enterprise
 3.x level while 3.x was through Fedora 9/SELinux.  My browser fails to
 present an authentication window for the 3.x device. I thought it was the
 Linux firewall config, which I have temporarily disabled while resolving,
 but still the same problem.  Any thoughts?


Maybe SELINUX is blocking it? Try setenforce 0 just to check? Maybe you
already did. Just a thought.

-- 
Rahul

Re: [Nagios-users] Nagios remote GUI interface

2009-02-06 Thread Rahul Nabar
On Fri, Feb 6, 2009 at 12:49 PM, michael.washing...@fitchratings.com wrote:

 Just performed it...but still no luck


Another random idea. Can you open any pages at all if they reside on the new
machine? Just wondering if it's an Apache (etc.) issue. I had a bunch of
restrictive conditions in my /etc/httpd/conf/httpd.conf about who could
access what pages.

-- 
Rahul

[Nagios-users] NRPE clutters /var/log/messages

2009-02-06 Thread Rahul Nabar
My /var/log/messages shows hundreds of entries of this sort:

Feb  6 23:33:00 star256 xinetd[15109]: START: nrpe pid=17610
from=:::11.0.0.100
Feb  6 23:33:01 star256 xinetd[15109]: EXIT: nrpe status=0 pid=17610
duration=1(sec)

Are they just indicative of normal nrpe operations? If so, how can I
disable them so as not to clutter my log?

I do have debug=0 in my nrpe.conf. Why do I still get these messages?

-- 
Rahul

[Nagios-users] posting graphs of pnp4nagios performance stats.: latencies of host and service checks are terribly degraded

2009-02-05 Thread Rahul Nabar
After I enabled pnp4nagios my service and host latencies have shot up
disastrously! The transition around midnight yesterday (when I got pnp
running) is amazing!

Just posting two graphs here in case it helps anybody else:

http://picasaweb.google.com/rpnabar/Nagios_debug?feat=directlink

Either:
(A) I am doing something stupidly bad with the default mode
OR
(B) I really need to go to the bulk mode.

I would really appreciate any other tips / stories users have for me!

-- 
Rahul

Re: [Nagios-users] nagios and lm_sensors

2009-02-04 Thread Rahul Nabar
On Wed, Feb 4, 2009 at 12:46 AM, Matteo Corti matteo.co...@id.ethz.ch wrote:

 Dear Rahul,
 You need the Nagios::Plugin and Nagios::Plugin::Threshold Perl modules
 which you can get on CPAN

 http://search.cpan.org/~tonvoon/Nagios-Plugin-0.31/lib/Nagios/Plugin.pm


I did install 'Nagios::Plugin' from the link. Is there a separate
'Nagios::Plugin::Threshold' module to be installed or is it included by
default?

It seems I still don't have success. 'perl Makefile.PL' says:

Warning: prerequisite Class::Accessor 0 not found.
Warning: prerequisite Config::Tiny 0 not found.
Warning: prerequisite Math::Calc::Units 0 not found.
Warning: prerequisite Params::Validate 0 not found.
Writing Makefile for Nagios::Plugin

make doesn't complain *but* make test is a complete disaster!

Failed 16/16 test scripts, 0.00% okay. 715/718 subtests failed, 0.42% okay.
make: *** [test_dynamic] Error 255

I must still be doing something stupid! Any other suggestions?

-- 
Rahul

[Nagios-users] added PNP in the default mode; how serious is the performance degradation to warrant a switch to the bulk mode?

2009-02-04 Thread Rahul Nabar
So, I finally succeeded in configuring PNP for Nagios (whew!!)! It's been a
long bloody battle but I think I eventually won! :)

I've just added PNP performance graphs to my 4 switches for now.

I am a bit hesitant about adding it to all my 300 hosts due to all the
caveats about performance and PNP in the default mode. Any other PNP users?
How many services / hosts are you running PNP performance graphs on? How is
your performance? Have you been forced to switch to the bulk mode already?

-- 
Rahul

Re: [Nagios-users] nagios and lm_sensors

2009-02-04 Thread Rahul Nabar
Ah! Ok. Yes. I can dig deeper and install them all from CPAN. Didn't realize
the dependencies were so many!

Thanks again!
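
To avoid copying module names by hand, the missing prerequisites can be extracted from the Makefile.PL warnings. A minimal sketch that only parses the warning text quoted in this thread; the function name missing_prereqs is made up for illustration:

```shell
#!/bin/sh
# missing_prereqs
# Read "perl Makefile.PL" output on stdin and print the name of every
# module reported as a missing prerequisite, one per line.
missing_prereqs() {
    sed -n 's/^Warning: prerequisite \([^ ]*\) .*not found\.$/\1/p'
}

# Usage (the warnings may arrive on stderr, hence 2>&1):
#   perl Makefile.PL 2>&1 | missing_prereqs
```

The resulting list can then be installed from CPAN or from OS packages, whichever the local Perl distribution prefers.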

-- 
Rahul

On Wed, Feb 4, 2009 at 11:22 PM, Matteo Corti matteo.co...@id.ethz.ch wrote:

 Dear Rahul,


 On Feb 5, 2009, at 5:23 , Rahul Nabar wrote:



 On Wed, Feb 4, 2009 at 12:46 AM, Matteo Corti matteo.co...@id.ethz.ch
 wrote:
 Dear Rahul,
 You need the Nagios::Plugin and Nagios::Plugin::Threshold Perl modules
 which you can get on CPAN

 http://search.cpan.org/~tonvoon/Nagios-Plugin-0.31/lib/Nagios/Plugin.pm


 I did install 'Nagios::Plugin' from the link. Is there a separate
 'Nagios::Plugin::Threshold' module to be installed or is it included by
 default?

 It seems I still don't have success. 'perl Makefile.PL' says:

 Warning: prerequisite Class::Accessor 0 not found.
 Warning: prerequisite Config::Tiny 0 not found.
 Warning: prerequisite Math::Calc::Units 0 not found.
 Warning: prerequisite Params::Validate 0 not found.
 Writing Makefile for Nagios::Plugin

 make doesn't complain *but* make test is a complete disaster!

 Failed 16/16 test scripts, 0.00% okay. 715/718 subtests failed, 0.42%
 okay.
 make: *** [test_dynamic] Error 255

 I must still be doing something stupid! Any other suggestions?


 All the warnings are telling you that you are missing several *needed*
 modules (e.g., Class::Accessor, Config::Tiny, ...)

 You can install them via CPAN or maybe you can already find them packaged
 for your OS. Consult the documentation of the Perl distribution you are
 using.


 Matteo

 --
 ETH Zurich, Dr. Matteo Corti, Informatikdienste / Basisdienste
 STC E 13, Stampfenbachstrasse 67, 8092 Zurich
 Tel +41 44 6327944, http://www.id.ethz.ch





Re: [Nagios-users] nagios and lm_sensors

2009-02-04 Thread Rahul Nabar
On Wed, Feb 4, 2009 at 11:22 PM, Matteo Corti matteo.co...@id.ethz.ch wrote:

 All the warnings are telling you that you are missing several *needed*
 modules (e.g., Class::Accessor, Config::Tiny, ...)

 You can install them via CPAN or maybe you can already find them packaged
 for your OS. Consult the documentation of the Perl distribution you are
 using.


Thanks for all the help thus far Matteo! Maybe I can bug you one last time!
I think I have succeeded in getting Nagios::Plugin etc. working. But when I
come back and try to build check_lm_sensors it has one last
complaint (I solved all the previous dependency issues):

/
hostperl Makefile.PL
Cannot determine perl version info from check_lm_sensors.pod
WARNING: INSTALLSITESCRIPT is not a known parameter.
'INSTALLSITESCRIPT' is not a known MakeMaker parameter name.
Writing Makefile for check_lm_sensors
///

I'm not sure if this is a problem or not? But if I just go ahead then make
and make install silently proceed. But if I try using check_lm_sensors
it crashes badly!

//
/usr/lib/nagios/plugins/contrib/check_lm_sensors
Can't locate version.pm in @INC (@INC contains:
/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.7/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.6/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.8 /usr/lib/perl5/site_perl/5.8.7
/usr/lib/perl5/site_perl/5.8.6 /usr/lib/perl5/site_perl/5.8.5
/usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.7/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.6/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.8 /usr/lib/perl5/vendor_perl/5.8.7
/usr/lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl/5.8.5
/usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.8/i386-linux-thread-multi
/usr/lib/perl5/5.8.8 .) at /usr/lib/nagios/plugins/contrib/check_lm_sensors
line 33.
BEGIN failed--compilation aborted at
/usr/lib/nagios/plugins/contrib/check_lm_sensors line 33.
///

Any more pointers?

-- 
Rahul

Re: [Nagios-users] nagios and lm_sensors

2009-02-04 Thread Rahul Nabar
On Thu, Feb 5, 2009 at 12:14 AM, Matteo Corti matteo.co...@id.ethz.ch wrote:


 Yes, you are missing another module, 'version'. I forgot to put it in the
 list of requirements since it is usually part of most Perl distributions.


Awesome. It works! Thanks for all the help Matteo! I'm one step closer to
monitoring my remote server cpu temperatures via nrpe + check_lm_sensors +
nagios.  This was a statistic we have been lacking on our HPC cluster for a
long long time!

My Perl distribution probably needs updating. I think we have been lazy
about this!

-- 
Rahul

[Nagios-users] nagios and lm_sensors

2009-02-03 Thread Rahul Nabar
I just installed the lm_sensors package on my remote machines to get their
temperatures. sensors works OK. I wanted to use check_nrpe and somehow add
the remote machine temps to my nagios webpage.

I found the check_sensors command but that only returns sensor ok. I
suppose the check_lm_sensors plugin (
http://www.nagiosexchange.org/cgi-bin/page.cgi?g=Detailed%2F1289.html;d=1)
is what I need? Are there other users of this?

 I did try compiling it but am running into problems following the
instructions in the INSTALL

/
Cannot determine perl version info from check_lm_sensors.pod
WARNING: INSTALLSITESCRIPT is not a known parameter.
Warning: prerequisite Nagios::Plugin 0 not found.
Warning: prerequisite Nagios::Plugin::Threshold 0 not found.
'INSTALLSITESCRIPT' is not a known MakeMaker parameter name.
Writing Makefile for check_lm_sensors
//

Do I need to configure more pre-requisites before I can get this running?

-- 
Rahul

Re: [Nagios-users] SNMP monitoring of a Dell switch: snmpwalk succeeds but check_snmp fails.

2009-01-30 Thread Rahul Nabar
On Thu, Jan 29, 2009 at 8:40 PM, Max perld...@webwizarddesign.com wrote:

Thanks for the detailed comments Max!


 The OID that maps to ifOperStatus from RFC1213-MIB is 1.3.6.1.2.1.2.2.1.8

 So grep for

 1.3.6.1.2.1.2.2.1.8.1


No luck. No such string in there.



 if the interface you want to look at is indeed at index 1 :).


Ah! Now I am lost! What do you mean by this? Sorry, I am a networking newbie,
and SNMP especially is Greek and Latin to me!
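
What "index 1" means can be sketched as follows: the ifOperStatus column OID quoted above identifies the column, and the last number appended to it selects one interface row. This is only an illustration; the helper names are made up, and the status names are the ones RFC 1213 defines for ifOperStatus. On that reading, "SNMP CRITICAL - down(2)" may simply mean that the interface at index 1 reports itself as down (for example, nothing plugged into that port), rather than that the query failed.

```python
# ifOperStatus column OID from RFC1213-MIB, as quoted in the reply above.
IF_OPER_STATUS = "1.3.6.1.2.1.2.2.1.8"

# Status values defined for ifOperStatus in RFC 1213.
STATUS_NAMES = {1: "up", 2: "down", 3: "testing"}


def oper_status_oid(if_index: int) -> str:
    """Return the numeric OID for ifOperStatus of one interface row."""
    return f"{IF_OPER_STATUS}.{if_index}"


def describe(value: int) -> str:
    """Render a raw ifOperStatus integer as name(value)."""
    return f"{STATUS_NAMES.get(value, 'unknown')}({value})"


if __name__ == "__main__":
    print(oper_status_oid(1))  # the OID to grep for when looking at index 1
    print(describe(2))         # how a "down" port would be rendered
```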

Actually I am not even sure what I should be monitoring on a switch. I was
just using the example from the nagios tutorial for now. Maybe its
alive/dead status; bandwidth of individual ports (but that's mrtg's job,
right?); dropped packets; some thermal events? How does one go about this?
What are other users monitoring on their switches, and how does one go about
translating the fairly cryptic SNMP fields into something usable? Should I
dig into my Dell switch manuals? Or is this reinventing the wheel, and Nagios
has an automated way to achieve this already?





 If you are just running this for one port on one switch, then loading
 the MIB is no biggie, if you plan to monitor hundreds or thousands of
 ports, would be better to use the numeric form of the OID and run an
 ePN plugin using the perl Net::SNMP or NSNMP library or a plugin that
 implements the C Net-SNMP libraries directly


The maximum I'll end up monitoring is perhaps 4 switches with 48 ports each.
So from your stats this should be on the fairly low side.

-- 
Rahul

[Nagios-users] SNMP monitoring of a Dell switch: snmpwalk succeeds but check_snmp fails.

2009-01-29 Thread Rahul Nabar
I was trying to monitor my Dell Power Connect Switch via nagios. I used the
default templates and have this check_command in my switch.cfg:
check_command   check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB

Unfortunately the web-interface shows: SNMP CRITICAL - *down(2)* 

Now I tried a naked SNMP query on this switch:

snmpwalk -v1 -c public switch3 -m ALL .1 > /tmp/switch3.snmp.log

The switch does respond with reams of output! But I am not really sure what
I should be looking for in there. I tried grepping on  ifOperStatus.1 but
that is not to be found.

Any other suggestions how to monitor this recalcitrant Dell switch?

-- 
Rahul

Re: [Nagios-users] Ways and tweaks to make nagios more efficient. load average on monitoring host edging up.

2009-01-28 Thread Rahul Nabar
On Tue, Jan 27, 2009 at 5:58 PM, Jake jakepau...@gmail.com wrote:


 I use ping as both a service check and a host check because i want to ping
 all of the time to measure latency, etc. I wouldn't think so much about
 eliminating service checks that aren't directly redundant as much as making
 sure the checks you do are as fast as possible.


Thanks Jake! I'll heed the advice. I wasn't sure about what are the parts
best worth tackling to gain efficiency.



 Specifically, look for any service check that takes longer than a second.


Is there a place where it logs how long a service check took? How do you
usually find out? I can only see when it was last checked on my interface
but not how long it took.
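
One way to look for slow checks can be sketched like this, assuming the Nagios status file (status.dat, wherever status_file in nagios.cfg points) records a check_execution_time field in each servicestatus block. The function name and the one-second default threshold are illustrative only, and the field name should be verified against your own status file before relying on it.

```python
import re


def slow_services(status_text: str, threshold: float = 1.0):
    """Return (host, service, seconds) for checks slower than threshold seconds."""
    results = []
    # Each service is assumed to appear as a "servicestatus { ... }" block
    # made of key=value lines.
    for block in re.findall(r"servicestatus\s*\{(.*?)\n\s*\}", status_text, re.S):
        fields = dict(
            line.strip().split("=", 1)
            for line in block.splitlines()
            if "=" in line
        )
        seconds = float(fields.get("check_execution_time", 0))
        if seconds > threshold:
            results.append(
                (fields.get("host_name"), fields.get("service_description"), seconds)
            )
    # Slowest checks first.
    return sorted(results, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    # The path below is an assumption based on a default source install.
    with open("/usr/local/nagios/var/status.dat") as handle:
        for host, service, seconds in slow_services(handle.read()):
            print(f"{seconds:8.3f}s  {host}  {service}")
```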


 Also make sure your timeouts are set low as this can easily be a source for
 high load averages - e.g. if you consider 500ms latency on the ping service
 to be critical then why not set your timeout value to one or two seconds
 instead of 10 (which is the default for check_ping).


Is service_check_timeout=60 in the main config file the timeout that you
are talking about? I might be mistaking what you mean.

Shouldn't this matter only for the nodes that *do* have a latency problem?
I hope these will remain a minor fraction. The major chunk will be the ones
that respond within the timeout but still add up to a *lot* of work. How
does it work out that the timeout made such a huge difference for you?




 That single change for check_ping made a huge difference for me and that
 was before I started even looking at other services like my
 check_dell-hardware and check_hp-hardware which were awfully slow prior to
 rewriting them (now available on nagiosexchange.)


 --
 Jake Paulus
 jakepau...@gmail.com


Re: [Nagios-users] Ways and tweaks to make nagios more efficient. load average on monitoring host edging up.

2009-01-28 Thread Rahul Nabar
On Wed, Jan 28, 2009 at 1:32 AM, Kyle O'Donnell kyleodonn...@gmail.com wrote:

 I use service deps.  Most of my services are nrpe checks and I create
 a dep on nrpe.  If a check comes back critical (or which ever state
 you choose to execute the dep) it does an nrpe check,  if nrpe returns
 critical (or whichever state you choose) it stops executing the
 services dependant on nrpe.

 My load is less than 2 on a machine with 800 hosts and 6000 services.

 Active host checks are disabled.


So, you have no active checks at all? Or just no active host checks? I am a
bit confused. All my checks are active. How does one disable active host
checks? And then when will the host check be done at all?



 As for ping I don't check as a service only a host check which gets
 executed if any service turns critical.


That might be the exact functionality I was thinking of. If I look under
Host Status Details for all host groups I see very recent and regular
checks being done on all my hosts under the column for Last Check. Even
ones that do not have any services critical.

Or will I only see the behavior you describe after I somehow disable active
host checks?



 You can use check_ssh as the host check command instead of ping if you
 prefer as well.


Good idea. But I still want ping to fall back on. If ssh fails only then
ping. Is that logical?

-- 
Rahul

Re: [Nagios-users] Ways and tweaks to make nagios more efficient. load average on monitoring host edging up.

2009-01-28 Thread Rahul Nabar
On Tue, Jan 27, 2009 at 6:04 PM, Mathieu Gagné mga...@iweb.com wrote:

 We have +2000 hosts and +4700 services configured on one of our Nagios
 instances. Load average is between 1.3 and 2.0 which I find acceptable.


Wow. That's way bigger than what I have. Mine's a cluster of 256 machines
and around 6 services checked on each. I have an advantage that most are on
a local LAN so no internet connectivity issues and external bandwidth
bottlenecks.


 The SSH service state can be CRITICAL while all the other services are
 still OK. (ie. ssh server misconfiguration) You probably want to be informed
 about it too.


True. But if SSH is down will NRPE still work? Or are they totally
independent?


 What kind of server are you using?


Intel(R) Xeon(TM) CPU 2.80GHz dual core. 2 GB RAM



 Also, what's the check_interval? A 1 minute interval might put the server
 on its knees since it would be scheduling and executing 1536 checks per
 minute. (as per your information)


nagios.cfg:
    command_check_interval=-1
services.cfg:
    normal_check_interval   5
    retry_check_interval    1
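
The arithmetic behind the "1536 checks per minute" figure quoted above, and what the 5-minute interval changes, can be sketched as follows. The host and service counts are the ones given in this thread; the function name is illustrative, and the model ignores retries and host checks.

```python
def checks_per_minute(hosts: int, services_per_host: int, interval_minutes: float) -> float:
    """Average number of service checks scheduled per minute."""
    return hosts * services_per_host / interval_minutes


if __name__ == "__main__":
    hosts, services = 256, 6
    for interval in (1, 5):
        rate = checks_per_minute(hosts, services, interval)
        print(f"{interval}-minute interval: {rate:.1f} checks/min, {rate / 60:.2f} checks/s")
```

With a normal_check_interval of 5, the same 1536 services average out to roughly 307 checks per minute, or about 5 per second.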


-- 
Rahul

Re: [Nagios-users] Ways and tweaks to make nagios more efficient. load average on monitoring host edging up.

2009-01-28 Thread Rahul Nabar
On Wed, Jan 28, 2009 at 12:32 PM, Marc Powell m...@ena.com wrote:



 Just out of curiosity, what is the magnitude of the 'edging upwards'
 that you are seeing?


It's not bad right now. But the trend is what I am wary about. My load
averages are around 3. But I am still planning on adding more hosts and
services, and I thought it best to investigate early on whether I was doing
things efficiently before it came to a critical point.


 Just about any hardware released in the past 5
 years or so should have no problems with that number of checks at all
 if they're 'normal' (base nagios-plugins) and run at a normal interval
 (5 min). Even older hardware could probably do it. What are the types
 of checks you are performing? How often?


Intel(R) Xeon(TM) CPU 2.80GHz dual core. 2 GB RAM
It's about 5 years old now, I think.

Checks I have are:

check_ssh
check_ping

NRPE
check_nrpe!check_load
check_nrpe!check_total_procs
check_nrpe!check_disk
check_nrpe!check_disk_scratch
check_nrpe!check_pbsmom
check_nrpe!check_time_node


 Are they perl checks and do
 you have the embedded perl interpreter (ePN) enabled?


I don't think I have that enabled. It is disabled by default I think at
compile-time.

-- 
Rahul

Re: [Nagios-users] Ways and tweaks to make nagios more efficient. load average on monitoring host edging up.

2009-01-28 Thread Rahul Nabar
On Wed, Jan 28, 2009 at 4:34 PM, Marc Powell m...@ena.com wrote:


 On Jan 28, 2009, at 2:21 PM, Rahul Nabar wrote:

  Intel(R) Xeon(TM) CPU 2.80GHz dual core. 2 GB RAM
  Its about 5 years old now I think.


A minor correction. Mine is just a hyperthreaded machine. I don't think it
is two real cores. But still shows up as twin cpus. In case it matters.

-- 
Rahul

Re: [Nagios-users] Ways and tweaks to make nagios more efficient. load average on monitoring host edging up.

2009-01-28 Thread Rahul Nabar
On Wed, Jan 28, 2009 at 5:07 PM, Mathieu Gagné mga...@iweb.com wrote:



 According to cpubenchmark.net, my el cheapo CPU is better than yours:

 Intel Xeon 2.80GHz
 Score: 495
 Rank: 281
 Link: http://www.cpubenchmark.net/cpu_lookup.php?cpu=Intel+Xeon+2.80GHz

 Intel Core2 4300 @ 1.80GHz
 Score: 983
 Rank: 170
 Link:
 http://www.cpubenchmark.net/cpu_lookup.php?cpu=Intel+Core2+4300+%40+1.80GHz


 Xeon isn't always better. Sorry. :-(


Haha! I guess I have to live with that for now! Too bad!

-- 
Rahul

[Nagios-users] Ways and tweaks to make nagios more efficient. load average on monitoring host edging up.

2009-01-27 Thread Rahul Nabar
I set up my nagios system to monitor 256-odd nodes, each with about 6
services (direct and NRPE). It is working fine but my load averages have
started edging upwards. Not critical yet, but I wanted some tips to make
things more efficient and see if there are things I might have done
inefficiently.

One of the points I identified is this: I am doing a ping and ssh check on
each server. This seems redundant. Is there a way to set it up so that it
does an ssh check first (if this succeeds, obviously ping is OK) and, only if
that fails, does a ping check and reports on that?


How about the other way around too? I have a bunch of NRPE checks:
load_average, total-processes, scratch and home dir usage, pbs_mom,
ntp_time. If ssh fails then there is obviously no reason to try these other
checks right? But I think the monitoring_host wastes its cycles still trying
them (based on the Last Check time)
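
The "skip the NRPE checks when the master check is down" idea is what Nagios service dependencies are for, as Kyle's reply in this thread describes with a dependency on an nrpe check. A minimal sketch that renders one servicedependency definition per dependent service is below. The service descriptions ("SSH", "Load", "Total Processes") are hypothetical placeholders that must match your own service_description values, and whether to depend on ssh or on an nrpe check is a design choice this sketch does not settle.

```python
def service_dependency(host: str, master: str, dependent: str, criteria: str = "c") -> str:
    """Render a servicedependency object: `dependent` is not executed
    while `master` on the same host is in one of the `criteria` states."""
    return "\n".join([
        "define servicedependency{",
        f"    host_name                       {host}",
        f"    service_description             {master}",
        f"    dependent_host_name             {host}",
        f"    dependent_service_description   {dependent}",
        f"    execution_failure_criteria      {criteria}",
        "}",
    ])


if __name__ == "__main__":
    # Hypothetical service descriptions; replace with the ones in services.cfg.
    for dependent in ("Load", "Total Processes"):
        print(service_dependency("star177", "SSH", dependent))
        print()
```

Here "c" limits the suppression to the CRITICAL state of the master service; the other letters accepted by execution_failure_criteria are listed in the object definition documentation.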

Any tips on how I can achieve these efficiency tweaks? Or is there a problem in
my strategy? Any other performance tweaks so that I can squeeze out every ounce
of Nagios performance?

Already I am using NRPE rather than check_by_ssh since I was told the
latter might be inefficient in terms of load on the monitoring host.

-- 
Rahul

[Nagios-users] NRPE installation fails since check_nrpe plugin is not found in the libexec directory

2009-01-26 Thread Rahul Nabar
I could successfully install nagios-plugins-1.4.13.tar.gz  but when I try
installing nrpe-2.8 the make install-plugin step fails.

It does not seem to find the check_nrpe file in the libexec directory. All
the other check_* files seem to be present but not this one!

Any suggestions as to what I could be messing up? I've included the relevant error below.

-- 
Rahul


[r...@star177 nrpe-2.8]# make install-plugin
cd ./src/ && make install-plugin
make[1]: Entering directory
`/usr/local/src/nagios_nodes/downloads/nrpe-2.8/src'
/usr/bin/install -c -m 775 -o nagios -g nagios -d /usr/local/nagios/libexec
/usr/bin/install -c -m 775 -o nagios -g nagios check_nrpe
/usr/local/nagios/libexec
/usr/bin/install: cannot stat `check_nrpe': No such file or directory
make[1]: *** [install-plugin] Error 1
make[1]: Leaving directory
`/usr/local/src/nagios_nodes/downloads/nrpe-2.8/src'
make: *** [install-plugin] Error 2

Re: [Nagios-users] NRPE installation fails since check_nrpe plugin is not found in the libexec directory

2009-01-26 Thread Rahul Nabar
On Mon, Jan 26, 2009 at 1:38 PM, Andy Shellam andy-li...@networkmail.eu wrote:

 Hi Rahul,

 The error it's returning suggests that check_nrpe is not in the src
 subdirectory of the current directory
 (/usr/local/src/nagios_nodes/downloads/nrpe-2.8/src.)

 It looks as if the plugin has not been built.  Do you in fact have
 check_nrpe in the above mentioned directory?

 What was the final output from make ?  I'm guessing it threw an error.


Thanks again Andy! I don't seem to have check_nrpe, neither here nor
in /usr/local/nagios/libexec. All the other plugins seem to be present there,
but not this one.

You are right. make did throw errors (I got carried away by my previous
successful installs and did not notice!). A snippet is attached below, but it
seems to be a bunch of SSL errors that prevent it from compiling check_nrpe.

I traced back further and checked config.log for the nagios-plugins-1.4.13
and it shows:

configure:24258: WARNING: OpenSSL or GnuTLS libs could not be found or were
disabled

I'm not sure why though! yum info openssl.i686 openssl-devel.i386 shows
both of those packages as installed.

Any other suggestions?

-- 
Rahul

/
/usr/include/openssl/bn.h:287: error: expected specifier-qualifier-list
before ‘BN_ULONG’
/usr/include/openssl/bn.h:303: error: expected specifier-qualifier-list
before ‘BN_ULONG’
/usr/include/openssl/bn.h:449: error: expected ‘=’, ‘,’, ‘;’,
‘asm’ or ‘__attribute__’ before ‘BN_mod_word’
/usr/include/openssl/bn.h:450: error: expected ‘=’, ‘,’, ‘;’,
‘asm’ or ‘__attribute__’ before ‘BN_div_word’
/usr/include/openssl/bn.h:451: error: expected declaration specifiers or
‘...’ before ‘BN_ULONG’
/usr/include/openssl/bn.h:452: error: expected declaration specifiers or
‘...’ before ‘BN_ULONG’
/usr/include/openssl/bn.h:453: error: expected declaration specifiers or
‘...’ before ‘BN_ULONG’
/usr/include/openssl/bn.h:454: error: expected declaration specifiers or
‘...’ before ‘BN_ULONG’
/usr/include/openssl/bn.h:455: error: expected ‘=’, ‘,’, ‘;’,
‘asm’ or ‘__attribute__’ before ‘BN_get_word’
/usr/include/openssl/bn.h:470: error: expected declaration specifiers or
‘...’ before ‘BN_ULONG’
/usr/include/openssl/bn.h:743: error: expected ‘=’, ‘,’, ‘;’,
‘asm’ or ‘__attribute__’ before ‘bn_mul_add_words’
/usr/include/openssl/bn.h:744: error: expected ‘=’, ‘,’, ‘;’,
‘asm’ or ‘__attribute__’ before ‘bn_mul_words’
/usr/include/openssl/bn.h:745: error: expected ‘)’ before ‘*’ token
/usr/include/openssl/bn.h:746: error: expected ‘=’, ‘,’, ‘;’,
‘asm’ or ‘__attribute__’ before ‘bn_div_words’
/usr/include/openssl/bn.h:747: error: expected ‘=’, ‘,’, ‘;’,
‘asm’ or ‘__attribute__’ before ‘bn_add_words’
/usr/include/openssl/bn.h:748: error: expected ‘=’, ‘,’, ‘;’,
‘asm’ or ‘__attribute__’ before ‘bn_sub_words’
In file included from /usr/include/openssl/ssl.h:978,
 from ../include/config.h:228,
 from ../include/common.h:24,
 from utils.c:32:
/usr/include/openssl/ssl3.h:303: error: expected specifier-qualifier-list
before ‘PQ_64BIT’
In file included from /usr/include/openssl/dtls1.h:64,
 from /usr/include/openssl/ssl.h:980,
 from ../include/config.h:228,
 from ../include/common.h:24,
 from utils.c:32:
/usr/include/openssl/pqueue.h:73: error: expected specifier-qualifier-list
before ‘PQ_64BIT’
/usr/include/openssl/pqueue.h:80: error: expected ‘)’ before
‘priority’
/usr/include/openssl/pqueue.h:89: error: expected declaration specifiers or
‘...’ before ‘PQ_64BIT’
In file included from /usr/include/openssl/ssl.h:980,
 from ../include/config.h:228,
 from ../include/common.h:24,
 from utils.c:32:
/usr/include/openssl/dtls1.h:92: error: expected specifier-qualifier-list
before ‘PQ_64BIT’
make[1]: *** [nrpe] Error 1
make[1]: Leaving directory
`/usr/local/src/nagios_nodes/downloads/nrpe-2.8/src'
//

Re: [Nagios-users] NRPE installation fails since check_nrpe plugin is not found in the libexec directory

2009-01-26 Thread Rahul Nabar


 Yes, check to ensure gnutls is also installed (rpm -q | grep tls) , and
 also run ldconfig -v | grep ssl and just be sure it can see the
 openssl-related *.so file(s) ;)


Thanks Jamie! I'm stumped here and am doing all the checks possible to sniff
out what my problems are!

I am not very sure, but the output below seems to indicate that we have what
we need right?

rpm -q gnutls
gnutls-1.6.3-2.fc8
gnutls-1.6.3-2.fc8

ldconfig -v | grep ssl
libssl.so.6 -> libssl.so.0.9.8b
libssl.so.6 -> libssl.so.0.9.8b
libssl3.so -> libssl3.so
libgnutls-openssl.so.13 -> libgnutls-openssl.so.13.3.0
libssl3.so -> libssl3.so
libgnutls-openssl.so.13 -> libgnutls-openssl.so.13.3.0

and just be sure it can see the openssl-related *.so file(s) ;)

ls -al /usr/lib/libgnutls-openssl.so.13.3.0
-rwxr-xr-x 1 root root 102572 2007-08-21 16:25
/usr/lib/libgnutls-openssl.so.13.3.0

Hmm..I am not sure what you mean. Is the above a sufficient check?

Feel free to shoot more suggestions at me, however unlikely! At this point I
really am grasping at straws! :-(

In case it matters I am running FC8; pretty standard. Hence I had never
expected nrpe to be so difficult to get up and running!

-- 
Rahul

Re: [Nagios-users] NRPE installation fails since check_nrpe plugin is not found in the libexec directory

2009-01-26 Thread Rahul Nabar
On Mon, Jan 26, 2009 at 3:34 PM, James Pratt jpr...@norwich.edu wrote:



 Yes all appears to be in order...  I'm not sure what to tell you there 
 The only other thing I can think of is that FC8 is just too old (?)... I
 don't think there are even updates for it anymore...


Thanks for your tips James! I don't think it is an FC8 issue. Just last week
I got nrpe running on about 175 machines, all using FC8. These were 32-bit
machines though. Today I tried extending this to our remaining machines
(64-bit) and that is when I started running into problems.

The 64-bit angle might be a red herring though. It may just be this class of
machines and have nothing to do with the 64-bit arch.






 You may want to just install FC10 or 11 (Or whatever the most recent is!) -
 I remember I once setup Nagios on FC10 or 11 and it was a breeze using RPM's
 for everything, and the Nagios site has good instructions...  (I now use
 CentOS 5.2 - fedora's release cycle is much too fast, and CentOS is as
 stable as RHEL for me...


Upgrading my OS isn't an option unfortunately. Too big a project to roll out
on 256 machines for now.

-- 
Rahul

Re: [Nagios-users] NRPE installation fails since check_nrpe plugin is not found in the libexec directory

2009-01-26 Thread Rahul Nabar
On Mon, Jan 26, 2009 at 3:34 PM, James Pratt jpr...@norwich.edu wrote:


 Yes all appears to be in order...  I'm not sure what to tell you there 
 The only other thing I can think of is that FC8 is just too old (?)... I
 don't think there are even updates for it anymore...

 You may want to just install FC10 or 11 (Or whatever the most recent is!) -
 I remember I once setup Nagios on FC10 or 11 and it was a breeze using RPM's
 for everything, and the Nagios site has good instructions...  (I now use
 CentOS 5.2 - fedora's release cycle is much too fast, and CentOS is as
 stable as RHEL for me...


Solved it! Thanks for all your help guys. Well, the 64-bit issue was a red
herring. I am still not 100% sure what my issue was, but here's what I
think:

It was all a problem with NFS-mounted drives. I have a base system, and
several of my remote hosts find their executables via NFS mounts of the
relevant dirs. I was trying to install from one such machine and that's when
I had these problems. I went back and tried on the home machine where these
NFS mounts reside and it all worked.

No idea why! The only suspicion I have is some soft-link business. As far as
I remember, these don't span NFS mounts.

-- 
Rahul

[Nagios-users] nagios service flapping

2009-01-23 Thread Rahul Nabar
I just had a bunch of services start flapping on me. The common factor
seems to be that all of these were services monitored by nrpe.

//
Notifications for this service are being suppressed because it was detected
as having been flapping between different states (22.4% change >= 20.0%
threshold). When the service state stabilizes and the flapping stops,
notifications will be re-enabled.
//

My nrpe.cfg is pristine except for

command[check_disk_scratch]=/usr/local/nagios/libexec/check_disk -w 20 -c 10 -p /scratch

What could be causing a service to start flapping? It never happened to me
before. Any debugging suggestions?

The Status for the service is correct though.
DISK OK - free space: /scratch 14886 MB (52% inode=98%):
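
How a "percent state change" figure like the 22.4% above can arise is sketched below. This is a simplified model based on the documented behaviour (Nagios looks at the last 21 check results and weights recent state changes more heavily than old ones); the exact weights of 0.75 to 1.25 and the linear interpolation are assumptions of this sketch, not a copy of the Nagios source. The point it illustrates is that a service can read OK right now and still be flagged, because only the number and recency of past state changes matter.

```python
def percent_state_change(states, low=0.75, high=1.25):
    """Weighted percent state change over a history of states (oldest first).

    Each transition between consecutive results counts as one change,
    weighted from `low` (oldest slot) up to `high` (newest slot).
    """
    slots = len(states) - 1
    if slots < 1:
        return 0.0
    weighted = 0.0
    for i in range(slots):
        if states[i] != states[i + 1]:
            # Linear interpolation of the weight across the history.
            weight = low + (high - low) * i / (slots - 1) if slots > 1 else 1.0
            weighted += weight
    return 100.0 * weighted / slots


if __name__ == "__main__":
    # Hypothetical history: mostly OK with a few recent blips (21 results).
    history = ["OK"] * 16 + ["CRITICAL", "OK", "CRITICAL", "OK", "OK"]
    print(f"{percent_state_change(history):.1f}% state change")
```

If a handful of timeouts on the NRPE side produced brief non-OK results, a history like the hypothetical one above would cross a 20% threshold even though the latest result is OK.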

-- 
Rahul

snippet from services.cfg
define service{
        use                     rpn_intermediate_service
        hostgroup_name          npre-compute-nodes
        service_description     /scratch Partition on nodes
        check_command           check_nrpe!check_disk_scratch ; details defined in the nrpe.conf
}

[Nagios-users] using NRPE to monitor pbs_mom. Error: NRPE: Unable to read output

2009-01-22 Thread Rahul Nabar
I'm a bit confused about how exactly to add stuff with NRPE to monitor local
services on my remote hosts. I got the basics out of the way and I can
already monitor the easy stuff like users, procs, swap etc.

More ambitiously, I wanted to monitor the status of my pbs_mom (Torque
Scheduler daemon) on each node in my cluster. I found the script
check_pbsmom.sh on the NagiosExchange (snippet below) and copied it to my
/usr/local/nagios/libexec.

Then I added this line to my nrpe.cfg

command[check_pbsmom]=/usr/local/nagios/libexec/check_pbsmom

But then I don't seem to have much success.
remotehost> /usr/local/nagios/libexec/check_nrpe -H localhost -c check_pbsmom
NRPE: Unable to read output

If I just run the shell script though it seems to be working
/usr/local/nagios/libexec/check_pbsmom.sh
PBS_MOM OK:  Daemon is running.  Host is listening.


What am I doing wrong here? I'm still a bit confused about the interaction
between command.cfg on the monitoring machine and the nrpe.cfg on the remote
host.

Any advice?
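
The interaction can be sketched like this: the name passed to check_nrpe with -c is looked up among the command[...] entries in nrpe.cfg on the remote host, and the path on the right-hand side is what gets executed there. If that path does not exist, there is no plugin output for NRPE to read. The helper below is a hypothetical sanity check (not part of NRPE) that parses command[...] lines and reports entries whose plugin path is missing, which would flag a command pointing at check_pbsmom when the file on disk is check_pbsmom.sh.

```python
import os
import re
import shlex

COMMAND_RE = re.compile(r"^command\[(?P<name>[^\]]+)\]\s*=\s*(?P<line>.+)$")


def parse_nrpe_commands(cfg_text: str) -> dict:
    """Map each command[name] entry in nrpe.cfg text to its command line."""
    commands = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = COMMAND_RE.match(line)
        if match:
            commands[match.group("name")] = match.group("line").strip()
    return commands


def missing_plugins(commands: dict, exists=os.path.isfile) -> dict:
    """Return {command name: plugin path} for plugins that are not on disk."""
    missing = {}
    for name, line in commands.items():
        plugin = shlex.split(line)[0]  # first token is the executable path
        if not exists(plugin):
            missing[name] = plugin
    return missing


if __name__ == "__main__":
    # Path assumed from a default source install; adjust as needed.
    with open("/usr/local/nagios/etc/nrpe.cfg") as handle:
        for name, plugin in missing_plugins(parse_nrpe_commands(handle.read())).items():
            print(f"command[{name}] points at a missing file: {plugin}")
```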

-- 
Rahul

#!/bin/bash
# SYNOPSIS
#   check_pbsmom [TCP port] [TCP port] ...
#
# DESCRIPTION
#   This NAGIOS plugin checks whether: 1) pbs_mom is running and
#   2) the host is listening on the given port(s).  If no port
#   number is specified TCP ports 15002 and 15003 are checked.
#
# AUTHOR
#   wayne.mall...@jcu.edu.au

OK=0
WARN=1
CRITICAL=2
PATH=/bin:/sbin:/usr/bin:/usr/sbin

# Default listening ports are TCP 15002 and 15003.
if [ $# -lt 1 ] ; then
  list="15002 15003"
else
  list="$*"
fi

# "ps -C" prints a header line, so fewer than 2 lines means no pbs_mom process.
if [ "$(ps -C pbs_mom | wc -l)" -lt 2 ]; then
  echo "PBS_MOM CRITICAL:  Daemon is NOT running!"
  exit $CRITICAL
else
  for port in $list ; do
    # Look for a listening TCP socket on this port.
    if [ "$(netstat -ln | grep -E "tcp.*:$port" | wc -l)" -lt 1 ]; then
      echo "PBS_MOM CRITICAL:  Host is NOT listening on TCP port $port!"
      exit $CRITICAL
    fi
  done
  echo "PBS_MOM OK:  Daemon is running.  Host is listening."
  exit $OK
fi

Re: [Nagios-users] using NRPE to monitor pbs_mom. Error: NRPE: Unable to read output

2009-01-22 Thread Rahul Nabar
On Thu, Jan 22, 2009 at 3:18 PM, Seth Simmons ssimm...@cymfony.com wrote:

  The filename you specified is check_pbsmom.sh though your command shows
 check_pbsmom

I was careless. That was exactly it! Thanks Seth. My bad. It works now.

-- 
Rahul

[Nagios-users] tweaking the order of sorting in nagios lists: numeric rather than alphabetical

2009-01-20 Thread Rahul Nabar
Is there a way to tweak the manner in which nagios sorts the names of hosts
in the Service Status Details? I have hosts named star01, star02 and so on
all the way through star256 and nagios insists on sorting these like so:

star10
star101
star102
[snip]
star119
star12

etc.

Can I make the numbers sort in a numeric order rather than a strict
alphabetical order?
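
The ordering being asked for is often called a "natural" sort. The sketch below only illustrates that ordering; it does not change how the Nagios CGIs sort, and the function name is made up. It splits each name into text and digit runs and compares the digit runs as integers.

```python
import re


def natural_key(name: str):
    """Sort key that compares digit runs numerically and the rest as text."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", name)
    ]


if __name__ == "__main__":
    hosts = ["star10", "star101", "star102", "star119", "star12", "star02"]
    print(sorted(hosts))                   # plain string order, as in the listing above
    print(sorted(hosts, key=natural_key))  # numeric order
```

A workaround that needs no code is to zero-pad every host name to the same width (star001 ... star256), since plain alphabetical order then coincides with numeric order.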

-- 
Rahul

Re: [Nagios-users] Looking for a Nagios answering service

2009-01-07 Thread Rahul Nabar
On Wed, Jan 7, 2009 at 4:54 PM, Baron Schwartz ba...@percona.com wrote:


   * watch Nagios email or SMS alerts 24/7
   * filter out obvious spam
   * response time must be on the order of minutes
   * call our on-call engineer, and once our engineer acks, the job is done.



Maybe I am missing something, but what additional service does this
intermediate company provide, again? I'm just curious. Why wouldn't an
appropriately configured notification scheme that targets your on-call
engineer directly work? Isn't that the purpose of notification policies?

-- 
Rahul

Re: [Nagios-users] nagios error: Could not stat() command file

2008-12-24 Thread Rahul Nabar
On Wed, Dec 24, 2008 at 8:41 AM, Marc Powell m...@ena.com wrote:
 list. I missed out on this one. Sorry.

 Make sure you've followed the Documentation on enabling External
 Commands. The CGI's use that functionality to send commands to nagios.
 It's disabled by default.


I must be missing out on something very basic. Here's what I have
confirmed in my nagios.cfg :

check_external_commands=1
command_check_interval=-1
command_file=/usr/local/nagios/var/rw/nagios.cmd

My permissions on the directory /usr/local/nagios/var/rw seem correct too:

drwxrwsr-x 2 nagios nagcmd  4096 Dec 22 17:06 rw

cat /etc/group shows that both the required users are a part of the
correct group:

nagcmd:x:1239:nagios,apache

locate nagios.cmd returns nothing, showing that this file is not
accidentally being created in the wrong location.

getenforce gives Disabled, so I guess it is not
damn-SELinux-once-more day yet!

At this point I am stumped again! Any other checks I am missing out on?
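
One more check that can be scripted: as far as I know, Nagios creates nagios.cmd itself as a named pipe when it starts with external commands enabled, so a missing file points at the running daemon rather than at permissions. A minimal Python sketch, assuming the command_file path from nagios.cfg above (check_command_file is a hypothetical helper, not part of Nagios):

```python
import os
import stat

def check_command_file(path):
    """Return a list of human-readable problems found for the command file."""
    problems = []
    directory = os.path.dirname(path)
    if not os.path.isdir(directory):
        problems.append("directory missing: " + directory)
        return problems
    try:
        mode = os.stat(path).st_mode
    except OSError as err:
        # This is the same failure the CGIs report as "Could not stat()".
        problems.append("cannot stat %s: %s" % (path, err.strerror))
        return problems
    if not stat.S_ISFIFO(mode):
        problems.append("not a named pipe: " + path)
    return problems

if __name__ == "__main__":
    # Path taken from the command_file setting quoted above.
    for line in check_command_file("/usr/local/nagios/var/rw/nagios.cmd") or ["looks OK"]:
        print(line)
```

An empty result only says the pipe exists; it does not prove that the web server user can write to it.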

Ian, I did peruse the list postings from last month on this topic
which is how I came up with these checks I outlined above. In case I
am still missing the relevant instruction it'd be great if you could
point me to the correct post that you might have in mind.

Thanks again guys; and I apologize if I am missing something clearly basic!
-- 
Rahul



Re: [Nagios-users] nagios error: Could not stat() command file

2008-12-24 Thread Rahul Nabar
On Wed, Dec 24, 2008 at 9:51 AM, Marc Powell m...@ena.com wrote:


 Make sure you've restarted nagios after adding these. Also check for
 errors in nagios.log.


Works! Thanks Marc. I am not sure what it was.

But earlier I was doing a /etc/init.d/nagios reload.
Now I tried a restart.

Perfect! Thanks again guys!

-- 
Rahul



[Nagios-users] re-enabling check_snmp

2008-12-24 Thread Rahul Nabar
Increasingly impressed with Nagios's capabilities, I got more ambitious
and am now trying to get it to monitor the switches on my
university research computing cluster as well. The pings work fine but
the SNMP monitoring fails. Digging deeper, I noticed that I do not have
the check_snmp command.

I think this is because when I installed Nagios two days ago I did
not have snmpwalk, snmpget, etc. installed on my system. I just did a
yum install net-snmp earlier today.

How can I now retroactively get this check_snmp functionality? Do I
have to do the ./configure, make, make install dance again on the
nagios_plugins source? That's OK, but I am afraid it would
overwrite my configs etc.
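
Before re-running the build it may be worth confirming that the SNMP tools are now visible on PATH. A small Python sketch; missing_tools is a hypothetical helper, and whether the plugins' configure script looks for exactly these two commands is an assumption on my part:

```python
import shutil

def missing_tools(names):
    """Return the subset of command names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]

# snmpget and snmpwalk are the tools mentioned above; that configure
# checks for exactly these is an assumption, not something I verified.
print(missing_tools(["snmpget", "snmpwalk"]))
```

An empty list means both commands were found; any name printed is still missing.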

What is the recommended procedure now? [I guess I was careless in
skipping over config.log in my eagerness to go ahead. I also find
that ./configure skipped some other potentially useful plugins for me,
e.g. mysql.]

I hope I am not overusing this group in my eagerness to get more done
with Nagios! I apologize in advance if I am!

-- 
Rahul



Re: [Nagios-users] Cannot add remote-linux server to my setup to be monitored

2008-12-23 Thread Rahul Nabar
On Tue, Dec 23, 2008 at 4:02 AM, Kenneth Holter kenneho@gmail.com wrote:
 Just a little side note: I don't think you need to maintain the hostgroup-
 host relationship in both the hostgroup and host definitions. Keep the
 definition in one of the two to get a cleaner code. Someone please correct
 me if I'm wrong. :)


Thanks guys! I got it working now.

Another question: I see a Critical Notification of the sort:

PROCS CRITICAL: 1217 processes with STATE = RSZDT on my localhost itself.

What is this? Any clues? I'm stumped.

-- 
Rahul



Re: [Nagios-users] Cannot add remote-linux server to my setup to be monitored

2008-12-23 Thread Rahul Nabar
On Tue, Dec 23, 2008 at 6:52 PM, Andy Shellam andy-li...@networkmail.eu wrote:

 It means you have a service check set up to check how many processes are in
 the state RSZDT (I believe these are active processes) with a critical
 threshold.
 The current number of (active?) processes on the machine is 1,217 which is
 above your critical threshold you have defined so Nagios is alerting you
 (good Nagios.)


Good Nagios indeed! It has paid back pretty quickly! Something did
indeed go wrong on my server and spawned a lot of processes in the
S state. I am looking into this now.

Glad I did not ignore the red critical flag!
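
To see which state the processes are piling up in, the state column of ps can be tallied by its first letter. A minimal Python sketch, assuming a procps-style ps where something like `ps -eo stat=` prints one state value per process; tally_states is a hypothetical helper and the sample text is made up for illustration:

```python
from collections import Counter

def tally_states(ps_output):
    """Count processes by the first letter of each state value (R, S, D, Z, T, ...)."""
    return Counter(line.strip()[0] for line in ps_output.splitlines() if line.strip())

# Illustrative sample only; real input would come from ps.
sample = "Ss\nS\nS<\nR+\nZ\n"
for state, count in tally_states(sample).most_common():
    print(state, count)
```

Only the first letter is counted because ps appends modifier characters such as s, + or < to the base state.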

-- 
Rahul



[Nagios-users] nagios error: Could not stat() command file

2008-12-23 Thread Rahul Nabar
When I try to access many of the sub-menu options under Nagios (e.g.
deactivating a service) I get the following error:

Error: Could not stat() command file '/usr/local/nagios/var/rw/nagios.cmd'!
The external command file may be missing, Nagios may not be running,
and/or Nagios may not be checking external commands.
An error occurred while attempting to commit your command for processing.

I looked in the indicated dir and it seems empty. Should there be
something in there? Does it point to a faulty Nagios install? All my
tests seemed OK.

Any suggestions?

-- 
Rahul



Re: [Nagios-users] nagios error: Could not stat() command file

2008-12-23 Thread Rahul Nabar
On Wed, Dec 24, 2008 at 1:14 AM, Ian Masters i...@acces.co.jp wrote:


 Have you Googled and checked the list archives? I answered this same
 question earlier this month.

Oh! I'll google on the Nagios list. I missed out on this one. Sorry.

-- 
Rahul



[Nagios-users] Cannot add remote-linux server to my setup to be monitored

2008-12-22 Thread Rahul Nabar
I just installed Nagios and I can monitor my localhost all right. I
tried to start with one of my remote compute-nodes but this does not
seem to work so well.

I see my new group compute-nodes on the web interface but it does
not list the remote machine I tried adding. I'm stumped as to what I
am doing wrong!

To my nagios.cfg I added this line: cfg_file=/usr/local/nagios/etc/hosts.cfg

And made a new /usr/local/nagios/etc/hosts.cfg like so:

define hostgroup{
    hostgroup_name                  compute-nodes
    alias                           compute-nodes
    members                         star01
}

define host{
    host_name                       star01
    alias                           star01
    address                         11.0.0.1
    hostgroups                      compute-nodes
    check_command                   check-host-alive
    max_check_attempts              5
    check_period                    24x7
    process_perf_data               0
    retain_nonstatus_information    0
    contact_groups                  admins
    notification_interval           30
    notification_period             24x7
    notification_options            d,u,r
}


Shouldn't this be a basic template to get me started up? What else do
I need to do? Any debug suggestions? A ping to 11.0.0.1 is successful.
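
As a debugging aid, the usual first step is Nagios's own verification run (nagios -v followed by the path to nagios.cfg). Independently of that, the object file can be cross-checked with a small script. A rough Python sketch, assuming the simple "define type{ key value }" layout used above; parse_objects and unknown_members are hypothetical helpers, and this is far less thorough than Nagios's own parser:

```python
import re

def parse_objects(text):
    """Parse 'define type { key value ... }' blocks into (type, attrs) pairs."""
    objects = []
    for kind, body in re.findall(r"define\s+(\w+)\s*\{(.*?)\}", text, re.S):
        attrs = {}
        for line in body.splitlines():
            line = line.split(";", 1)[0].strip()  # drop trailing ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            attrs[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append((kind, attrs))
    return objects

def unknown_members(objects):
    """Return hostgroup members that have no matching host definition."""
    hosts = {a["host_name"] for k, a in objects if k == "host" and "host_name" in a}
    missing = []
    for kind, attrs in objects:
        if kind == "hostgroup":
            for member in attrs.get("members", "").split(","):
                if member.strip() and member.strip() not in hosts:
                    missing.append(member.strip())
    return missing

# Made-up sample: star02 is listed as a member but never defined as a host.
sample = """
define hostgroup{
    hostgroup_name  compute-nodes
    members         star01,star02
}
define host{
    host_name       star01
    address         11.0.0.1
}
"""
print(unknown_members(parse_objects(sample)))
```

The sketch only catches group members without a host definition; it says nothing about templates, missing directives, or whether nagios.cfg actually loads the file.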

-- 
Rahul
