[Nagios-users] People search Engine!!!

2009-09-03 Thread ranjith kumar
Hi All,
 
Use the http://ranjithkumar.com/ search engine to search for people in 50 states and 
run background checks on them.
 
Regards,
Ranjith

--
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day 
trial. Simplify your report design, integration and deployment - and focus on 
what you do best, core application coding. Discover what's new with 
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] Nagio + SMS

2009-09-03 Thread Martyn
Hi Craig, after doing some research, it looks like I was wrong.

-Original Message-
From: Martyn [mailto:mar...@chetnet.co.uk] 
Sent: 02 September 2009 10:21
To: cr...@hooters-uk.com; christian.masop...@siemens.com;
nagios-users@lists.sourceforge.net
Subject: RE: [Nagios-users] Nagio + SMS

Craig, I'm not sure if this is correct, but I think that gnokii only
supports Nokia mobiles. Like I said, I'm not 100% sure.

-Original Message-
From: cr...@hooters-uk.com [mailto:cr...@hooters-uk.com]
Sent: 02 September 2009 10:15
To: christian.masop...@siemens.com; cr...@hooters-uk.com;
nagios-users@lists.sourceforge.net
Subject: RE: [Nagios-users] Nagio + SMS

Thanks for the replies. I guess we will have to look at a GSM gateway then,
as our rack manager will not allow us to have loose mobiles in the cabinets.

I have just had a quick Google for GSM modems and wondered whether any of
these would be OK with something like gnokii.

GSM Modems
http://www.rfsolutions.co.uk/acatalog/GSM_Modems.html

Looking at this:
http://www.rfsolutions.co.uk/acatalog/Maestro_Industrial_GSM_Modem.html

Thanks


--- Original Message ---
From: Masopust, Christian [mailto:christian.masop...@siemens.com]
Sent: 02/09/2009 09:55:12
To: cr...@hooters-uk.com; nagios-users@lists.sourceforge.net
Cc:
Subject: RE: [Nagios-users] Nagio + SMS

 
Hello Craig,

I'm using smstools (http://smstools.meinemullemaus.de/) with a simple
GSM modem (Siemens TC35) here and it does all I need.

Christian


--
I sense much NT in you, NT leads to Blue Screen. 
Blue Screen leads to downtime, downtime leads to suffering. NT is the path
to the darkside. 

- Unknown Unix Jedi  

 -Original Message-
 From: cr...@hooters-uk.com [mailto:cr...@hooters-uk.com]
 Sent: Wednesday, September 02, 2009 10:29 AM
 To: nagios-users@lists.sourceforge.net
 Subject: [Nagios-users] Nagio + SMS
 
 Hi all
 
 I'm looking for an SMS solution for my Nagios set-up and noticed on 
 the Nagios page that you lean towards the SMS FoxBox. This looks like 
 a nice solution, but at a cost of 850 euros it is way outside 
 my budget.
 
 What does this community use and recommend as a nice, simple and 
 cheap SMS solution?
 
 Thanks
 
 Craig
 
 



Re: [Nagios-users] Monitoring a router

2009-09-03 Thread Jim Avery
2009/9/2 David Dyer-Bennet d...@dd-b.net:
 I'm just looking to use it to filter out failure
 reports from services beyond failed network links.

If that's the case, I'd recommend implementing your checks solely
(and somewhat pedantically) from the perspective of what is useful for
Nagios reachability checks.  I would set up a host check for each
interface on the far side of the router (from the Nagios server), but
only for interfaces that have hosts behind them which Nagios is
interested in.  That way, if one router interface goes down, or if
the whole router goes down, Nagios should handle reachability properly
either way (assuming you've also set the parent relationships on the
child nodes correctly).

Cheers,

Jim



Re: [Nagios-users] nsclient++ check_nt USEDDISKSPACE Segmentation fault

2009-09-03 Thread Natxo Asenjo
On Mon, Aug 31, 2009 at 5:48 PM, Massimo Balestra massimobales...@hotmail.com wrote:

  I have a problem monitoring the USEDDISKSPACE on one drive of one of the
 windows servers.



 It is a Windows Server  2003 R2 Standard edition (Service pack 2).

 The problem occurs after I did the last Windows Update last Friday. Before
 it was working.


your update broke it :(

well, two solutions:

1. roll update back;
2. check disks with nrpe in windows:

in your nsc.ini define an nrpe handler like this one:

nrpe_CheckDriveSize=inject CheckDriveSize MinWarn=10% MinCrit=5% CheckAll FilterType=FIXED FilterType=REMOTE

and your check disk service in nagios would be something like:

check_nrpe -H $HOSTADDRESS$ -c nrpe_CheckDriveSize

it works great like this. We check *all* disks in one go.

natxo

[Nagios-users] can nagios take some pro-active actions?

2009-09-03 Thread Leonardo Carneiro
Hello everyone.

I started to play with Nagios a few days ago and I'm very excited about it. 
I have a very small setup (2 Linux servers being monitored via NRPE by a 
third Linux server) and I wrote some bash scripts to monitor some of 
the services that we run on those servers (proprietary services, 
not standard ones like ssh, apache and so on).

I know Nagios can send SMS, email and other things to warn 
administrators about problems, but can Nagios take any action to fix the 
problem, like restarting the service if it reaches a critical state, or 
restarting it if it stays critical for more than 5 minutes?

If yes, can someone point me in the direction I should go? :)

Thanks in advance, and sorry about my poor English. I'm from Brazil.
-- 

*Leonardo de Souza Carneiro*
*Veltrac - Tecnologia em Logística.*
lscarne...@veltrac.com.br
http://www.veltrac.com.br/
/Fone Com.: (43)2105-5601/
/Av. Higienópolis 1601 Ed. Eurocenter Sl. 803/
/Londrina- PR/
/Cep: 86015-010/






Re: [Nagios-users] www.ranjithkumar.com

2009-09-03 Thread Kevin Davison
Nope! Apparently just spam.


-Original Message-
From: James Pratt [mailto:jpr...@norwich.edu] 
Sent: Wednesday, September 02, 2009 11:53 AM
To: ranjith kumar; nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] www.ranjithkumar.com



 -Original Message-
 From: ranjith kumar [mailto:ranjithkodu...@yahoo.co.in]
 Sent: Wednesday, September 02, 2009 11:40 AM
 To: nagios-users@lists.sourceforge.net
 Subject: [Nagios-users] www.ranjithkumar.com
 
 Please check this link www.ranjithkumar.com
 

Super... But... Any particular reason why?  



Re: [Nagios-users] can nagios take some pro-active actions?

2009-09-03 Thread Marc Powell

On Sep 3, 2009, at 6:44 AM, Leonardo Carneiro wrote:

 I know Nagios can send sms, email and other things to warn
 administrators about problems, but can Nagios take any action to fix
 the problem, like restart the service if reach critical state, or restart
 the service if the service stays critical for more than 5 minutes?

Yes. Nagios calls them 'Event handlers'.
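
As a rough illustration of what such an event handler might look like (a minimal sketch, not taken from this thread): Nagios can pass the $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ macros to a script, which then decides whether to act. The function names, the choice of the 3rd attempt and RESTART_CMD are assumptions made for the example.

```sh
#!/bin/sh
# Sketch of a service event handler. Nagios would call it with the macros
# $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ as its three arguments.
# RESTART_CMD is a hypothetical placeholder; replace it with the real command.
RESTART_CMD="${RESTART_CMD:-echo restarting myservice}"

# Print "restart" when the handler should act, "none" otherwise.
decide_action() {
    state="$1"; state_type="$2"; attempt="$3"
    case "$state" in
        CRITICAL)
            # Act on the 3rd soft failure, or once the state has gone hard.
            if [ "$state_type" = "HARD" ]; then
                echo restart
            elif [ "$state_type" = "SOFT" ] && [ "$attempt" -eq 3 ]; then
                echo restart
            else
                echo none
            fi
            ;;
        *)
            # OK, WARNING and UNKNOWN: leave the service alone.
            echo none
            ;;
    esac
}

# Run the restart command only when decide_action says so.
handle_event() {
    if [ "$(decide_action "$1" "$2" "$3")" = "restart" ]; then
        $RESTART_CMD
    fi
    return 0
}

# Act only when invoked with the three macro arguments.
if [ "$#" -eq 3 ]; then
    handle_event "$@"
fi
```

The script would be referenced from a command definition and attached to the service through its event_handler directive. Waiting for a few soft failures before restarting gives a transient glitch time to clear on its own.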

--
Marc



[Nagios-users] Control false-positives alarms

2009-09-03 Thread Lincoln Zuljewic Silva
Good morning to all,

I was reading the Nagios documentation about service and contact
definitions and didn’t find any parameter that could be used to control
false-positive alarms.

For example: the CPU load may vary during a time period (30 minutes)
and I would like to receive an alarm message only when the
“max_check_attempts” is reached, or only if the CPU load is greater
than the critical parameter for X minutes.

I’m using Nagios v3.

Regards
--
Lincoln Zuljewic Silva
MSN: lincolnzsi...@gmail.com
Mobile: +55-11-9608-3408
URL: http://meadiciona.com/lsilva/

How often must a question be asked before it’s considered a
frequently asked question?



Re: [Nagios-users] can nagios take some pro-active actions?

2009-09-03 Thread dave stern - e-mail.pluribus.unum
OK, everyone agrees an event handler can take action to fix a problem, but
bear in mind that this comes with caveats. Effectively, a Nagios event handler
is treating a symptom; the disease goes merrily on its way. If a service stops,
WHY did it stop in the first place? Most good sysadmins would tackle the problem
from the system end to ensure that the service would never fail again.
Furthermore, let's say a service failed for a reason, e.g. out of disk space.
What good would it do to restart the service again? And if you build smarts
into the event handler to look for and fix such a condition, is that the ONLY
condition that could occur to stop this service?

Having said all this, event handlers do have their place. We in fact use them
to shut down hosts if the temperature gets too hot. You can imagine the
testing we went through before rolling out something like this.







Re: [Nagios-users] can nagios take some pro-active actions?

2009-09-03 Thread Leonardo Carneiro
Thanks to everyone. Let me explain the situation. The service in question 
is software developed by my own company. This service consumes files 
in a defined directory, generated by another program. The number of files 
in that directory is the metric I use for monitoring.

Like any software in constant development, it will eventually crash or 
freeze. When it does, the files in the directory end up accumulating. If the 
number of files crosses the threshold, the warning or critical flag is set.

We DO check why the service stopped, but the service must be up and 
running again as fast as possible, which is why we restart it. 
Later we can check what went wrong.

Some months ago I also wrote a simple bash script that monitors the number 
of files, restarts the service if necessary and logs this kind of event.

What I do not know is whether this is the best approach. Nagios gives me the 
visual tools to see in real time, on a big panel, whether everything is OK with 
my servers. So I wondered whether it can take proactive actions and whether 
this approach would be better than my simple scripts.
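
A check like the one described above could be sketched roughly as follows. This is only an illustration under assumptions: the function name, the directory argument and the thresholds are made up, and it is not the script mentioned in this message. It follows the usual plugin convention of exit code 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN.

```sh
#!/bin/sh
# Sketch of a Nagios-style check that counts files waiting in a directory.
# Usage (hypothetical): check_file_count DIR WARN CRIT
# Exit codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.

check_file_count() {
    dir="$1"; warn="$2"; crit="$3"
    if [ ! -d "$dir" ]; then
        echo "UNKNOWN - $dir is not a directory"
        return 3
    fi
    # Count regular files directly inside the directory (no subdirectories).
    # File names containing newlines would be miscounted by this simple count.
    count=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')
    if [ "$count" -ge "$crit" ]; then
        echo "CRITICAL - $count files pending in $dir"
        return 2
    elif [ "$count" -ge "$warn" ]; then
        echo "WARNING - $count files pending in $dir"
        return 1
    fi
    echo "OK - $count files pending in $dir"
    return 0
}

# Run as a plugin only when called with the three arguments.
if [ "$#" -eq 3 ]; then
    check_file_count "$@"
    exit $?
fi
```

Keeping the check free of side effects and leaving the restart to an event handler means Nagios still records every state change, which a self-healing script would hide.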

-- 

*Leonardo de Souza Carneiro*
*Veltrac - Tecnologia em Logística.*
lscarne...@veltrac.com.br
http://www.veltrac.com.br/
/Fone Com.: (43)2105-5601/
/Av. Higienópolis 1601 Ed. Eurocenter Sl. 803/
/Londrina- PR/
/Cep: 86015-010/






Re: [Nagios-users] can nagios take some pro-active actions?

2009-09-03 Thread Leonardo Carneiro
Yeah, I understand that there are some situations where an event handler 
can't effectively fix something, but after reading the documentation link you 
guys sent me, it turns out that this is EXACTLY what I'm looking for: 
check a few times, restart, check again and, if it is still down, notify the 
admin somehow.

Thanks again to everyone for your support.

Menard, Chris wrote:
 We use event_handlers EXACTLY as you describe. Let Nagios restart the service 
 immediately and THEN figure out why it stopped.

 We all agree that root cause analysis is important, but often secondary to 
 restoring service.



-- 

*Leonardo de Souza Carneiro*
*Veltrac - Tecnologia em Logística.*
lscarne...@veltrac.com.br
http://www.veltrac.com.br/
/Fone Com.: (43)2105-5601/
/Av. Higienópolis 1601 Ed. Eurocenter Sl. 803/
/Londrina- PR/
/Cep: 

Re: [Nagios-users] Control false-positives alarms

2009-09-03 Thread Morris, Patrick
Lincoln Zuljewic Silva wrote:
 Good morning to all,

 I was reading the Nagios documentation about service and contact
 definition and didn’t find any parameter that could be used to control
 false-positives alarms.

 For example: the CPU load may vary during a time period (30 minutes)
 and I would like to receive an alarm message only when the
 “max_check_attempts” is reached, or only if the CPU load is greater
 than the critical parameter for X minutes.
   

This is exactly how Nagios works by default. If your max_check_attempts 
is set to 3, for example, it will need 3 non-OK results in a row before 
it will notify.
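
To put rough numbers on this (a sketch, assuming the default interval_length of 60 seconds so that intervals are in minutes, and ignoring scheduling latency): the first non-OK result counts as attempt 1, and each further attempt follows one retry_interval later, so the delay before the first notification can be estimated. The function name is made up for illustration.

```sh
#!/bin/sh
# Estimate the minutes between the first failed check and the first
# notification, given max_check_attempts and retry_interval (in minutes).
notify_delay_minutes() {
    max_check_attempts="$1"; retry_interval="$2"
    # Attempt 1 is the first failure itself, so only the remaining
    # (max_check_attempts - 1) retries add waiting time.
    echo $(( (max_check_attempts - 1) * retry_interval ))
}

# Example: 3 attempts with a 1-minute retry interval.
# notify_delay_minutes 3 1   # prints 2
```

For the CPU load case in the question, raising max_check_attempts or retry_interval on that service lengthens the time the load has to stay above the critical threshold before anyone is paged.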



Re: [Nagios-users] Control false-positives alarms

2009-09-03 Thread Marc Powell

On Sep 3, 2009, at 8:09 AM, Lincoln Zuljewic Silva wrote:

 For example: the CPU load may vary during a time period (30 minutes)
 and I would like to receive an alarm message only when the
 “max_check_attempts” is reached, or only if the CPU load is greater
 than the critical parameter for X minutes.

That's what nagios does. Unless you've set 'is_volatile', nagios will  
only send an alert when max_check_attempts is reached and then again  
every notification_interval.

--
Marc




[Nagios-users] AIX 6.1 Binaries for nsca, nrpe, nagios-plugins available

2009-09-03 Thread Kyle O'Donnell
I've compiled the binaries and uploaded them to the
nagios^H^H^H^H^H^Hmonitoringexchange:

http://www.monitoringexchange.org/cgi-bin/page.cgi?g=Detailed%2F3192.html;d=1

Compiled binaries for AIX 6.1

NRPE 2.12
./configure --prefix=/opt/nagios --enable-command-args --without-ssl

NSCA 2.7.2
./configure --prefix=/opt/nagios --without-mcrypt

check_logfiles 3.0:
./configure --prefix=/opt/nagios --with-seekfiles-dir=/var/tmp \
  --with-protocols-dir=/var/tmp

nagios-plugins-trunk-200909021200
./configure --prefix=/opt/nagios --enable-perl-modules \
  --with-ps-command="/usr/sysv/bin/ps -eo 's uid pid ppid vsz rss pcpu etime comm args'" \
  --with-ps-format="%s %d %d %d %d %d %f %s %s %n" \
  --with-ps-cols=10 \
  --with-ps-varlist=procstat,procuid,procpid,procppid,procvsz,procrss,procpcpu,procetime,procprog,pos

ENJOY!



Re: [Nagios-users] can nagios take some pro-active actions?

2009-09-03 Thread Leandro Quibem Magnabosco
Olá Leonardo,

Please note that Nagios mostly uses scripts to check services, disks, etc.,
and that it is those scripts that tell Nagios the status of the
service, daemon, disk, etc.
That said, I think you should not rely on Nagios itself to be proactive,
since its scripts can be used for that.

Let's say you have check_http configured to check www.example.com.
It would connect to www.example.com on port 80 and report whether or not
it succeeds in sending a command.
If something goes wrong, it would send a critical message back.
This does not mean that the script is necessarily alerting Nagios about
the problem; it is alerting whatever called it in the first place.

What I mean is, you don't *need* Nagios to be in the middle of it and
(IMO) you should not try to integrate it into this kind of solution,
because it would just make things more complicated.

One simple way to implement that would be to improve the scripts that
come with nagios-plugins.
A "simple" if statement and some coding after it would do the trick,
since the script already has the capability to check the status of
something, is aware of the present status, and can take active measures.

Modify the "function" that prints the message "CRITICAL" so that it
also runs, e.g., "ssh -T host /etc/init.d/apache2 restart".

This email might be a little confusing because I'm not thinking in proper
English this morning, but I would be happy to help you in our shared
mother tongue (Portuguese) if you prefer.

Good luck for now,




  

  
  
  
  
  
Leandro Quibem Magnabosco
Consultor de TI
(48) 3251-5323
leandro.magnabo...@fcdl-sc.org.br
www.fcdl-sc.org.br
Rua: Rafael Bandeira, 41
CEP. 88015-450 Florianópolis - SC



Leonardo Carneiro wrote:

  hello everyone.

I started to play with Nagios a few days ago and I'm very excited about it.
I have a very small setup (2 Linux servers being monitored via NRPE by a
third Linux server) and I wrote some bash scripts to monitor some of
the services that we run on those servers (proprietary services,
not standard ones like ssh, apache and that stuff).

I know Nagios can send SMS, email and other things to warn
administrators about problems, but can Nagios take any action to fix the
problem, like restarting the service if it reaches a critical state, or
restarting the service if it stays critical for more than 5 minutes?

If yes, can someone just point me in the direction I should go? :)

Thanks in advance, and sorry about my poor English. I'm from Brazil.
  




Re: [Nagios-users] can nagios take some pro-active actions?

2009-09-03 Thread Menard, Chris
We use event_handlers EXACTLY as you describe. Let Nagios restart the service
immediately and THEN figure out why it stopped.

We all agree that root cause analysis is important, but it is often secondary to
restoring service.


-Original Message-
From: Leonardo Carneiro [mailto:lscarne...@veltrac.com.br] 
Sent: Thursday, September 03, 2009 10:04 AM
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] can nagios take some pro-active actions?

Thanks to everyone. Let me explain the situation. The service in question
is software developed by my own company. This service consumes files
in a defined directory, generated by another program. The number of files
waiting in that directory is the metric I use for monitoring.

Like any software in constant development, it will eventually crash or
freeze. When that happens, the files in the directory accumulate. If the
number of files crosses the threshold, the warning or critical state is set.

We DO check why the service stopped, but the service must be up and
running as fast as possible, which is why we restart the service.
Later we can check what went wrong.

Some months ago I also wrote a simple bash script that monitors the number
of files, restarts the service if necessary and logs this kind of event.

What I do not know is whether this is the best approach. Nagios gives me the
visual tools to see in real time on a big panel whether everything is OK with
my servers. So I wondered whether it can take proactive actions and whether
this approach would be better than my simple scripts.
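
As an illustration of the metric described above (files piling up in a directory), a check along these lines could be sketched as follows. This is only a sketch under assumptions: the function name check_file_count and the output wording are made up here, and -maxdepth requires GNU or BSD find. The exit statuses follow the usual plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Hypothetical plugin body: count the regular files directly inside a
# directory and compare the count with warning and critical thresholds.
# Usage: check_file_count DIR WARN CRIT

check_file_count() {
    dir=$1
    warn=$2
    crit=$3

    if [ ! -d "$dir" ]; then
        echo "FILES UNKNOWN - $dir is not a directory"
        return 3
    fi

    # Count files in DIR itself, not in subdirectories.
    # tr strips the padding some wc implementations add.
    count=$(find "$dir" -maxdepth 1 -type f | wc -l | tr -d ' ')

    if [ "$count" -ge "$crit" ]; then
        echo "FILES CRITICAL - $count files in $dir|files=$count;$warn;$crit"
        return 2
    elif [ "$count" -ge "$warn" ]; then
        echo "FILES WARNING - $count files in $dir|files=$count;$warn;$crit"
        return 1
    fi

    echo "FILES OK - $count files in $dir|files=$count;$warn;$crit"
    return 0
}
```

A script file that ends by calling the function with its own arguments and exiting with the function's status could then be run through NRPE like any other plugin.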

dave stern - e-mail.pluribus.unum wrote:
 Ok, everyone agrees an event handler can take action to fix a problem, but
 bear in mind that this comes with caveats. Effectively, a Nagios event
 handler is treating a symptom; the disease goes merrily on its way. If a
 service stops, WHY did it stop in the first place? Most good sysadmins would
 tackle the problem from the system end to ensure that the service would never
 fail again. Furthermore, let's say a service failed for a reason, e.g. out of
 disk space. What good would it do to restart the service again? And if you
 build smarts into the event handler to look for and fix such a condition, is
 that the ONLY condition that could occur to stop this service?

 Having said all this, event handlers do have their place. We in fact use them
 to shut down hosts if the temperature gets too hot. You can imagine the
 testing we went through before rolling out something like this.




 

   

-- 

*Leonardo de Souza Carneiro*
*Veltrac - Tecnologia em Logística.*
lscarne...@veltrac.com.br mailto:lscarne...@veltrac.com.br
http://www.veltrac.com.br http://www.veltrac.com.br/
/Fone Com.: (43)2105-5601/
/Av. Higienópolis 1601 Ed. Eurocenter Sl. 803/
/Londrina- PR/
/Cep: 86015-010/





Re: [Nagios-users] can nagios take some pro-active actions?

2009-09-03 Thread Menard, Chris
Using Nagios event_handlers provides a couple of benefits.

First, you can configure when to restart or correct the service.  You can wait
for the 2nd or 3rd SOFT non-OK state before performing the corrective action.
This takes false positives into account (as described in another thread today)
and avoids corrective actions when they are not needed.

Second, by using Nagios event_handlers, you have a record of all actions in one
place, and visibility of the issue.
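
The "wait for the 2nd or 3rd SOFT state" logic can be sketched roughly like this. It is a hedged illustration, not anyone's production handler: the function name event_action and the optional fourth argument are assumptions. The three main arguments correspond to the Nagios macros $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$, which would be passed on the handler's command_line.

```shell
#!/bin/sh
# Hypothetical decision logic for a service event handler.
# Usage: event_action STATE STATETYPE ATTEMPT [SOFT_ATTEMPT]
# Prints "restart" when a corrective action should run, else "none".

event_action() {
    state=$1              # OK, WARNING, UNKNOWN or CRITICAL
    state_type=$2         # SOFT or HARD
    attempt=$3            # current check attempt number
    soft_attempt=${4:-3}  # which SOFT attempt triggers the action

    case "$state" in
        CRITICAL)
            case "$state_type" in
                SOFT)
                    # Ignore the first failures; they may be false positives.
                    if [ "$attempt" -eq "$soft_attempt" ]; then
                        echo restart
                    else
                        echo none
                    fi
                    ;;
                HARD)
                    # The problem reached a HARD state without being fixed.
                    echo restart
                    ;;
                *)
                    echo none
                    ;;
            esac
            ;;
        *)
            # OK, WARNING and UNKNOWN need no corrective action here.
            echo none
            ;;
    esac
}
```

For the SOFT branch to be reached, max_check_attempts on the service has to be larger than the chosen attempt number, because the last attempt is reported as HARD. A real handler would replace the echo with the actual restart command and log what it did.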


Re: [Nagios-users] nsclient++ check_nt USEDDISKSPACE Segmentation fault

2009-09-03 Thread Massimo Balestra
 

Hi 

 

Thank you for the answer.

 

I don't think the problem is the windows update. I think that there is a
problem that showed up only after the windows update.

 

And the problem is: accessing the drive C from a service I get this error:

Could not get free space for: c: c: reason: 32: The process cannot access
the file because it is being used by another process.

 

In other words: a permission denied.

 

And, if this is true, I won't solve it by setting up another way to perform my
check.

Is there anybody who has already had this problem and can help me solve it? I
have already tried running the nsclient++ service as the system account and as
administrator (with its password). Same result.

 

Thank you.

Massimo

 

 

From: Natxo Asenjo [mailto:natxo.ase...@gmail.com] 
Sent: Thursday, September 03, 2009 4:25 AM
To: Nagios Users Mailinglist
Subject: Re: [Nagios-users] nsclient++ check_nt USEDDISKSPACE Segmentation
fault

 

On Mon, Aug 31, 2009 at 5:48 PM, Massimo Balestra
massimobales...@hotmail.com wrote:

I have a problem monitoring the USEDDISKSPACE on one drive of one of the
windows servers.

 

It is a Windows Server  2003 R2 Standard edition (Service pack 2).

The problem occurs after I did the last Windows Update last Friday. Before
it was working.

 

your update broke it :(

 

well, two solutions:

 

1. roll update back;

2. check disks with nrpe in windows:

 

in your nsc.ini define an nrpe handler like this one:

 

nrpe_CheckDriveSize=inject CheckDriveSize MinWarn=10% MinCrit=5% CheckAll
FilterType=FIXED FilterType=REMOTE

 

and your check disk service in nagios would be something like:

 

check_nrpe -H $HOSTADDRESS$ -c nrpe_CheckDriveSize

 

it works great like this. We check *all* disks in one go.

 

natxo


[Nagios-users] NRPE gives wrong exit codes

2009-09-03 Thread Ciro Iriarte
Hi, I'm trying to monitor a Solaris box using NRPE. The thing is that
Nagios always sees EXIT_CODE=0.

Running the check by hand on the Solaris host works as expected:

--
[solaris ~]$ /usr/local/nagios/libexec/check_disk -w 50% -c 10% -p /kml_
DISK WARNING - free space: /kml_inst2 101125 MB (28% inode=100%);|
/kml_inst2=248599MB;174862;314752;0;349725
[solaris ~]$ echo $?
1
--

But running it from the nagios host I get:
-
spmon:~ # /usr/lib/nagios/plugins/check_nrpe -H solaris -c check_disk
-a 50% 10% /test
DISK WARNING - free space: /test 101125 MB (28% inode=100%);|
/test=248599MB;174862;314752;0;349725
spmon:~ # echo $?
0
-

Versions:

check_nrpe
--
NRPE Plugin for Nagios
Copyright (c) 1999-2008 Ethan Galstad (nag...@nagios.org)
Version: 2.12
Last Modified: 03-10-2008
License: GPL v2 with exemptions (-l for more info)
SSL/TLS Available: Anonymous DH Mode, OpenSSL 0.9.6 or higher required
-

remote NRPE agent
-
spmon:~ # /usr/lib/nagios/plugins/check_nrpe -H solaris
NRPE v2.12


Is it a bug in the NRPE agent, or a configuration error?

Regards,

-- 
Ciro Iriarte
http://cyruspy.wordpress.com
--



[Nagios-users] service dependency problem - config check hangs

2009-09-03 Thread Terry
Hello,

When I try to check my config, it hangs on "Checking for circular host
and service dependencies".

I have this service dependency:
define servicedependency{
        hostgroup_name                  windows
        service_description             nrpe
        dependent_hostgroup_name        windows
        dependent_service_description   automatic services
        inherits_parent                 1
        execution_failure_criteria      n
        notification_failure_criteria   w,u,c
}
define hostgroup {
        hostgroup_name          windows
        alias                   windows
}
define service {
        use                     service-active
        hostgroup_name          windows
        service_description     automatic services
        check_command           check_nrpe_win_all_services!exclude=SysmonLog
}
define service {
        use                     service-active
        hostgroup_name          windows
        service_description     nrpe
        check_command           check_tcp!5666
}

Any ideas?



Re: [Nagios-users] NRPE gives wrong exit codes

2009-09-03 Thread Morris, Patrick
On Thu, 03 Sep 2009, Ciro Iriarte wrote:

 Hi, i'm trying to monitor a Solaris box using NRPE. The thing is
 nagios sees always an EXIT_CODE=0.
 
 Running the check by hand on the Solaris host works as expected:
 
 --
 [solaris ~]$ /usr/local/nagios/libexec/check_disk -w 50% -c 10% -p /kml_
 DISK WARNING - free space: /kml_inst2 101125 MB (28% inode=100%);|
 /kml_inst2=248599MB;174862;314752;0;349725
 [solaris ~]$ echo $?
 1
 --
 
 But running it from the nagios host I get:
 -
 spmon:~ # /usr/lib/nagios/plugins/check_nrpe -H solaris -c check_disk
 -a 50% 10% /test
 DISK WARNING - free space: /test 101125 MB (28% inode=100%);|
 /test=248599MB;174862;314752;0;349725
 spmon:~ # echo $?
 0
 -

How is check_disk defined in your nrpe config and is it configured to 
allow arguments?



Re: [Nagios-users] any solution which will allow me to forward Nagios alerts to Netcool?

2009-09-03 Thread Scott Xiao
 
Hi Max
Thanks for your help!
I am now able to use snmptrap to send out alerts, but it seems that
Nagios doesn't invoke the send_trap script to do the job.
I set up the same thing on a VM running Nagios and use my laptop with
Wireshark to monitor incoming SNMP traffic.
In localhost.cfg, I changed the Current Users thresholds to 1 and 2 to
trigger a warning.
I can see the warning in the Nagios UI, but the sniffer still captures
nothing on port 162 unless I send a trap manually. Am I missing anything?
I have also attached the result that the sniffer received, and my
configuration is below. Please advise, thanks!


define service{
        use                     local-service   ; Name of service template to use
        host_name               localhost
        service_description     Current Users
        event_handler           send_trap
        event_handler_enabled   1
        check_command           check_local_users!1!2
        }




commands.cfg

# 'send_trap' command definition
define command{
        command_name    send_trap
        command_line    /optnagios/libexec/send_trap 192.168.32.1 public $HOSTNAME$ $SERVICEDESC$ $SERVICESTATEID$ $SERVICEOUTPUT$
        }



[r...@nagios libexec]# more send_trap
#!/bin/sh
# Arguments:
# $1 = Management Station
# $2 = Community String
# $3 = host_name
# $4 = service_description (Description of the service)
# $5 = return_code (An integer that determines the state
# of the service check, 0=OK, 1=WARNING, 2=CRITICAL,
# 3=UNKNOWN).
# $6 = plugin_output (A text string that should be used
# as the plugin output for the service check)
#
#
echo $1 >> /var/tmp/debug
echo $2 >> /var/tmp/debug
echo $3 >> /var/tmp/debug
echo $HOSTNAME$ >> /var/tmp/debug
echo $SERVICEDESC$ >> /var/tmp/debug
echo $SERVICESTATEID$ >> /var/tmp/debug
echo $SERVICEOUTPUT$ >> /var/tmp/debug

/usr/bin/snmptrap -v 2c -c $2 $1 '' NAGIOS-NOTIFY-MIB::nSvcEvent nSvcHostname s $3 nSvcDesc s $4 nSvcStateID i $5 nSvcOutput s $6

#/usr/bin/snmptrap -v 2c -c $2 192.168.32.1 '' NAGIOS-NOTIFY-MIB::nSvcEvent nSvcHostname s $3 nSvcDesc s $4 nSvcStateID i $5 nSvcOutput s $6
[r...@nagios libexec]#


[r...@nagios libexec]# /usr/bin/snmptrap -v 2c -c public 192.168.32.1 '' NAGIOS-NOTIFY-MIB::nSvcEvent nSvcHostname s localhost nSvcDesc s "some service desc supposed to come from nagios" nSvcStateID i 0 nSvcOutput s "some service may down testing"
[r...@nagios libexec]# tail /var/tmp/debug
192.168.32.1
public
localhost
nagios.localdomain$
$
$
$
[r...@nagios libexec]#

-Original Message-
From: max.schub...@gmail.com [mailto:max.schub...@gmail.com] On Behalf
Of Max
Sent: Wednesday, August 26, 2009 12:31 PM
To: Scott Xiao
Cc: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] any solution which will allow me to forward
Nagios alerts to Netcool?

Scott,

On Wed, Aug 26, 2009 at 12:02 AM, Scott Xiaoscott_x...@nec.com.sg
wrote:

  Hi friends
 Is there any solution which will allow me to forward
 Nagios alerts to Netcool? I read looperng but not many details on how
to
 forward the alert to netcool, any advice (url /docs)?
 Thanks
 Scott

Does Netcool have a trap receiver?  If so, you can forward alerts with
SNMP.  We forward Nagios notifications to Spectrum by sending them as
traps using the NAGIOS-NOTIFY-MIB service and host events .. works
quite well.

- max


nagiosforwarding1.pcap
Description: nagiosforwarding1.pcap

Re: [Nagios-users] NRPE gives wrong exit codes

2009-09-03 Thread Ciro Iriarte
2009/9/3 Morris, Patrick patrick.mor...@hp.com:
 On Thu, 03 Sep 2009, Ciro Iriarte wrote:

 Hi, i'm trying to monitor a Solaris box using NRPE. The thing is
 nagios sees always an EXIT_CODE=0.

 Running the check by hand on the Solaris host works as expected:

 --
 [solaris ~]$ /usr/local/nagios/libexec/check_disk -w 50% -c 10% -p /kml_
 DISK WARNING - free space: /kml_inst2 101125 MB (28% inode=100%);|
 /kml_inst2=248599MB;174862;314752;0;349725
 [solaris ~]$ echo $?
 1
 --

 But running it from the nagios host I get:
 -
 spmon:~ # /usr/lib/nagios/plugins/check_nrpe -H solaris -c check_disk
 -a 50% 10% /test
 DISK WARNING - free space: /test 101125 MB (28% inode=100%);|
 /test=248599MB;174862;314752;0;349725
 spmon:~ # echo $?
 0
 -

 How is check_disk defined in your nrpe config and is it configured to
 allow arguments?


I have:


dont_blame_nrpe=1
command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c
$ARG2$ -p $ARG3$


Regards,

-- 
Ciro Iriarte
http://cyruspy.wordpress.com
--



[Nagios-users] nagios scheduling and Hard/soft state question

2009-09-03 Thread shadih rahman
All,
   According to the definition, a hard state is reached upon completing
max_check_attempts.  This particular service's status information states
otherwise.  This service check has max_check_attempts set to 3.
However, it looks like the soft state changed into a hard state without the
check being run 3 times.  Is this a bug, or am I missing something here?
Please advise.  Thanks

[09-02-2009 13:58:30] SERVICE ALERT: Host B;batteryliebert;WARNING;HARD;1;Status is a WARNING level - SNMP OID does not exist
[09-02-2009 13:56:30] SERVICE ALERT: HOSTA;batteryliebert;WARNING;SOFT;1;Status is a WARNING level - SNMP agent not responding

-- 
Cordially,
Shadhin Rahman

Re: [Nagios-users] any solution which will allow me to forward Nagios alerts to Netcool?

2009-09-03 Thread Marc Powell

On Sep 3, 2009, at 1:54 PM, Scott Xiao wrote:


 Hi Max
 Thanks for your help!
 I am now ok to use the snmptrap to send out alert. But it seems the
 nagios doesn't involk the send_trap script to do the job.

 # 'send_trap' command definition
 define command{
command_name send_trap
 command_line /optnagios/libexec/send_trap  192.168.32.1 public  
 $HOSTNAME$ $SERVICEDESC$ $SERVICESTATEID$ $SERVICEOUTPUT$
}

/optnagios/ - is that an actual paste or a typo on your part?

If your hostname, servicedesc, servicestateid or serviceoutput contain  
spaces or special characters, you'll want to quote them above.

 [r...@nagios libexec]# /usr/bin/snmptrap -v 2c -c public  
 192.168.32.1 '' [strangely line wrapped stuff removed]

Nagios doesn't run this as root, you shouldn't either. Permission  
problems not seen when testing as root are a common issue when  
integrating something new. Also note that the way you are testing is  
different than the command_line you specify above in many ways.
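
To illustrate the quoting point, here is a sketch of the script body with every argument quoted, so that a service description such as "Current Users" stays one argument. It is an assumption-laden example rather than a drop-in fix: the function wrapper and the SNMPTRAP_BIN variable are additions made here so the call can be exercised without sending a real trap. The snmptrap arguments themselves are the ones from the posted script.

```shell
#!/bin/sh
# Sketch of send_trap with quoted arguments.
# $1 = management station, $2 = community string, $3 = host name,
# $4 = service description, $5 = state ID, $6 = plugin output.
# SNMPTRAP_BIN is hypothetical; it defaults to the path used in the thread.

send_trap() {
    station=$1
    community=$2
    host=$3
    svc=$4
    state_id=$5
    output=$6

    # Each value is quoted so embedded spaces do not split it.
    "${SNMPTRAP_BIN:-/usr/bin/snmptrap}" -v 2c -c "$community" "$station" '' \
        NAGIOS-NOTIFY-MIB::nSvcEvent \
        nSvcHostname s "$host" \
        nSvcDesc s "$svc" \
        nSvcStateID i "$state_id" \
        nSvcOutput s "$output"
}
```

The same applies on the Nagios side: in the command_line, macros such as $SERVICEDESC$ and $SERVICEOUTPUT$ would be wrapped in double quotes so they arrive as $4 and $6 intact. Testing it as the nagios user rather than root, with the exact arguments Nagios would pass, keeps the test close to what the event handler really does.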

--
Marc




Re: [Nagios-users] nagios scheduling and Hard/soft state question

2009-09-03 Thread Marc Powell

On Sep 3, 2009, at 2:29 PM, shadih rahman wrote:

 All,
according to the definition hard state is reached upon completing  
 the max_check_attempt .  This particular service status information  
 is stating otherwise.  This particular service check has  
 max_check_attempt set to 3.  However it looks like soft state  
 changed into Hard with checking for 3 times.  Is this a bug or am I  
 missing something here.  Please advise on this.  Thanks

 [09-02-2009 13:58:30] SERVICE ALERT: Host  
 B;batteryliebert;WARNING;HARD;1;Status is a WARNING level - SNMP OID  
 does not exist
 [09-02-2009 13:56:30] SERVICE ALERT:  
 HOSTA;batteryliebert;WARNING;SOFT;1;Status is a WARNING level - SNMP  
 agent not responding

Do you have 'is_volatile' enabled? Please post the entire service  
definition from objects.cache if you do not as well as a few prior log  
entries for this service.
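
For reference, the basic rule the question relies on can be sketched as below. This is a deliberate simplification and only an assumption about the common case: it covers a non-OK result on an ordinary service and ignores is_volatile, recoveries and host state. The function name state_type is made up for the example.

```shell
#!/bin/sh
# Simplified sketch: expected state type for a non-OK service check result.
# Usage: state_type ATTEMPT MAX_CHECK_ATTEMPTS

state_type() {
    attempt=$1
    max_attempts=$2

    # The state stays SOFT while retries remain and becomes HARD once the
    # attempt counter reaches max_check_attempts.
    if [ "$attempt" -ge "$max_attempts" ]; then
        echo HARD
    else
        echo SOFT
    fi
}
```

Under this rule, attempt 1 of 3 would be SOFT, which is why a log entry of WARNING;HARD;1 with max_check_attempts set to 3 looks surprising and why the effective service definition is worth checking.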

--
Marc




[Nagios-users] NRPE: Unable to read output

2009-09-03 Thread Matthew Litwin
I wrote a Perl plugin that seems to work fine when I run it locally on the
remote host as the nagios user; however, when I try to execute it via NRPE I
get the old familiar, nebulous "NRPE: Unable to read output". I have
debugging on for NRPE logging and it doesn't tell me much more. (Note that
this is in reverse order, as it comes out of Splunk. IPs are obfuscated.)

Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 903583 daemon.debug] Connection from X.X.X.X closed.
Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 869297 daemon.debug] Return Code: 1, Output: NRPE: Unable to read output
Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 757686 daemon.debug] Command completed with return code 1 and output:
Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 462736 daemon.debug] Running command: /usr/loca/nagios/libexec/check_scrub_backlog.pl -w 1000 -c 2000 -u  -p  -i
Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 881351 daemon.debug] Host is asking for command 'check_scrub_backlog' to be run...
Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 385967 daemon.debug] Host address is in allowed_hosts
Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 654915 daemon.debug] Connection from X.X.X.X port 34549
Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 879649 daemon.debug] Handling the connection...

The command that it is trying to run looks correct, and the arguments are
being sent correctly. I have tested the command as the same user that NRPE is
running as, nagios. I did build in output for error messages in case there
were problems such as issues with ENV, but obviously they are not making it to
NRPE. I confirmed my output text has EOLs. What is especially weird is that
the return code makes it back correctly but the output is just "NRPE: Unable
to read output", even as logged by nrpe.

The output should look like this, which is what I get when it is run locally:
Scrub Backlog Count: 1083

Nothing unusual with that, right?

Any help appreciated, thanks.



Does anyone have any other suggestions on how I might debug this? NRPE
is working for all my other plugins.
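
One way to narrow this kind of problem down is to run the plugin with an empty environment, which is closer to what a daemon provides than an interactive login, and to look at stdout, stderr and the exit status separately. As far as I know, "NRPE: Unable to read output" usually means that nothing arrived on stdout, so an error printed to stderr would match the log above. The helper below is a hypothetical sketch (run_clean is not an NRPE tool) and assumes env and mktemp are available.

```shell
#!/bin/sh
# Hypothetical debugging helper: run a command with an empty environment
# and report its exit status, stdout and stderr separately.
# Usage: run_clean COMMAND [ARGS...]

run_clean() {
    err_file=$(mktemp) || return 3

    # env -i starts the command without any inherited variables
    # (no PATH, ORACLE_HOME, LD_LIBRARY_PATH and so on).
    status=0
    out=$(env -i "$@" 2>"$err_file") || status=$?

    echo "exit status: $status"
    echo "stdout: ${out:-<empty>}"
    echo "stderr: $(cat "$err_file")"

    rm -f "$err_file"
}
```

Running it as the nagios user, with the plugin path and arguments exactly as they appear in the NRPE command definition, should show whether the plugin only produces its output when the login environment is present.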

Re: [Nagios-users] NRPE: Unable to read output

2009-09-03 Thread Matthew Litwin
I guess I should include the plugin script:

#!/usr/bin/perl -w

#check for item count in ListingCatlogue URL

use POSIX;
use strict;
use File::Basename;
use Getopt::Long;


use vars qw(
$opt_critical
$opt_warning
$opt_username
$opt_password
$opt_instance
$opt_help
$opt_usage
$opt_version
   );

sub print_usage();
sub print_help();

#
# Version and Author info
my $author = "Matthew Litwin";
my $datemod = "September 2, 2009";
my $version = "0.1.0";
# 0.1.0 - initial cannibalization from check_lcs_update.pl

my $progname = basename($0);

# Nagios plugin exit codes (UNKNOWN is 3; an exit of -1 would reach Nagios as 255)
my %ERRORS = ('UNKNOWN'  => '3',
              'OK'       => '0',
              'WARNING'  => '1',
              'CRITICAL' => '2');

my $sqlplus = '/opt/oracle/product/10gRac/bin/sqlplus';
my $sqlfile = '/usr/local/nagios/libexec/check_scrub_backlog.sql';

Getopt::Long::Configure('bundling');
GetOptions
  (
   "c=s"   => \$opt_critical, "critical=s" => \$opt_critical,
   "w=s"   => \$opt_warning,  "warning=s"  => \$opt_warning,
   "u=s"   => \$opt_username, "username=s" => \$opt_username,
   "p=s"   => \$opt_password, "password=s" => \$opt_password,
   "i=s"   => \$opt_instance, "instance=s" => \$opt_instance,
   "h"     => \$opt_help,     "help"       => \$opt_help,
   "usage" => \$opt_usage,
   "V"     => \$opt_version,  "version"    => \$opt_version
  ) || die "Try `$progname --help' for more information.\n";

sub print_usage() {
  print "Usage: $progname -w WARNING -c CRITICAL -u username -p password -i instance\n";
  print "       $progname --help\n";
  print "       $progname --version\n";
}

sub print_help() {
  print "$progname - check item count in ListingCatlogue URL\n";
  print "Options are:\n";
  print "  -c, --critical\n";
  print "  -w, --warning\n";
  print "  -u, --username\n";
  print "  -p, --password\n";
  print "  -i, --instance\n";
  print "  -h, --help      display this help and exit\n";
  print "      --usage     display a short usage instruction\n";
  print "  -V, --version   output version information and exit\n";
}

if ($opt_help) {
  print_help();
  exit $ERRORS{'UNKNOWN'};
}

# All five options are required.
if ($opt_usage || !($opt_critical && $opt_warning && $opt_username &&
                    $opt_password && $opt_instance)) {
  print_usage();
  exit $ERRORS{'UNKNOWN'};
}

if ($opt_version) {
  print "$progname $version\n";
  print "$author, $datemod\n";
  exit $ERRORS{'UNKNOWN'};
}

if (!-x $sqlplus) {
  print "sqlplus not found or not executable at: $sqlplus\n";
  exit $ERRORS{'UNKNOWN'};
}

if (!-r $sqlfile) {
  print "sql command file not found or not readable at: $sqlfile\n";
  exit $ERRORS{'UNKNOWN'};
}
 
# This SQL request returns a row count.
my $scrubcount=`$sqlplus -S $opt_username/$opt_passwo...@$opt_instance
\...@$sqlfile`;

if ($? == 0) {
  chomp($scrubcount);
  $scrubcount =~ s/^\s+//;
  if ($scrubcount =~ m/^\d+$/) {
    print "Scrub Backlog Count: $scrubcount\n";
    # Start at OK and escalate as each threshold is reached.
    my $state;
    $state = "OK";
    if ($scrubcount >= $opt_warning) {$state = "WARNING";}
    if ($scrubcount >= $opt_critical) {$state = "CRITICAL";}
    exit $ERRORS{$state};
  } else {
    print "Output not a numeric value\n";
    exit $ERRORS{'UNKNOWN'};
  }
} else {
  print "sqlplus error: $?\n";
  exit $ERRORS{'UNKNOWN'};
}


On 9/3/09 12:48 PM, Matthew Litwin mlit...@stubhub.com wrote:

 I wrote a perl plugin that seems to work fine when I run it locally on the
 remote host as the nagios user, however when I try to execute it via NRPE I
 get the old familiar nebulous "NRPE: Unable to read output". I have debugging
 on for NRPE logging and it doesn't tell much more. (Note this is in reverse
 order as it is out of splunk. IPs are obfuscated)
 
 Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 903583
 daemon.debug] Connection from X.X.X.X closed.
 host=sjvp00dbs001.XX.com Options sourcetype=syslog Options source=udp:514
 Options
 589/3/09 7:27:01.000 PM
 Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 869297
 daemon.debug] Return Code: 1, Output: NRPE: Unable to read output
 host=sjvp00dbs001.XX.com Options sourcetype=syslog Options source=udp:514
 Options
 599/3/09 7:27:01.000 PM
 Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 757686
 daemon.debug] Command completed with return code 1 and output:
 host=sjvp00dbs001.XX.com Options sourcetype=syslog Options source=udp:514
 Options
 609/3/09 7:27:01.000 PM
 Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 462736
 daemon.debug] Running command: /usr/loca/nagios/libexec/check_scrub_backlog.pl
 -w 1000 -c 2000 -u  -p  -i 
 Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 881351
 daemon.debug] Host is asking for command 'check_scrub_backlog' to be run...
 host=sjvp00dbs001.XX.com Options sourcetype=syslog Options source=udp:514
 Options
 629/3/09 7:27:01.000 PM
 Sep  3 19:27:01 sjvp00dbs001.XX.com Sep  3 19:27:01 nrpe[12434]: [ID 385967
 daemon.debug] Host address is in allowed_hosts
 

Re: [Nagios-users] NRPE: Unable to read output

2009-09-03 Thread Matthew Litwin
Please disregard. I see the typo in the NRPE command path. So sorry!
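
For anyone hitting the same message: a mistyped path in a `command[...]` definition is easy to check for mechanically. A rough sketch, assuming a POSIX shell and an NRPE config containing `command[name]=/path/to/plugin args` lines; the function name `check_nrpe_paths` is made up for illustration, and the config location in the example is an assumption:

```shell
#!/bin/sh
# check_nrpe_paths: hypothetical helper, not shipped with NRPE.
# For each command[...] line in the given config file, report whether the
# first word of the command (the plugin path) is an executable file.
check_nrpe_paths() {
    grep '^command\[' "$1" | while IFS= read -r line; do
        name=${line#"command["}    # strip the leading "command["
        name=${name%%"]"*}         # keep the text up to the first "]"
        cmd=${line#*=}             # everything after the first "="
        path=${cmd%% *}            # first word is the plugin path
        if [ -x "$path" ]; then
            echo "OK: $name -> $path"
        else
            echo "MISSING: $name -> $path"
        fi
    done
}

# Example (config location is an assumption):
# check_nrpe_paths /usr/local/nagios/etc/nrpe.cfg
```

A path such as /usr/loca/nagios/libexec/... would show up as MISSING.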




Re: [Nagios-users] nagios scheduling and Hard/soft state question

2009-09-03 Thread shadih rahman
I don't have  is_volatile enabled.  Below I am pasting my log entries and
service definition.  Thanks


*log entries*

[1251777600] CURRENT HOST STATE: HOST A;UP;HARD;1;FPING OK - HOST A
(loss=0%, rta=0.85 ms)
[1251777600] CURRENT SERVICE STATE: HOST A;batteryliebert;OK;HARD;1;Status
is OK - GXT2-2000RT120 - STATUS NORMAL -
[1251777600] CURRENT SERVICE STATE: HOST A;capacity;OK;HARD;1;SNMP OK - 100
[1251777600] CURRENT SERVICE STATE: HOST A;output_current;OK;HARD;1;SNMP OK
- 32 0.1 RMS Amp
[1251777600] CURRENT SERVICE STATE: HOST A;search_aid;OK;HARD;1;(null)
[1251777600] CURRENT SERVICE STATE: HOST A;temp;OK;HARD;1;SNMP OK - 32
[1251914190] SERVICE ALERT: HOST A;batteryliebert;WARNING;SOFT;1;Status is a
WARNING level - SNMP agent not responding
[1251914200] HOST ALERT: HOST A;DOWN;SOFT;1;FPING CRITICAL - HOST A
(loss=100% )
[1251914310] SERVICE ALERT: HOST A;batteryliebert;WARNING;HARD;1;Status is a
WARNING level - SNMP OID does not exist
[1251914390] HOST ALERT: HOST A;DOWN;SOFT;2;FPING CRITICAL - HOST A
(loss=100% )
[1251914580] HOST ALERT: HOST A;UP;SOFT;3;FPING WARNING - HOST A
[1251914600] SERVICE ALERT: HOST A;batteryliebert;OK;HARD;1;Status is OK -
GXT2-2000RT120 - STATUS NORMAL -
[1251915210] SERVICE ALERT: HOST A;batteryliebert;WARNING;SOFT;1;Status is a
WARNING level - SNMP OID does not exist
[1251915330] SERVICE ALERT: HOST A;batteryliebert;WARNING;SOFT;2;Status is a
WARNING level - SNMP agent not responding
[1251915440] SERVICE ALERT: HOST A;batteryliebert;OK;SOFT;3;Status is OK -
GXT2-2000RT120 - STATUS NORMAL -


*service definition*

define service {
    host_name                     HOST A
    service_description           batteryliebert
    check_period                  noncritical
    check_command                 check_liebert_ups
    contact_groups                netsys
    notification_period           extended
    initial_state                 o
    check_interval                5.00
    retry_interval                2.00
    max_check_attempts            3
    is_volatile                   0
    parallelize_check             1
    active_checks_enabled         1
    passive_checks_enabled        1
    obsess_over_service           1
    event_handler_enabled         1
    low_flap_threshold            0.00
    high_flap_threshold           0.00
    flap_detection_enabled        1
    flap_detection_options        o,c
    freshness_threshold           0
    check_freshness               0
    notification_options          c,r,f
    notifications_enabled         1
    notification_interval         30.00
    first_notification_delay      0.00
    stalking_options              n
    process_perf_data             1
    failure_prediction_enabled    1
    retain_status_information     1
    retain_nonstatus_information  1
}




On Thu, Sep 3, 2009 at 4:04 PM, Marc Powell m...@ena.com wrote:


 On Sep 3, 2009, at 2:29 PM, shadih rahman wrote:

  All,
  According to the definition, a hard state is reached upon completing
  max_check_attempts. This particular service's status information
  states otherwise. This service check has max_check_attempts set to 3.
  However, it looks like the soft state changed into a hard state without
  being checked 3 times. Is this a bug or am I missing something here?
  Please advise on this. Thanks
 
  [09-02-2009 13:58:30] SERVICE ALERT: Host
  B;batteryliebert;WARNING;HARD;1;Status is a WARNING level - SNMP OID
  does not exist
  [09-02-2009 13:56:30] SERVICE ALERT:
  HOSTA;batteryliebert;WARNING;SOFT;1;Status is a WARNING level - SNMP
  agent not responding

 Do you have 'is_volatile' enabled? Please post the entire service
 definition from objects.cache if you do not as well as a few prior log
 entries for this service.

 --
 Marc







-- 
Cordially,
Shadhin Rahman

Re: [Nagios-users] nagios scheduling and Hard/soft state question

2009-09-03 Thread Marc Powell

On Sep 3, 2009, at 3:38 PM, shadih rahman wrote:

 I don't have  is_volatile enabled.  Below I am pasting my log  
 entries and service definition.  Thanks

Weren't you asking about HOST B;batteryliebert?

 log entries

 [1251914190] SERVICE ALERT: HOST A;batteryliebert;WARNING;SOFT; 
 1;Status is a WARNING level - SNMP agent not responding

Service isn't responding... gets a warning (must be the default for that
plugin?). Nagios now checks the host --

 [1251914200] HOST ALERT: HOST A;DOWN;SOFT;1;FPING CRITICAL - HOST A  
 (loss=100% )

Host is down!

 [1251914310] SERVICE ALERT: HOST A;batteryliebert;WARNING;HARD; 
 1;Status is a WARNING level - SNMP OID does not exist

If service has problem and host is down, retries aren't needed, HARD  
state results.

I haven't looked in depth at the new check logic with the introduction  
of parallel host checks to be absolutely certain but the above seems  
reasonable based on what nagios did in the past.
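
The rule described above can be written down as a small sketch. This is only a simplified reading of the explanation in this message for a service check that has returned a problem state; it is not Nagios source code, and the function name `state_type` is hypothetical:

```shell
#!/bin/sh
# state_type: hypothetical helper illustrating the rule described above.
# Applies to a service check that returned a non-OK result.
# Args: host state (UP or DOWN), current attempt, max_check_attempts
state_type() {
    host_state=$1
    attempt=$2
    max_attempts=$3
    if [ "$host_state" != "UP" ]; then
        echo HARD        # host is down: retries are skipped
    elif [ "$attempt" -ge "$max_attempts" ]; then
        echo HARD        # retries exhausted
    else
        echo SOFT        # still retrying
    fi
}

state_type UP 1 3      # first failure, host up      -> SOFT
state_type DOWN 1 3    # failure while host is down  -> HARD
state_type UP 3 3      # third failed attempt        -> HARD
```

That matches the log: the WARNING at attempt 1 goes HARD because the host check in between came back DOWN.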

--
Marc




Re: [Nagios-users] NRPE gives wrong exit codes

2009-09-03 Thread Ciro Iriarte
2009/9/3 Ciro Iriarte cyru...@gmail.com:
 2009/9/3 Morris, Patrick patrick.mor...@hp.com:
 On Thu, 03 Sep 2009, Ciro Iriarte wrote:

 Hi, i'm trying to monitor a Solaris box using NRPE. The thing is
 nagios sees always an EXIT_CODE=0.

 Running the check by hand on the Solaris host works as expected:

 --
 [solaris ~]$ /usr/local/nagios/libexec/check_disk -w 50% -c 10% -p /kml_
 DISK WARNING - free space: /kml_inst2 101125 MB (28% inode=100%);|
 /kml_inst2=248599MB;174862;314752;0;349725
 [solaris ~]$ echo $?
 1
 --

 But running it from the nagios host I get:
 -
 spmon:~ # /usr/lib/nagios/plugins/check_nrpe -H solaris -c check_disk
 -a 50% 10% /test
 DISK WARNING - free space: /test 101125 MB (28% inode=100%);|
 /test=248599MB;174862;314752;0;349725
 spmon:~ # echo $?
 0
 -

 How is check_disk defined in your nrpe config and is it configured to
 allow arguments?


 I have:

 
 dont_blame_nrpe=1
 command[check_disk]=/usr/local/nagios/libexec/check_disk -w $ARG1$ -c
 $ARG2$ -p $ARG3$
 

 Regards,


It's weird, if I restart the daemon it works, but just for the first execution.

-
spmon:/etc/nagios/objects/services #
/usr/lib/nagios/plugins/check_nrpe -H billbd2 -c check_disk -a 90% 80%
/kml_inst2
DISK CRITICAL - free space: /kml_inst2 76172 MB (21% inode=100%);|
/kml_inst2=273552MB;34972;69944;0;349725
spmon:/etc/nagios/objects/services # echo $?
2
spmon:/etc/nagios/objects/services #
/usr/lib/nagios/plugins/check_nrpe -H billbd2 -c check_disk -a 90% 80%
/kml_inst2
DISK CRITICAL - free space: /kml_inst2 76172 MB (21% inode=100%);|
/kml_inst2=273552MB;34972;69944;0;349725
spmon:/etc/nagios/objects/services # echo $?
0
--
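
To make the pattern easier to see (correct code on the first call after a restart, 0 afterwards), the same check can be run several times in a row with only the exit code printed. A small sketch, assuming a POSIX shell; the function name `repeat_check` is hypothetical:

```shell
#!/bin/sh
# repeat_check: hypothetical helper. Runs a command N times, discards its
# output, and prints the exit code of each run.
repeat_check() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        if "$@" >/dev/null 2>&1; then
            rc=0
        else
            rc=$?
        fi
        echo "run $i: exit code $rc"
        i=$((i + 1))
    done
}

# Example, using the command from the transcript above:
# repeat_check 5 /usr/lib/nagios/plugins/check_nrpe -H billbd2 -c check_disk -a 90% 80% /kml_inst2
```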

Regards,

-- 
Ciro Iriarte
http://cyruspy.wordpress.com
--



[Nagios-users] using OCP daemon

2009-09-03 Thread Yungwei Chen
Hi,
I am evaluating OCP daemon in order to improve performance of sending check 
results from one nagios machine A to another. The nagios on machine A is also 
using pnp4nagios to show graphical results.
I noticed the following in http://wiki.nagios.org/index.php/OCP_Daemon. Does 
that mean machine A will no longer be able to display the graphical results 
once OCP daemon is deployed on machine A? Thanks.
The drawbacks

 *   Can't use the perfdata files for what they are meant for without modifying 
the daemon. Although the proper fix would be to have a dedicated pipe in Nagios 
for OCHP/OCSP purposes, I might as well implement it in the daemon if there is 
requests for this (use the Talk page or email me...). This would be in the form 
of duplicate file/fifo files where everything received from Nagios pipes 
would be written there.


Re: [Nagios-users] can nagios take some pro-active actions?

2009-09-03 Thread Allan Clark


On Sep 3, 2009, at 13:18, Leandro Quibem Magnabosco leandro.magnabo...@fcdl-sc.org.br 
 wrote:



Hello Leonardo,

Please note that Nagios mostly uses scripts to check services, disks,
etc., and that it is those scripts that 'tell' Nagios the status of the
service/daemon/disk/etc.
That said, I think you should not focus on making Nagios proactive,
since its scripts could be used for that.


Let's say you have check_http configured to check www.example.com.
It would connect to www.example.com on port 80 and report whether it
succeeds in sending a command or not.

If something goes wrong, it would send a critical message back.
This does not mean that the script is necessarily alerting nagios  
about the problem, it is alerting whatever called it in the first  
place.


What I mean is, you don't *need* nagios to be in the middle of it  
and (IMO) you should not try to integrate it into this kind of  
solution because it would just make things more complicated.


One simple way to implement that would be improving the scripts that
come with nagios-plugins.

A simple if statement and some coding after it would do the trick.
The script already has the capability to check the status of
something and be aware of the present status, so it can also take active
measures.


Hook into the function that prints the CRITICAL message to
make it run, e.g., ssh -T host /etc/init.d/apache2 restart.


I have used the scripts from Nagios in a similar way: a project  
(extotest) using Nagios plugins and autotools' autotest to check all  
critical services before and after complex firewall ACL changes. It's  
similar in that it leverages the good work of many contributors but  
doesn't use the Nagios Core as an engine.


In your case, a cronjob might suffice:

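#!/bin/bash
# "-opt -opt" stands in for your real check_http options.
# Nagios plugins report status via exit code: 0 is OK, anything else is a problem.
check_http -opt -opt >/dev/null

case $? in
0)
   exit 0
   ;;
*)
   exec /etc/init.d/httpd restart
   ;;
esac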

Allan
--
http://tech.chickenandporn.com/tags/nagios


Re: [Nagios-users] using OCP daemon

2009-09-03 Thread Thomas Guyot-Sionnest

On 03/09/09 05:38 PM, Yungwei Chen wrote:
 Hi,
 
 I am evaluating OCP daemon in order to improve performance of sending
 check results from one nagios machine A to another. The nagios on
 machine A is also using pnp4nagios to show graphical results.
 
 I noticed the following in http://wiki.nagios.org/index.php/OCP_Daemon.
 Does that mean machine A will no longer be able to display the graphical
 results once OCP daemon is deployed on machine A? Thanks.
 
 *The drawbacks *
 
 * Can't use the perfdata files for what they are meant for without
   modifying the daemon. Although the proper fix would be to have a
   dedicated pipe in Nagios for OCHP/OCSP purposes, I might as well
   implement it in the daemon if there is requests for this (use the
   Talk page or email me...). This would be in the form of
   duplicate file/fifo files where everything received from Nagios
   pipes would be written there.

That's only if you use the Nagios performance data files for feeding
pnp4nagios. If you have a command defined for each service, that won't be
a problem. The same goes if you use the central nagios for performance data
processing.

If you need this feature, though, let me know and I'll add it. It's
pretty simple as long as you don't need any special file rotation
feature beyond what Nagios provides.

--
Thomas
