from:"C. Bensend"

Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread C. Bensend

> On 8/28/13 14:43, C. Bensend wrote:
>> Are you saying I just need gearmand running on the collector?
>
> Well, I assumed so. You are the only one who can really tell.
> You will need a worker on each host that should run checks. If your
> collector should not run any checks, then no worker is necessary.
>
> See http://labs.consol.de/nagios/mod-gearman/#_common_scenarios for a list
> of common setups.

OK, yes, I grok that.  I guess I would want the collector to be *able*
to run checks, if it doesn't get timely information from the pollers.
I'm assuming that's why it's even trying in the first place - it
doesn't see a result in a timely manner, so it thinks it should run
one.

Which circles back to my original question - why can't it run the
check?  Why isn't it finding what it needs to find?  The workers
are running as the nagios user, and I don't see anything that appears
pertinent in the mod_gearman_worker.conf file...  What am I missing?
Neither the gearmand.log nor the mod_gearman_worker.log files seem
to have any complaints (but I haven't bumped up the debug on them yet).
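
One way to narrow this down is to run the exact command line outside of
Nagios, as the nagios user and with an empty environment, and look at the
raw exit status. The sketch below is only an illustration: run_like_worker
is a made-up helper name, and the idea that the worker hands the command
line to /bin/sh with a sparse environment is an assumption, not something
confirmed from the mod_gearman source.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios or mod_gearman): run a check
# command line through /bin/sh with an empty environment and report the
# raw exit status, so a "command not found" shows up as a plain 127.
run_like_worker() {
    env -i /bin/sh -c "$1"
    status=$?
    echo "exit status: $status"
    return "$status"
}
```

Run as the nagios user, for example:
run_like_worker '/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1'
A status of 127 here would mean the shell could not find the command it
was given; 0 through 3 are ordinary plugin results.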

Thanks so much for your help!

Benny

-- 
"No matter how tempted I am with the prospect of unlimited power, I
will not consume any energy field bigger than my head."
  -- #22 on Peter Anspach's Evil Overlord list

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread C. Bensend


> On 8/22/13 13:51, C. Bensend wrote:
>> CRITICAL: Return code of 127 is out of bounds. Make sure the plugin
>> youre trying to run actually exists. (worker: collector.domain.org)
>
> Hi,
>
> if this is the collector host, why does it have a mod-gearman worker
> installed? If nagios would have
> run the check by itself, there would be no hint about the worker in the
> error. So it seems like there
> is a worker started on your collector host which then grabs some checks
> but isn't able to execute them.

Oh ho!  I have multiple *gearman* processes running:

ps axuww | grep gearman
gearmand  5662  0.7  0.1 404672  2496 ?        Ssl  Aug17 118:29
    /usr/sbin/gearmand -d -l /var/log/gearmand/gearmand.log
nagios    5712  0.0  0.0  38024   640 ?        Ss   Aug17   1:03
    /usr/bin/mod_gearman_worker -d
    --config=/etc/mod_gearman/mod_gearman_worker.conf
    --pidfile=/var/mod_gearman/mod_gearman_worker.pid
nagios   25919  0.0  0.1 137492  3016 ?        S    07:38   0:00
    /usr/bin/mod_gearman_worker -d
    --config=/etc/mod_gearman/mod_gearman_worker.conf
    --pidfile=/var/mod_gearman/mod_gearman_worker.pid

.. etc ..

Are you saying I just need gearmand running on the collector?  I'm
quite new to gearman, so I might have misunderstood which parts are
necessary where.  I can easily shut down the mod_gearman_worker
service, I just need to understand the consequences.

I assumed that this was a Nagios error - perhaps I just have my
gearman setup configured wrong.
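
For what it's worth, a quick way to see which gearman pieces are running
on a given box is to count them by role. This is a rough sketch, not
anything shipped with mod_gearman: count_gearman_roles is a made-up name,
and it simply matches on the two binary names that show up in the ps
output above.

```shell
#!/bin/sh
# Hypothetical helper: read "ps" style lines on stdin and count how many
# belong to the gearmand job server and how many to mod_gearman_worker.
count_gearman_roles() {
    awk '
        /mod_gearman_worker/ { workers++; next }
        /gearmand/           { servers++ }
        END { printf "gearmand=%d mod_gearman_worker=%d\n", servers, workers }
    '
}
```

Usage would be along the lines of "ps axuww | count_gearman_roles" on the
collector and on each poller. Any host that reports a non-zero worker
count is a host that can pick up checks from the queue.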

Benny



Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread C. Bensend


> Do you get many of those error messages in the logs at once, or just
> one at a time?
>
> Only one thought: what are the permissions on your $USER$ variables?
> Nagios on my systems setuid() to nonroot after startup, and if it gets
> SIGHUP to reload config, but can't read the file defining $USER*$,
> will act strangely.

Just one at a time, seemingly randomly.  A host here, a service there,
several times a day.  They always almost immediately recover, but I
don't understand why my centralized collector seems to have this issue.

Nagios runs as the nagios user, which can read the resource.cfg file
fine:

ls -ld . ; ls -l nagios-hostname.cfg resource.cfg
drwxrwx--- 6 root nagios 4096 Aug 27 16:02 .
-rw-r--r-- 1 root root   47606 Jul  1 11:18 nagios-hostname.cfg
-rw-r----- 1 root nagios  2400 Mar 19 11:25 resource.cfg
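
A slightly more direct test than eyeballing ls output is to ask the shell
whether the files are readable while actually running as the nagios user.
A minimal sketch, assuming nothing beyond the POSIX test command;
check_readable is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: report, for each file given, whether the *current*
# user can read it. Run it as the nagios user to get Nagios's view.
check_readable() {
    rc=0
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "readable: $f"
        else
            echo "NOT readable: $f"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, as the nagios user:
check_readable nagios-hostname.cfg resource.cfg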

Thanks!



Re: [Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-28 Thread C. Bensend


>I'm continuing to iron out the wrinkles with 3.5.1 and distributed
> monitoring.  I'm using mod_gearman to submit and receive events from
> two distributed pollers.
>
>Every now and again, I'll get something similar in the log on the
> centralized collecting machine:
>
> CRITICAL: Return code of 127 is out of bounds. Make sure the plugin
> youre trying to run actually exists. (worker: collector.domain.org)
>
>To me, that suggests that the collector system didn't get a result
> for a host or service in a timely manner from one of the polling
> systems, and so it attempted to run an active check itself.  However,
> it doesn't seem to be able to, and I don't know why.
>
>The collector has the same value for $USER1$, and it has the same
> set of plugins installed on it:
>
> On the collector:
>
> grep USER1 etc/resource.cfg
> $USER1$=/usr/local/nagios/libexec
>
> On the two pollers:
>
> $USER1$=/usr/local/nagios/libexec
> $USER1$=/usr/local/nagios/libexec
>
>The plugins are installed in identical locations on all three systems,
> that's enforced via Puppet.  The 'nagios' user can find and run them on
> the collector:
>
> /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
> NRPE v2.13
>
>Now, because this is a distributed setup, the collector system is
> not configured to run active checks:
>
> grep ^execute etc/nagios.cfg
> execute_service_checks=0
> execute_host_checks=0
>
>... but *obviously* it's trying to.  Is it failing because it's
> configured to not run them?  If that's the case, the error message is
> not accurate and should be corrected.  If that's *not* the case, why
> can't my collector server run an active check when it believes it needs
> to?
>
>I use NConf to generate my configurations, if that matters.  There are
> a *lot* of hosts/services and quite a few configuration files, so I'm not
> going to paste a slew of information here.  If I'm missing pertinent
> information, please let me know exactly what you want to see and I'll
> get it.

No one has an idea about this?  And no, Andreas, I can't move to
4.0 yet.  ;)

Thanks!

Benny



[Nagios-users] Distributed monitoring: central collector doesn't seem to be able to run active checks

2013-08-22 Thread C. Bensend


Hey folks,

   I'm continuing to iron out the wrinkles with 3.5.1 and distributed
monitoring.  I'm using mod_gearman to submit and receive events from
two distributed pollers.

   Every now and again, I'll get something similar in the log on the
centralized collecting machine:

CRITICAL: Return code of 127 is out of bounds. Make sure the plugin
youre trying to run actually exists. (worker: collector.domain.org)

   To me, that suggests that the collector system didn't get a result
for a host or service in a timely manner from one of the polling
systems, and so it attempted to run an active check itself.  However,
it doesn't seem to be able to, and I don't know why.
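
As background on the 127: that is the status a POSIX shell returns when
it cannot find the command it was asked to run (126 means it was found
but is not executable), while Nagios plugins are only supposed to return
0 through 3. The sketch below just spells that mapping out;
classify_status is a made-up name, not a Nagios function.

```shell
#!/bin/sh
# Hypothetical helper: translate an exit status into the plugin state it
# stands for, or say why it falls outside the 0-3 range plugins may use.
classify_status() {
    case "$1" in
        0)   echo "OK" ;;
        1)   echo "WARNING" ;;
        2)   echo "CRITICAL" ;;
        3)   echo "UNKNOWN" ;;
        126) echo "out of bounds: command found but not executable" ;;
        127) echo "out of bounds: command not found" ;;
        *)   echo "out of bounds: unexpected status $1" ;;
    esac
}
```

So the message does not necessarily mean the plugin binary is missing
from disk, only that whatever shell ran the command line could not find
what it was told to execute.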

   The collector has the same value for $USER1$, and it has the same
set of plugins installed on it:

On the collector:

grep USER1 etc/resource.cfg
$USER1$=/usr/local/nagios/libexec

On the two pollers:

$USER1$=/usr/local/nagios/libexec
$USER1$=/usr/local/nagios/libexec

   The plugins are installed in identical locations on all three systems,
that's enforced via Puppet.  The 'nagios' user can find and run them on
the collector:

/usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
NRPE v2.13

   Now, because this is a distributed setup, the collector system is
not configured to run active checks:

grep ^execute etc/nagios.cfg
execute_service_checks=0
execute_host_checks=0

   ... but *obviously* it's trying to.  Is it failing because it's
configured to not run them?  If that's the case, the error message is
not accurate and should be corrected.  If that's *not* the case, why
can't my collector server run an active check when it believes it needs
to?

   I use NConf to generate my configurations, if that matters.  There are
a *lot* of hosts/services and quite a few configuration files, so I'm not
going to paste a slew of information here.  If I'm missing pertinent
information, please let me know exactly what you want to see and I'll
get it.

   I'd really appreciate a clue-by-four.  Thanks, folks!  :)

Benny



Re: [Nagios-users] Misplaced advice in the Nagios preflight check?

2013-06-11 Thread C. Bensend


Yep, I've had that one enabled for quite some time.  :)


> There is a workaround; this is how I fixed it in our environment: set
> use_large_installation_tweaks=1 in nagios.cfg and see whether that
> removes the warning for you.
>
> Regards
> Sunil
>
>
>
>
> On Tue, Jun 11, 2013 at 9:52 PM, Justin T Pryzby
> wrote:
>
>> On Tue, Jun 11, 2013 at 11:12:23AM -0500, C. Bensend wrote:
>> > I can't seem to parse "It doesn't make sense to get a recovery
>> > notification for something you never knew was a problem."
>>
>> see the original language here:
>>
>> http://nagios.sourceforge.net/docs/3_0/notifications.html
>> Note: Notifications about host or service recoveries are only sent out
>> if a notification was sent out for the original problem. It doesn't
>> make sense to get a recovery notification for something you never knew
>> was a problem.
>>
>> And:
>> http://nagios.sourceforge.net/docs/3_0/escalations.html
>> If, after three problem notifications, a recovery notification is sent
>> out for the service, who gets notified?  The recovery is actually the
>> fourth notification that gets sent out.  However, the escalation code
>> is  smart enough to realize that only those people who were notified
>> about  the problem on the third notification should be notified about
>> the  recovery.  In this case, the nt-admins and managers contact
>> groups would be notified of the recovery.
>>
>> (Although, I believe I've either misunderstood the implications of
>> that statement, or run into misbehaviours in that area myself...)
>>
>>
>
>
>
> --
> Regards
> Sunil Sankar


-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job done.'"  
   -- George Carlin



Re: [Nagios-users] Misplaced advice in the Nagios preflight check?

2013-06-11 Thread C. Bensend


> see the original language here:
>
> http://nagios.sourceforge.net/docs/3_0/notifications.html
> Note: Notifications about host or service recoveries are only sent out
> if a notification was sent out for the original problem. It doesn't
> make sense to get a recovery notification for something you never knew
> was a problem.
>
> And:
> http://nagios.sourceforge.net/docs/3_0/escalations.html
> If, after three problem notifications, a recovery notification is sent
> out for the service, who gets notified?  The recovery is actually the
> fourth notification that gets sent out.  However, the escalation code
> is  smart enough to realize that only those people who were notified
> about  the problem on the third notification should be notified about
> the  recovery.  In this case, the nt-admins and managers contact
> groups would be notified of the recovery.
>
> (Although, I believe I've either misunderstood the implications of
> that statement, or run into misbehaviours in that area myself...)

Ah.  Well, yes.  :)  I believe those statements are referring to
the filters that Nagios uses to determine whether or not to send
a notification at all.  *That's* not an issue here, the notification
goes out, just like it should.

*My* question is why the sanity check thinks that configuration
doesn't make sense.  I think the answer is probably something to
the effect of:  "I don't know why anyone would want that, so warn
about it."  I don't want to put words in the mouth of any of the
developers that may have touched it, though, so I'm just guessing.

I just want to make sure this is a case of Nagios maybe not giving
the right advice in its sanity check, and *not* that there's something
behind the scenes that I'm not aware of that might actually cause a
problem.  If it's the former, maybe we can get it adjusted for the
next release.  If it's the latter, I hope someone will step forth with
the ClueBat 5000(tm) and give me a good thump.  :)

Thanks, everyone!

Benny



Re: [Nagios-users] Misplaced advice in the Nagios preflight check?

2013-06-11 Thread C. Bensend


I can't seem to parse "It doesn't make sense to get a recovery
notification for something you never knew was a problem."

Are you saying that since Nagios doesn't consider an unknown a
problem, it won't send a recovery?  Because it does...  And in
this case, I certainly want to know when a service having a
monitoring issue (unknown) recovers.

Not sure what you meant there.

Thanks!

Benny


> This is by design, and it is only a warning message. The config is valid
> and should work as you intended. It doesn't make sense to get a recovery
> notification for something you never knew was a problem. "Unknowns" are
> not considered problems in Nagios logic.
>
>
> On Mon, Jun 10, 2013 at 1:25 PM, Chris Beattie 
> wrote:
>
>> On 6/7/2013 9:28 AM, C. Bensend wrote:
>> > Not real sure why Nagios doesn't think that's a valid config - I
>> > want a contact that will receive only UNKNOWN alerts for services.
>>
>> Have you tried giving that contact the extra options Nagios wants, and
>> then defining a service escalation for that contact with the
>> escalation_options directive set to u?
>>
>> --
>> -Chris
>>
>>



Re: [Nagios-users] Misplaced advice in the Nagios preflight check?

2013-06-10 Thread C. Bensend


> Have you tried giving that contact the extra options Nagios wants, and
> then defining a service escalation for that contact with the
> escalation_options directive set to u?

No, I haven't.  It *seems* to be working as I intend.  My question is
more as to why Nagios seems to think it's a bad idea, when it's a
perfectly legitimate configuration.  Are there unforeseen consequences
that I'm not aware of?  Or was it just not a configuration anyone
thought would be useful/valid, so it is warned about?



[Nagios-users] Misplaced advice in the Nagios preflight check?

2013-06-07 Thread C. Bensend


Hey folks,

   Still ironing out the wrinkles in my 3.5.0 distributed environment.
Yesterday, I added a new contact, and on preflight check it seemed
to think that what I did wasn't smart:


Jun  6 15:11:02 hostname nagios: Warning: Service recovery notification
option for contact 'cbensend-unknown-only' doesn't make any sense -
specify critical and/or warning options as well


Here's the contact I added that it seems to think is a dumb idea:


define contact {
   contact_name                    cbensend-unknown-only
   alias                           C. Bensend - unknown alerts only
   host_notification_options       n
   service_notification_options    u,r
   email                           m...@myjob.com
   host_notification_period        24x7
   service_notification_period     24x7
   host_notification_commands      notify-host-by-email
   service_notification_commands   notify-service-by-email
}


   Not real sure why Nagios doesn't think that's a valid config - I
want a contact that will receive only UNKNOWN alerts for services.
Perfectly valid idea to me; I have a number of services that I
truly do not give a crap about, they trip many times a day and are
critical for some developers, but I don't do anything about them.
I *do*, however, want to know if there's a problem monitoring them,
hence the need to see UNKNOWN alerts and recoveries.

   Is there some reason Nagios would think that's not valid?  Or
should it not complain about that?

   Just curious...  It loaded the config and the contact exists, just
not entirely convinced it's a valid complaint.  :)
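
Reading the warning text literally, the preflight check seems to object
whenever a contact asks for service recoveries (r) without also asking
for warning (w) or critical (c). The sketch below reproduces that rule as
inferred from the message only; it is not taken from the Nagios source,
and preflight_recovery_check is a made-up name.

```shell
#!/bin/sh
# Rule inferred from the warning text, not from the Nagios source: flag a
# service_notification_options value that includes "r" (recovery) but
# neither "w" (warning) nor "c" (critical).
preflight_recovery_check() {
    # Strip spaces and wrap in commas so every option is comma-delimited.
    opts=",$(printf '%s' "$1" | tr -d ' '),"
    case "$opts" in
        *,r,*) ;;                    # recovery requested, keep checking
        *) echo "ok"; return 0 ;;    # no recovery option, nothing to flag
    esac
    case "$opts" in
        *,w,*|*,c,*) echo "ok" ;;
        *) echo "warning: recovery without warning/critical" ;;
    esac
}
```

With the contact above, preflight_recovery_check "u,r" lands in the
warning branch, which matches the log line, while "w,u,c,r" would not.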

Thanks much!

Benny



Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring

2013-05-31 Thread C. Bensend


OK.  So, what differs when you try that first command when iptables
*is* running?


> Please find the details..
>
> [nagios@server  ~]$ /usr/bin/sudo /sbin/iptables -nvL | /bin/grep 'Chain'
> | /bin/awk '{ print $2 }'| /bin/grep Cid | /usr/bin/wc -l
> 0
> [nagios@server  ~]$ /usr/bin/sudo /sbin/iptables -nvL | /bin/grep Cid |
> /usr/bin/wc -l
> 0
> [nagios@server  ~]$
> [nagios@server ~]$ echo $?
> 0
> [nagios@server ~]$
>
> Yes, Server = zurich
> -----Original Message-----
> From: C. Bensend [mailto:be...@bennyvision.com]
> Sent: Friday, 31 May 2013 8:05 PM
> To: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>
>
>> Ran as nagios user and please find the details below.  ( iptables
>> Stopped)
>>
>>
>> [nagios@server ~]$ /usr/bin/sudo /sbin/iptables -nvL | /bin/grep
>> 'Chain' | /bin/awk '{ print $2 }'| /bin/grep Cid | /usr/bin/wc -l| echo
>> $?
>> 0
>
> That 'echo $?' was supposed to be on the next line, not a continuation of
> the command.  Can you run that again, but as two separate commands, one
> right after the other?  I want to see the result of your first command
> (the iptables one).
>
>> [nagios@server ~]$ /usr/bin/sudo /sbin/iptables -nvL
>> Chain INPUT (policy ACCEPT 9089 packets, 3303K bytes)
>>  pkts bytes target prot opt in out source
>> destination
>>
>> Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
>>  pkts bytes target prot opt in out source
>> destination
>>
>> Chain OUTPUT (policy ACCEPT 7812 packets, 3436K bytes)
>>  pkts bytes target     prot opt in out source
>> destination
>> [nagios@server ~]$
>
> I'm assuming "server" == "zurich", right?
>
> I wonder if you can cut out the first grep and awk, and just look for
> 'Cid' ?
>
>
>> -----Original Message-----
>> From: C. Bensend [mailto:be...@bennyvision.com]
>> Sent: Thursday, 30 May 2013 8:44 PM
>> To: nagios-users@lists.sourceforge.net
>> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>>
>>
>> I'm assuming that this check is running *on* the host 'zurich'?
>>
>> /var/log/secure should be listing an entry, if sudo is being run.
>>
>> Manually, *as the nagios user*, what happens when you do the following?
>>
>> /usr/bin/sudo /sbin/iptables -nvL | /bin/grep 'Chain' | \
>>    /bin/awk '{ print $2 }'| /bin/grep Cid | /usr/bin/wc -l
>> echo $?
>>
>>
>> How about just (again, as the nagios user):
>>
>> /usr/bin/sudo /sbin/iptables -nvL
>>
>>
>>> Please find the details
>>>
>>> Sudoers Definition:-
>>>
>>> nagios zurich= NOPASSWD: /sbin/iptables,
>>> /usr/local/nagios/libexec/check_iptables.sh,
>>> /usr/local/nagios/libexec/check_nrpe
>>>
>>> /var/log/secure:
>>>
>>> su: pam_unix(su:session): session opened for user nagios by
>>> root(uid=0)
>>> su: pam_unix(su:session): session closed for user nagios
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: C. Bensend [mailto:be...@bennyvision.com]
>>> Sent: Wednesday, 29 May 2013 7:59 PM
>>> To: nagios-users@lists.sourceforge.net
>>> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>>>
>>>
>>> Where's your sudoers definition that allows the nagios user to run
>>> any commands via sudo?
>>>
>>> And what does /var/log/secure (or equivalent) think about the nagios
>>> user trying to run sudo?
>>>
>>>
>>>> I have tested with nagios user as well.. still no luck with that.
>>>> Could you some one update if you have any solution on this case.
>>>>
>>>> Kind Regards,
>>>> Thilak
>>>>
>>>> From: Deborah Martin [mailto:deborah.mar...@kognitio.com]
>>>> Sent: Tuesday, 14 May 2013 7:30 PM
>>>> To: Nagios Users List
>>>> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>>>>
>>>> Ok - if I look at your output, manually,  when the plugin is run as
>>>> the "root" user it produces the correct result.
>>>>
>>>> But, you haven't said what the nrpe user is that is running on the
>>>> remote node  and wh

Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring

2013-05-31 Thread C. Bensend


> Ran as nagios user and please find the details below.  ( iptables Stopped)
>
>
> [nagios@server ~]$ /usr/bin/sudo /sbin/iptables -nvL | /bin/grep 'Chain' |
> /bin/awk '{ print $2 }'| /bin/grep Cid | /usr/bin/wc -l| echo $?
> 0

That 'echo $?' was supposed to be on the next line, not a continuation
of the command.  Can you run that again, but as two separate commands,
one right after the other?  I want to see the result of your first
command (the iptables one).
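
The reason for asking: in a pipeline, $? only reflects the last command
(here wc), so it says nothing about whether sudo or iptables themselves
worked, and piping into "echo $?" just prints the status of whatever ran
before the pipeline started. A rough sketch of a way to see both numbers
in plain POSIX sh; count_matches is a made-up helper, and the pattern
'Cid' is simply the one from the command above.

```shell
#!/bin/sh
# Hypothetical helper: run a listing command on its own first so that its
# exit status is not hidden behind the pipeline, then count matching lines.
# Usage: count_matches PATTERN COMMAND [ARGS...]
count_matches() {
    pattern=$1
    shift
    listing=$("$@")
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "listing command failed with status $status" >&2
        return "$status"
    fi
    # grep -c prints 0 and exits 1 when nothing matches; that is a valid
    # count here, not an error.
    printf '%s\n' "$listing" | grep -c -- "$pattern" || true
}
```

As the nagios user that would be:
count_matches Cid /usr/bin/sudo /sbin/iptables -nvL
A failure in sudo or iptables then shows up as an error message and a
non-zero return instead of a silent count of 0.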

> [nagios@server ~]$ /usr/bin/sudo /sbin/iptables -nvL
> Chain INPUT (policy ACCEPT 9089 packets, 3303K bytes)
>  pkts bytes target prot opt in out source
> destination
>
> Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
>  pkts bytes target prot opt in out source
> destination
>
> Chain OUTPUT (policy ACCEPT 7812 packets, 3436K bytes)
>  pkts bytes target prot opt in out source
> destination
> [nagios@server ~]$

I'm assuming "server" == "zurich", right?

I wonder if you can cut out the first grep and awk, and just look
for 'Cid' ?


> -----Original Message-----
> From: C. Bensend [mailto:be...@bennyvision.com]
> Sent: Thursday, 30 May 2013 8:44 PM
> To: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>
>
> I'm assuming that this check is running *on* the host 'zurich'?
>
> /var/log/secure should be listing an entry, if sudo is being run.
>
> Manually, *as the nagios user*, what happens when you do the following?
>
> /usr/bin/sudo /sbin/iptables -nvL | /bin/grep 'Chain' | \
>    /bin/awk '{ print $2 }'| /bin/grep Cid | /usr/bin/wc -l
> echo $?
>
>
> How about just (again, as the nagios user):
>
> /usr/bin/sudo /sbin/iptables -nvL
>
>
>> Please find the details
>>
>> Sudoers Definition:-
>>
>> nagios zurich= NOPASSWD: /sbin/iptables,
>> /usr/local/nagios/libexec/check_iptables.sh,
>> /usr/local/nagios/libexec/check_nrpe
>>
>> /var/log/secure:
>>
>> su: pam_unix(su:session): session opened for user nagios by
>> root(uid=0)
>> su: pam_unix(su:session): session closed for user nagios
>>
>>
>>
>> -----Original Message-----
>> From: C. Bensend [mailto:be...@bennyvision.com]
>> Sent: Wednesday, 29 May 2013 7:59 PM
>> To: nagios-users@lists.sourceforge.net
>> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>>
>>
>> Where's your sudoers definition that allows the nagios user to run any
>> commands via sudo?
>>
>> And what does /var/log/secure (or equivalent) think about the nagios
>> user trying to run sudo?
>>
>>
>>> I have tested with nagios user as well.. still no luck with that.
>>> Could you some one update if you have any solution on this case.
>>>
>>> Kind Regards,
>>> Thilak
>>>
>>> From: Deborah Martin [mailto:deborah.mar...@kognitio.com]
>>> Sent: Tuesday, 14 May 2013 7:30 PM
>>> To: Nagios Users List
>>> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>>>
>>> Ok - if I look at your output, manually,  when the plugin is run as
>>> the "root" user it produces the correct result.
>>>
>>> But, you haven't said what the nrpe user is that is running on the
>>> remote node  and whether the same manual run of the check produces
>>> the same output.
>>> For example, I run remote plugins through nrpe as the "nagios" user
>>> so if I want to manually test a plugin on the remote node, I would
>>> first login as the nagios user to ensure I've got the same
>>> environment that would be used when running via nrpe. It might be
>>> that the variables you have set in the script only work as the root
>>> user. It's never a good idea to test as the root  user but only as
>>> the same user as that used by nagios or nrpe.
>>>
>>> Regards,
>>> Deborah
>>>
>>> From: Thilakraj.Shanmugam
>>> [mailto:thilakraj.shanmu...@canberra.edu.au]
>>> Sent: 14 May 2013 09:58
>>> To: Nagios Users List
>>> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>>>
>>> Hi Deborah,  Thanks for the response..  please find the details below.
>>>
>>>
>>> [root@abc libexec]# pwd
>>> /usr/local/nagios/libexec
>>> [root@abc libexec]# ./check_iptables.sh
>>>

Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring

2013-05-30 Thread C. Bensend


I'm assuming that this check is running *on* the host 'zurich'?

/var/log/secure should be listing an entry, if sudo is being run.

Manually, *as the nagios user*, what happens when you do the following?

/usr/bin/sudo /sbin/iptables -nvL | /bin/grep 'Chain' | \
   /bin/awk '{ print $2 }'| /bin/grep Cid | /usr/bin/wc -l
echo $?


How about just (again, as the nagios user):

/usr/bin/sudo /sbin/iptables -nvL
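
One caveat when reading the result of the pipeline above: `echo $?`
reports only the exit status of the last command in the pipeline (wc),
so a failing sudo or iptables would go unnoticed.  Below is a minimal
sketch of a check that captures the listing first and tests its status
before counting.  The function name check_chains and the choice of
exit code 3 on failure are assumptions for illustration, not part of
the original plugin, and the awk match on the first field is a slightly
stricter stand-in for the grep/awk/grep chain:

```shell
#!/bin/bash
# Hypothetical helper (not from the original plugin): run the listing
# command given as arguments, fail loudly if it exits non-zero, and
# otherwise print the number of chains whose name contains "Cid".
check_chains() {
    local output status
    output=$("$@" 2>&1)
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "listing command failed (exit $status): $output"
        return 3    # assumed mapping to the plugin's E_UNKNOWN
    fi
    printf '%s\n' "$output" |
        awk '$1 == "Chain" && $2 ~ /Cid/ { n++ } END { print n + 0 }'
}

# Usage, as the nagios user:
#   check_chains /usr/bin/sudo /sbin/iptables -nvL
```

If the listing command itself fails, the message it printed is shown
instead of a silent count of 0, which makes a sudo problem much easier
to spot than "Firewall is not running".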


> Please find the details
>
> Sudoers Definition:-
>
> nagios zurich= NOPASSWD: /sbin/iptables,
> /usr/local/nagios/libexec/check_iptables.sh,
> /usr/local/nagios/libexec/check_nrpe
>
> /var/log/secure:
>
> su: pam_unix(su:session): session opened for user nagios by root(uid=0)
> su: pam_unix(su:session): session closed for user nagios
>
>
>
> -----Original Message-----
> From: C. Bensend [mailto:be...@bennyvision.com]
> Sent: Wednesday, 29 May 2013 7:59 PM
> To: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>
>
> Where's your sudoers definition that allows the nagios user to run any
> commands via sudo?
>
> And what does /var/log/secure (or equivalent) think about the nagios user
> trying to run sudo?
>
>
>> I have tested with nagios user as well.. still no luck with that.
>> Could you some one update if you have any solution on this case.
>>
>> Kind Regards,
>> Thilak
>>
>> From: Deborah Martin [mailto:deborah.mar...@kognitio.com]
>> Sent: Tuesday, 14 May 2013 7:30 PM
>> To: Nagios Users List
>> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>>
>> Ok - if I look at your output, manually,  when the plugin is run as
>> the "root" user it produces the correct result.
>>
>> But, you haven't said what the nrpe user is that is running on the
>> remote node  and whether the same manual run of the check produces the
>> same output.
>> For example, I run remote plugins through nrpe as the "nagios" user so
>> if I want to manually test a plugin on the remote node, I would first
>> login as the nagios user to ensure I've got the same environment that
>> would be used when running via nrpe. It might be that the variables
>> you have set in the script only work as the root user. It's never a
>> good idea to test as the root  user but only as the same user as that
>> used by nagios or nrpe.
>>
>> Regards,
>> Deborah
>>
>> From: Thilakraj.Shanmugam [mailto:thilakraj.shanmu...@canberra.edu.au]
>> Sent: 14 May 2013 09:58
>> To: Nagios Users List
>> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>>
>> Hi Deborah,  Thanks for the response..  please find the details below.
>>
>>
>> [root@abc libexec]# pwd
>> /usr/local/nagios/libexec
>> [root@abc libexec]# ./check_iptables.sh    <- executing the script manually
>> + IPT=/sbin/iptables
>> + GREP=/bin/grep
>> + AWK=/bin/awk
>> + EXPR=/usr/bin/expr
>> + WC=/usr/bin/wc
>> + A=/usr/bin/sudo
>> + E_SUCCESS=0
>> + E_CRITICAL=2
>> + E_UNKNOWN=3
>> ++ /usr/bin/sudo /sbin/iptables -nvL
>> ++ /bin/grep Chain
>> ++ /bin/awk '{ print $2 }'
>> ++ /bin/grep Cid
>> ++ /usr/bin/wc -l
>> + CHAINS=5
>> + '[' 5 -ne 0 ']'
>> + echo 'Firewall is running!'
>> Firewall is running!
>> + exit 0    <-- it shows firewall running (correct output)
>> [root@abc libexec]#
>>
>>
>> Client - NRPE config file
>>
>> [root@abc libexec]# cat /usr/local/nagios/etc/nrpe.cfg |grep -i
>> iptable
>> command[check_iptables]=/usr/local/nagios/libexec/check_iptables.sh
>> [root@abc libexec]#
>>
>>
>> [root@abc libexec]# ./check_nrpe -H localhost -c check_iptables
>> Firewall is not running    <- executing via check_nrpe (wrong output)
>> [root@abc libexec]#
>>
>>
>> NRPE Logs
>> -
>>
>> May 14 18:52:28 abc nrpe[31158]: Added
>> command[check_Partion_db]=/usr/local/nagios/libexec/check_disk -w 15%
>> -c 5% -p /db
>> May 14 18:52:28 abc nrpe[31158]: Added
>> command[check_Partion_app]=/usr/local/nagios/libexec/check_disk -w 15%
>> -c 5% -p /app
>> May 14 18:52:28 abc nrpe[31158]: Added
>> command[check_iptables]=/usr/local/nagios/libexec/check_iptables.sh
>> May 14 18:52:28 abc nrpe[31158]: INFO: SSL/TLS

[Nagios-users] Nagios-Users: please unsubscribe gch...@renegade.com

2013-05-29 Thread C. Bensend


Could one of the list admins unsubscribe gch...@renegade.com?

Their email has been bouncing for a while now:



Delivery has failed to these recipients or groups:

gch...@renegade.com
The e-mail address you entered couldn't be found. Please check the
recipient's
e-mail address and try to resend the message. If the problem continues,
please
contact your helpdesk.

Diagnostic information for administrators:

Generating server: renegade.com

gch...@renegade.com
#550 5.1.1 RESOLVER.ADR.RecipNotFound; not found ##rfc822;gch...@renegade.com


-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"  -- George Carlin


___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring

2013-05-29 Thread C. Bensend


Where's your sudoers definition that allows the nagios user to
run any commands via sudo?

And what does /var/log/secure (or equivalent) think about the
nagios user trying to run sudo?


> I have tested with nagios user as well.. still no luck with that.  Could
> you some one update if you have any solution on this case.
>
> Kind Regards,
> Thilak
>
> From: Deborah Martin [mailto:deborah.mar...@kognitio.com]
> Sent: Tuesday, 14 May 2013 7:30 PM
> To: Nagios Users List
> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>
> Ok - if I look at your output, manually,  when the plugin is run as the
> "root" user it produces the correct result.
>
> But, you haven't said what the nrpe user is that is running on the remote
> node  and whether the same manual run of the check produces the same
> output.
> For example, I run remote plugins through nrpe as the "nagios" user so if
> I want to manually test a plugin on the remote node, I would first login
> as the nagios user to ensure I've got the same environment that would be
> used when running via nrpe. It might be that the variables you have set in
> the script only work as the root user. It's never a good idea to test as
> the root  user but only as the same user as that used by nagios or nrpe.
>
> Regards,
> Deborah
>
> From: Thilakraj.Shanmugam [mailto:thilakraj.shanmu...@canberra.edu.au]
> Sent: 14 May 2013 09:58
> To: Nagios Users List
> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>
> Hi Deborah,  Thanks for the response..  please find the details below.
>
>
> [root@abc libexec]# pwd
> /usr/local/nagios/libexec
> [root@abc libexec]# ./check_iptables.sh    <- executing the script manually
> + IPT=/sbin/iptables
> + GREP=/bin/grep
> + AWK=/bin/awk
> + EXPR=/usr/bin/expr
> + WC=/usr/bin/wc
> + A=/usr/bin/sudo
> + E_SUCCESS=0
> + E_CRITICAL=2
> + E_UNKNOWN=3
> ++ /usr/bin/sudo /sbin/iptables -nvL
> ++ /bin/grep Chain
> ++ /bin/awk '{ print $2 }'
> ++ /bin/grep Cid
> ++ /usr/bin/wc -l
> + CHAINS=5
> + '[' 5 -ne 0 ']'
> + echo 'Firewall is running!'
> Firewall is running!
> + exit 0    <-- it shows firewall running (correct output)
> [root@abc libexec]#
>
>
> Client - NRPE config file
>
> [root@abc libexec]# cat /usr/local/nagios/etc/nrpe.cfg |grep -i iptable
> command[check_iptables]=/usr/local/nagios/libexec/check_iptables.sh
> [root@abc libexec]#
>
>
> [root@abc libexec]# ./check_nrpe -H localhost -c check_iptables
> Firewall is not running    <- executing via check_nrpe (wrong output)
> [root@abc libexec]#
>
>
> NRPE Logs
> -
>
> May 14 18:52:28 abc nrpe[31158]: Added
> command[check_Partion_db]=/usr/local/nagios/libexec/check_disk -w 15% -c
> 5% -p /db
> May 14 18:52:28 abc nrpe[31158]: Added
> command[check_Partion_app]=/usr/local/nagios/libexec/check_disk -w 15% -c
> 5% -p /app
> May 14 18:52:28 abc nrpe[31158]: Added
> command[check_iptables]=/usr/local/nagios/libexec/check_iptables.sh
> May 14 18:52:28 abc nrpe[31158]: INFO: SSL/TLS initialized. All network
> traffic will be encrypted.
> May 14 18:52:28 abc nrpe[31158]: Handling the connection...
> May 14 18:52:28 abc nrpe[31158]: Host is asking for command
> 'check_iptables' to be run...
> May 14 18:52:28 abc nrpe[31158]: Running command:
> /usr/local/nagios/libexec/check_iptables.sh
> May 14 18:52:28 abc nrpe[31158]: Command completed with return code 2 and
> output: Firewall is not running
> May 14 18:52:28 abc nrpe[31158]: Return Code: 2, Output: Firewall is not
> running
>
>
> Kind Regards,
> Thilak
>
>
> From: Deborah Martin [mailto:deborah.mar...@kognitio.com]
> Sent: Tuesday, 14 May 2013 6:44 PM
> To: Nagios Users List
> Subject: Re: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>
> Hi,
> What is the wrong output being returned ? This might give us all a clue as
> to the cause of the problem.
> When you run the check manually, are you doing this as the same user that
> check_nrpe will use ?
>
> Regards,
> Deborah
>
>
>
> From: Thilakraj.Shanmugam [mailto:thilakraj.shanmu...@canberra.edu.au]
> Sent: 14 May 2013 08:43
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] Nagios Plugin for IPTABLES Monitoring
>
> Greetings!
>
> Could someone send me nagios plugin which is tested and works well for
> monitoring IPTABLES in Linux.
>
> I have tested below script but it is not returning correct output to
> nagios server.
>
> If I execute script manually, it shows correct output...
>
> But if I execute via  ./check_nrpe -H localhost -c check_iptables,  it
> shows wrong output.
>
>
>
> Below is my plugin
> --
>
> #!/bin/bash
> set -x
>
> IPT='/sbin/iptables'
> GREP='/bin/grep'
> AWK='/bin/awk'
> EXPR='/usr/bin/expr'
> WC='/usr/bin/wc'
> A='/usr/bin/sudo'
>
> E_SUCCESS="0"
> E_CRITICAL="2"
> E_UNKNOWN="3"
>
> CH

Re: [Nagios-users] Nagios v3.5.0 transitioning immediately to a HARD state upon host problem

2013-05-25 Thread C. Bensend


> diff -uNp nagios-updated.cfg nagios.cfg
> --- nagios-updated.cfg  Sat May 25 09:05:09 2013
> +++ nagios.cfg  Sat May 25 09:02:37 2013
> @@ -981,9 +981,9 @@ translate_passive_host_checks=0
>
>  # PASSIVE HOST CHECKS ARE SOFT OPTION
>  # This determines whether or not Nagios will treat passive host
> -# checks as being HARD or SOFT.  By default, a single passive host
> -# check result will put a host into an immediate HARD state type.
> -# This can be changed by enabling this option.
> +# checks as being HARD or SOFT.  By default, a passive host check
> +# result will put a host into a HARD state type.  This can be changed
> +# by enabling this option.
>  # Values: 0 = passive checks are HARD, 1 = passive checks are SOFT
>
>  passive_host_checks_are_soft=0
>
>
> Does that make sense?  If I had read something like that, it would
> have been immediately clear to me what was happening.
>
> Thank you so much, Andreas!  On to the next problem with the
> upgrade (something that can wait until next week)...

Sorry, too little caffeine too early, got the files reversed.  Here's
the right diff:

diff -uNp nagios.cfg nagios-updated.cfg
--- nagios.cfg  Sat May 25 10:25:34 2013
+++ nagios-updated.cfg  Sat May 25 10:27:12 2013
@@ -981,9 +981,9 @@ translate_passive_host_checks=0

 # PASSIVE HOST CHECKS ARE SOFT OPTION
 # This determines whether or not Nagios will treat passive host
-# checks as being HARD or SOFT.  By default, a passive host check
-# result will put a host into a HARD state type.  This can be changed
-# by enabling this option.
+# checks as being HARD or SOFT.  By default, a single passive host
+# check result will put a host into an immediate HARD state type.
+# This can be changed by enabling this option.
 # Values: 0 = passive checks are HARD, 1 = passive checks are SOFT

 passive_host_checks_are_soft=0
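
To make the difference concrete, here is a simplified model of the
behaviour the comment describes.  It is a sketch only, not Nagios's
actual state logic, and the function name state_type and its arguments
are made up for illustration: with the option at 0 a passive host
problem is treated as HARD at once, while with it at 1 the host is
assumed to stay SOFT until max_check_attempts is reached.

```shell
#!/bin/bash
# Simplified, hypothetical model of how a non-UP passive host result
# is classified.  Arguments:
#   $1 = current attempt number
#   $2 = max_check_attempts for the host
#   $3 = value of passive_host_checks_are_soft (0 or 1)
state_type() {
    local attempt=$1 max_attempts=$2 passive_soft=$3
    if [ "$passive_soft" -eq 0 ] || [ "$attempt" -ge "$max_attempts" ]; then
        echo HARD
    else
        echo SOFT
    fi
}

state_type 1 6 0    # default setting: first passive result is HARD
state_type 1 6 1    # option enabled: first passive result is SOFT
state_type 6 6 1    # option enabled: HARD once attempts are used up
```

With max_check_attempts at 6, as in the templates from this thread,
the first call mirrors the immediate DOWN;HARD;1 seen on the console.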



-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"  -- George Carlin



Re: [Nagios-users] Nagios v3.5.0 transitioning immediately to a HARD state upon host problem

2013-05-25 Thread C. Bensend


> On 2013-05-23 17:43, C. Bensend wrote:
>>
>> Hey folks,
>>
>> I recently made two major changes to my Nagios environment:
>>
>> 1) I upgraded to v3.5.0.
>> 2) I moved from a single server to two pollers sending passive
>> results to one central console server.
>>
>> Now, this new distributed system was in place for several months
>> while I tested, and it worked fine.  HOWEVER, since this was running
>> in parallel with my production system, notifications were disabled.
>> Hence, I didn't see this problem until I cut over for real and
>> enabled notifications.
>>
>> (please excuse any cut-n-paste ugliness, had to send this info from
>> my work account via Outlook and then try to cleanse and reformat
>> via Squirrelmail)
>>
>> As a test and to capture information, I reboot 'hostname'.  This
>> log is from the nagios-console host, which is the host that accepts
>> the passive check results and sends notifications.  Here is the
>> console host receiving a service check failure when the host is
>> restarting:
>>
>> May 22 15:57:10 nagios-console nagios: SERVICE ALERT: hostname;/var disk
>> queue;CRITICAL;SOFT;1;Connection refused by host
>>
>>
>> So, the distributed poller system checks the host and sends its
>> results to the console server:
>>
>> May 22 15:57:30 nagios-console nagios: HOST
>> ALERT:hostname;DOWN;SOFT;1;CRITICAL - Host Unreachable (a.b.c.d)
>>
>>
>> And then the centralized server IMMEDIATELY goes into a hard state,
>> which triggers a  notification:
>>
>> May 22 15:57:30 nagios-console nagios: HOST ALERT:
>> hostname;DOWN;HARD;1;CRITICAL - Host Unreachable (a.b.c.d)
>> May 22 15:57:30 nagios-console nagios: HOST NOTIFICATION:
>> cbensend;hostname;DOWN;host-notify-by-email-test;CRITICAL -
>> Host Unreachable (a.b.c.d)
>>
>>
>> Um.  Wat?  Why would the console immediately trigger a hard
>> state? The config files don't support this decision.  And this
>> IS a problem with the console server - the distributed monitors
>> continue checking the host for 6 times like they should.  But
>> for some reason, the centralized console just immediately
>> calls it a hard state.

*snip*

>
>
> Set passive_host_checks_are_soft=1 in nagios.cfg on your master
> server and things should start working as intended.
>
> --
> Andreas Ericsson   andreas.erics...@op5.se

Oh lord, THANK YOU.  That appears to have fixed that problem, which
was a pain in the ass.  In my defense, I *did* see that option, but
the way I interpreted the comments didn't quite match up with the
behavior I was seeing.  I should have experimented with it, I guess.
A slight adjustment to the comments would have thrown a red flag for
me - perhaps this is just a matter of personal interpretation, but
maybe the comments could be a bit more specific:


diff -uNp nagios-updated.cfg nagios.cfg
--- nagios-updated.cfg  Sat May 25 09:05:09 2013
+++ nagios.cfg  Sat May 25 09:02:37 2013
@@ -981,9 +981,9 @@ translate_passive_host_checks=0

 # PASSIVE HOST CHECKS ARE SOFT OPTION
 # This determines whether or not Nagios will treat passive host
-# checks as being HARD or SOFT.  By default, a single passive host
-# check result will put a host into an immediate HARD state type.
-# This can be changed by enabling this option.
+# checks as being HARD or SOFT.  By default, a passive host check
+# result will put a host into a HARD state type.  This can be changed
+# by enabling this option.
 # Values: 0 = passive checks are HARD, 1 = passive checks are SOFT

 passive_host_checks_are_soft=0


Does that make sense?  If I had read something like that, it would
have been immediately clear to me what was happening.

Thank you so much, Andreas!  On to the next problem with the
upgrade (something that can wait until next week)...

Benny


-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"  -- George Carlin



Re: [Nagios-users] Nagios v3.5.0 transitioning immediately to a HARD state upon host problem

2013-05-23 Thread C. Bensend


> I ran into a similar problem, because my template set the service to "*
> is_volatile=1*".
>
> http://nagios.sourceforge.net/docs/3_0/volatileservices.html

Hrmmm.  Good point...

However, is_volatile does not appear in any of my configuration
files, for any of the Nagios servers.  It isn't set by default,
is it?  The Nagios "config.cgi" page doesn't even list it, and
livestatus (what I use to query my running daemon) doesn't give
it as a column it can query.  I can't imagine it's on by default
in v3.5.0, but I can't really tell if it is or not.

I can try explicitly *disabling* it in all hosts, but I can't
really test that at the moment - out of here for a long weekend
in a few minutes.  If it gets annoying enough over the weekend,
I might *have* to test that theory.

Thank you very much.  I will still appreciate any input others can
give on this question - it just doesn't seem to be behaving
as it's configured!

Benny


-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"  -- George Carlin



[Nagios-users] Nagios v3.5.0 transitioning immediately to a HARD state upon host problem

2013-05-23 Thread C. Bensend


Hey folks,

   I recently made two major changes to my Nagios environment:

1) I upgraded to v3.5.0.
2) I moved from a single server to two pollers sending passive
   results to one central console server.

   Now, this new distributed system was in place for several months
while I tested, and it worked fine.  HOWEVER, since this was running
in parallel with my production system, notifications were disabled.
Hence, I didn't see this problem until I cut over for real and
enabled notifications.

(please excuse any cut-n-paste ugliness, had to send this info from
my work account via Outlook and then try to cleanse and reformat
via Squirrelmail)

   As a test and to capture information, I rebooted 'hostname'.  This
log is from the nagios-console host, which is the host that accepts
the passive check results and sends notifications.  Here is the
console host receiving a service check failure when the host is
restarting:

May 22 15:57:10 nagios-console nagios: SERVICE ALERT: hostname;/var disk
queue;CRITICAL;SOFT;1;Connection refused by host


So, the distributed poller system checks the host and sends its
results to the console server:

May 22 15:57:30 nagios-console nagios: HOST ALERT:
hostname;DOWN;SOFT;1;CRITICAL - Host Unreachable (a.b.c.d)


And then the centralized server IMMEDIATELY goes into a hard state,
which triggers a  notification:

May 22 15:57:30 nagios-console nagios: HOST ALERT:
hostname;DOWN;HARD;1;CRITICAL - Host Unreachable (a.b.c.d)
May 22 15:57:30 nagios-console nagios: HOST NOTIFICATION:
cbensend;hostname;DOWN;host-notify-by-email-test;CRITICAL -
Host Unreachable (a.b.c.d)


   Um.  Wat?  Why would the console immediately trigger a hard
state? The config files don't support this decision.  And this
IS a problem with the console server - the distributed monitors
continue checking the host for 6 times like they should.  But
for some reason, the centralized console just immediately
calls it a hard state.

   Definitions on the distributed monitoring host (the one running
the actual host and service checks for this host 'hostname'):

define host {
 host_name            hostname
 alias                Old production Nagios server
 address              a.b.c.d
 action_url           /pnp4nagios/graph?host=$HOSTNAME$
 icon_image_alt       Red Hat Linux
 icon_image           redhat.png
 statusmap_image      redhat.gd2
 check_command        check-host-alive
 check_period         24x7
 notification_period  24x7
 contact_groups       linux-infrastructure-admins
 use                  linux-host-template
}

The linux-host-template on that same system:

define host {
 name                    linux-host-template
 register                0
 max_check_attempts      6
 check_interval          5
 retry_interval          1
 notification_interval   360
 notification_options    d,r
 active_checks_enabled   1
 passive_checks_enabled  1
 notifications_enabled   1
 check_freshness         0
 check_period            24x7
 notification_period     24x7
 check_command           check-host-alive
 contact_groups          linux-infrastructure-admins
}

And said command to determine up or down:

define command {
 command_name  check-host-alive
 command_line  $USER1$/check_ping -H $HOSTADDRESS$ -w 5000.0,80% -c 1.0,100% -p 5
}


Definitions on the centralized console host (the one that notifies):

define host {
  host_name            hostname
  alias                Old production Nagios server
  address              a.b.c.d
  action_url           /pnp4nagios/graph?host=$HOSTNAME$
  icon_image_alt       Red Hat Linux
  icon_image           redhat.png
  statusmap_image      redhat.gd2
  check_command        check-host-alive
  check_period         24x7
  notification_period  24x7
  contact_groups       linux-infrastructure-admins
  use                  linux-host-template,Default_monitor_server
}

The "Default monitor server" template on the centralized server:

define host {
  name                    Default_monitor_server
  register                0
  active_checks_enabled   0
  passive_checks_enabled  1
  notifications_enabled   1
  check_freshness         0
  freshness_threshold     86400
}

And the linux-host-template template on that same centralized host:

define host {
   name                    linux-host-template
   register                0
   max_check_attempts      6
   check_interval          5
   retry_interval          1
   notification_interval   360
   notification_options    d,r
   active_checks_enabled   1
   passive_checks_enabled  1
   notifications_enabled   1
   check_freshness         0
   check_period            24x7
   not

Re: [Nagios-users] Not getting notifications when a service is in an UNKNOWN state

2013-05-23 Thread C. Bensend


> I am not sure what I'm doing wrong, I get notified when it's warning or
> critical but not unknown... I can't figure out why. Any suggestions?
> Below is the service check.
>
> define service{
>  hostgroup_name        hostgroup-win-2003,hostgroup-win-2008
>  service_description   Windows CPU check
>  check_command         check_snmp_load_v1!stand!55!95!!$USER2$
>  use                   generic-service-pnp
>  notification_options  u,w,c,r
>  notification_period   workhours
>  contacts              jeremy.p...@gilbarco.com
>  check_interval        15
>  retry_check_interval  10
>  }
>
> and the command definition:
> define command {
>  command_name    check_snmp_load_v1
>  command_line    $USER1$/check_snmp_load.pl -H $HOSTADDRESS$ -C
> $ARG5$ -T $ARG1$ -w $ARG2$ -c $ARG3$ $ARG4$ -f
>  }

And your contact definition?



-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"  -- George Carlin



Re: [Nagios-users] Help with CPU Check Thresholds

2013-03-26 Thread C. Bensend


> This is how I configured the service, my aim is to get an alert when
> the CPU load  ( uptime ) reaches 10% and a critical when there is a
> 20%
>
> check_command             check_nrpe!check_load!10,4,3!20,15,10
> flap_detection_enabled    0
> notifications_enabled     1
> notification_options      w,u,r,c
> notification_period       24x7
> check_period              24x7
> check_interval            1
> max_check_attempts        2
> first_notification_delay  0
> notification_interval     1
> }
>
> The problems is that I get WARN when the load is less than that:
> WARNING - load average: 1.77, 1.94, 3.04
> WARNING - load average: 2.11, 2.23, 3.45
> WARNING - load average: 1.90, 3.59, 4.34
> WARNING - load average: 5.65, 5.05, 4.86

You configured it to warn when the 15-minute average is 3.00, and
in your above four examples, the 15-minute averages are all > 3.00.
It is working like you configured it to.

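The plugin's output is 1-minute, 5-minute, and 15-minute average,
and the three warning (10,4,3) and critical (20,15,10) thresholds
apply to those averages in the same order.  These are load averages,
not percentages.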
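
A rough sketch of how the thresholds line up with the three averages,
assuming the plugin flags a state when an average is strictly greater
than its threshold (classify_load is a made-up helper for illustration,
not part of check_load):

```shell
#!/bin/bash
# Hypothetical helper: compare 1-, 5- and 15-minute load averages
# against comma-separated warning and critical triples.
# Arguments: load1 load5 load15 warn_csv crit_csv
classify_load() {
    awk -v l="$1,$2,$3" -v w="$4" -v c="$5" 'BEGIN {
        split(l, la, ","); split(w, wa, ","); split(c, ca, ",")
        state = "OK"
        for (i = 1; i <= 3; i++) {
            # "+ 0" forces a numeric comparison
            if ((la[i] + 0) > (ca[i] + 0)) { state = "CRITICAL"; break }
            if ((la[i] + 0) > (wa[i] + 0)) state = "WARNING"
        }
        print state
    }'
}

# First example from the question: only the 15-minute average (3.04)
# is above its warning threshold (3).
classify_load 1.77 1.94 3.04 10,4,3 20,15,10
```

Under that assumption, all four quoted results come out as WARNING
because at least one average sits above its own warning threshold.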

Benny


-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"  -- George Carlin



Re: [Nagios-users] check_nt - MEMORY USAGE - incorrect results

2013-01-10 Thread C. Bensend


Not entirely accurate.

I just started troubleshooting a Win2008 R2 system yesterday - it has
16GB of physical RAM + 16 GB pagefile for a total of 32GB of virtual
memory.

The system is using 10.9GB of physical RAM, yet check_nt tells me
it's using 2.69GB.  Completely wrong, even if check_nt was only
talking about physical, only talking about virtual, or talking about
the sum.

Solution?  Remove yet another checkcommand using that outdated
program.

Benny


> This is because virtual memory is enabled on your Server 2008 box.
> Go to the Computer properties and look under Performance: for example,
> on the "R2 (x64) server box (has SQL installed on it), 12GB ram
> installed", you will see 12GB of virtual memory.  In the end, Nagios
> reports the sum of the two (virtual memory + RAM).
>
>
>
> 2013/1/9 Andrew Thompson 
>
>>  Hi all,
>>
>> Using the supplied check_nt plugin to check Memory Usage on Windows
>> servers.
>>
>> Some report correctly, others report a complete load of old tosh!!!
>>
>> I have tried 3 different versions of Windows OS, the version seems to
>> make no odds.
>>
>> Doesn't matter if 32 or 64 bit either.
>>
>> Some examples
>>
>> MY primary domain controller  Windows Server 2008 R2 (x64)  8GB ram
>> installed
>>
>> Output from the check appears correct:
>>
>> Memory usage: total:8205.64 Mb - used: 2902.96 Mb (35%) - free: 5302.67
>> Mb (65%)
>>
>> Another 2008 R2 (x64) server box (has SQL installed on it)  12GB ram
>> installed
>>
>> Output thinks it's got 24GB:
>>
>> Memory usage: total:24573.16 Mb - used: 1796.71 Mb (7%) - free: 22776.45
>> Mb (93%)
>>
>> A Server 2003 Standard (x86) box (an internal test web server)  512MB
>> ram installed
>>
>> Output thinks it's got over 1GB:
>>
>> Memory usage: total:1257.50 Mb - used: 333.30 Mb (27%) - free: 924.20 Mb
>> (73%)
>>
>> A Server 2012 (x64) box (with HyperV installed)  28GB ram installed
>>
>> Output thinks it's got 32GB:
>>
>> Memory usage: total:32500.80 Mb - used: 16709.37 Mb (51%) - free:
>> 15791.43 Mb (49%)
>>
>> Anybody any ideas as to why check_nt is returning incorrect info. I know
>> it's incorrect but Nagios doesn't so where exactly is it reading these
>> values from?
>>
>> Thanks in advance for anybody's input.
>>
>> Regards
>>
>>
>>
>
>
>
> --
> Cordialement,
>
>  Omar SADDIKI
>  Master Réseaux et Systèmes


-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"  -- George Carlin



Re: [Nagios-users] Inconsistency of Nagios

2013-01-03 Thread C. Bensend


> If I understand correctly, I should create some plugin to kill all
> dependency process on a periodic interval. In my observation I did not see
> multiple parent process.

My recommendation was to write a plugin to *detect* multiple parents,
not kill them.

> Currently, whenever I see such sluggishness I stop the nagios
> service, clean up the checkresult directory, and start nagios again.

However, with your further note above, I don't think you're getting
multiple daemons running.

Forgive me, I don't recall the full details of your installation -
are you running any sort of NDO module?  NDOUtils?  Are you processing
perfdata?  If so, via what mechanism?


-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"  -- George Carlin


___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] Inconsistency of Nagios

2013-01-03 Thread C. Bensend


I think you'll have to write one...  check_procs is not helpful
in this case, as the daemon forks off many processes to run plugins.
Just checking for number of Nagios processes won't help, as it won't
be aware of parent/child relationships.

Now, mind you, I *do* run check_procs on my Nagios servers just to
make sure I don't have runaways.  But it won't tell me if I have more
than one daemon running.

If you really want to work on this, you'll have to write a plugin
that is able to follow the parent/child relationships (take a look
at the ps man page) and is able to determine if there's more than
one parent process.

I *think* that is a decent direction to go.
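
As a rough illustration of the approach described above, here is a
minimal Python sketch. It assumes the daemon's process name is
"nagios" and that `ps -eo pid,ppid,comm` is available (as on Linux);
the function names are made up for this example. It counts only the
"nagios" processes whose parent is not itself a "nagios" process.

```python
import subprocess


def find_top_level(ps_output, name="nagios"):
    """Return PIDs of `name` processes whose parent is not also `name`."""
    procs = {}
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, ppid, comm = fields
        procs[int(pid)] = (int(ppid), comm.strip())
    named = {pid for pid, (_, comm) in procs.items() if comm == name}
    # A "parent" is a matching process whose own parent does not match.
    return sorted(pid for pid in named if procs[pid][0] not in named)


def evaluate(ps_output, name="nagios"):
    """Map the number of top-level daemons to a Nagios exit code and message."""
    parents = find_top_level(ps_output, name)
    if len(parents) == 1:
        return 0, "OK - one %s daemon (pid %d)" % (name, parents[0])
    if not parents:
        return 2, "CRITICAL - no %s daemon found" % name
    return 2, "CRITICAL - %d %s daemons: %s" % (
        len(parents), name, ", ".join(map(str, parents)))


def main():
    """Run ps, print the plugin line, and return the exit code."""
    out = subprocess.run(["ps", "-eo", "pid,ppid,comm"],
                         capture_output=True, text=True, check=True).stdout
    code, message = evaluate(out)
    print(message)
    return code
```

A deployed plugin would finish with `sys.exit(main())`. One caveat:
a check child that has been reparented to init but has not yet
exec'd its plugin would briefly look like a second parent, so a real
plugin might also compare the result against the PID in the Nagios
lock file.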


> Slightly off topic, but how best to write nagios check that checks for
> this specific behavior (multiple instances of nagios running) ?
>
>
> - Original Message -
> From: "Mike Guthrie" 
> To: "Nagios Users List" 
> Sent: Wednesday, January 2, 2013 3:23:58 PM
> Subject: Re: [Nagios-users] Inconsistency of Nagios
>
>
>
> Typically when I've seen behavior like this, it's because there are
> multiple parent processes of Nagios running, so both instances are
> launching checks, and reaping each others results. Try killing off all
> Nagios processes, and then starting it fresh again to see if that resolves
> the issue.
>
> /etc/init.d/nagios stop
> killall -9 nagios
> /etc/init.d/nagios start
>
>
>
>
> On 1/2/2013 4:38 AM, Srikanth Gumma wrote:
>
>
>
> Hi,
>
>
> I need some help regarding nagios.
>
>
> We have around 500 Linux servers for which we do ping and SSH
> monitoring only. All checks run remotely from the Nagios server and no
> NRPE service is deployed. However, Nagios behaves very inconsistently:
> sometimes I don't see any updates on the Nagios console for more than
> a week.
>
>
> Our Nagios is the latest version, Nagios Core 3.4.3, installed on
> CentOS 6.2, and the only messages I can see in /var/log/messages
> look like the one below:
>
>
>
>
>
> 'SSH' on host 'xyz' looks like it was orphaned (results never came back).
> I'm scheduling an immediate check of the service...
>
>
>
>
> any help is highly appreciated.
>
>
>
>
> Regards
> Srikanth
>
>
> --
>
>
> Mike Guthrie
> Technical Team
> ___
> Nagios Enterprises, LLC
> Email: mguth...@nagios.com Web: www.nagios.com
>


-- 
"The very existence of flamethrowers proves that sometime, somewhere,
someone said to themselves, 'You know, I want to set those people
over there on fire, but I'm just not close enough to get the job
done.'"  -- George Carlin



Re: [Nagios-users] Weird Nagios Problem

2012-12-04 Thread C. Bensend


> I have been running Nagios for over a year with no issues.  All of a
> sudden, all of my current loads on my linux servers all go into warning
> state at the same time, showing the exact same load, which then increments
> every hour to critical.  After a while (3 or 4 hours)  they all come back
> down to normal.
>
> Checking on the servers themselves using HTOP shows normal load levels
> throughout the time period.

Hmmm, yeah.  Check that service and checkcommand definition.  I bet
you're actually testing the load on the *Nagios* server, and not
the individual servers you think you're testing it on.

What's the Nagios server's load during that time?  I bet it matches
up...


-- 
"Unless you're a lawyer, you don't understand Oracle licensing.
That applies equally to Oracle employees as well as customers."
  -- Me, 2012-05-10




[Nagios-users] Distributed monitoring: v3.4.1 not translating host states like it should

2012-10-30 Thread C. Bensend


Hey folks,

   I am in the process of implementing a distributed monitoring
architecture, and I'm having some problems with host state.  Here
are the specs:

Nagios v3.4.1
RHEL 6.3
Using NSCA to send results to passive collector

   Yes, I have 'translate_passive_host_checks' set on the collector.  :)

   So, the system is up and running, and I do see host alerts in
/var/log/messages on the collector.  However, in the web interface,
all hosts remain "up".  I can go into the host details for a host
that's offline because of Sandy, and it reports a host status of
"UP", with the status information "PING CRITICAL - Packet loss 100%".

   Obviously, the host states coming from the passive monitors are
not being translated.

   Active host and service checks are disabled on the collector,
and enabled on the monitors.  Passive host and service checks are
enabled everywhere, and the collector *is* receiving them.

   I'd appreciate it if someone can help me out here...  I'll
provide whatever details are necessary...

Thanks much!

Benny


-- 
"Unless you're a lawyer, you don't understand Oracle licensing.
That applies equally to Oracle employees as well as customers."
  -- Me, 2012-05-10




Re: [Nagios-users] Nagios Plugin Log Pattern Notification

2012-09-04 Thread C. Bensend


> http://labs.consol.de/lang/en/nagios/check_logfiles/
>
> check_logfiles is one of the more powerful plugins.

Couldn't agree more.  The consol.de guys are great!

Benny


-- 
"Death rays, advanced technology or not, no creature wants to be
stabbed in their hoo-hoo."-- Seen on zombiehunters.org



Re: [Nagios-users] Dynamic warning/critical thresholds

2012-07-10 Thread C. Bensend


>> You've already received two replies, both stating that you'll
>> likely have to write some code to do it.  I'm not aware of
>> any common plugins out there that calculate rates of change and
>> alert appropriately.  Maybe they exist, but I don't recall
>> seeing any of them.
>>
>> Have you tried any of the plugin sites?
>>
>>
>
> Oh, I didn't receive any replies. Presumably the mails got lost in the
> ether.
>
> I'm happy to write code - I just wondered if there was a built-in way of
> doing this.

Not to my knowledge, no - the standard Nagios plugins don't know
about rate of change, and I haven't run across many (any?) third-
party plugins that do.

The difficult part is retaining state - yes, it's simple to use
a statefile, but if you have a lot of services you could end
up with thousands of state files.  It can become pretty ugly to deal
with them.

Your original message (and consequently, the replies you missed)
can be found here:

http://marc.info/?l=nagios-users&m=134037453807273&w=2


-- 
"Death rays, advanced technology or not, no creature wants to be
stabbed in their hoo-hoo."-- Seen on zombiehunters.org



Re: [Nagios-users] Dynamic warning/critical thresholds

2012-07-10 Thread C. Bensend


> On 22/06/12 15:11, Jonathan Gazeley wrote:
>> I've got a bunch of Nagios plugins that monitor things like
>> DNS/HTTP/RADIUS hits per second.
>>
>> I've set what I believe to be sensible max/min warning thresholds but
>> what I really want is dynamic thresholds. If some quantity suddenly
>> doubles or halves, I'd like an alert.
>>
>> For example, if I usually serve 10 DNS lookups per second, and suddenly
>> it is doing 20 per second, that isn't a "fault" but I would like to know
>> about it, because it might mean there is a problem with the network in
>> general.
>>
>> Is there a way of doing this?
>>
>
> Any ideas?

You've already received two replies, both stating that you'll
likely have to write some code to do it.  I'm not aware of
any common plugins out there that calculate rates of change and
alert appropriately.  Maybe they exist, but I don't recall
seeing any of them.

Have you tried any of the plugin sites?


-- 
"Death rays, advanced technology or not, no creature wants to be
stabbed in their hoo-hoo."-- Seen on zombiehunters.org



Re: [Nagios-users] check_logfiles

2012-07-08 Thread C. Bensend


>  Here's the run of the command I am trying
>
>  [db:~] root% /opt/nagios/libexec/check_logfiles
> --logfile=/u01/app/oracle/admin/ecom/bdump/alert_ecom1.log --tag=oracle
> --rotation=linux --criticalpattern='ORA-00600' --warningpattern='ORA-*'
> OK - no errors or warnings|oracle_lines=0 oracle_warnings=0
> oracle_criticals=0 oracle_unknowns=0
>
>
>  This is what is in that logfile -
>
> [db07:~] root% grep 'ORA-00600'
> /u01/app/oracle/admin/ecom/bdump/alert_ecom1.log
> ORA-00600 - This is only a test.. please disregard

Try using the allyoucaneat option to test on the command line...
IIRC, check_logfiles will only check a reasonable number of lines in
the log file the first time, and from that point on only new ones.  If
that ORA-00600 is a long ways back, check_logfiles may not grok it.
The allyoucaneat option should force the plugin to check *all* lines
in the file.

Benny


-- 
"Death rays, advanced technology or not, no creature wants to be
stabbed in their hoo-hoo."-- Seen on zombiehunters.org



Re: [Nagios-users] Dynamic warning/critical thresholds

2012-06-22 Thread C. Bensend


> I've got a bunch of Nagios plugins that monitor things like
> DNS/HTTP/RADIUS hits per second.
>
> I've set what I believe to be sensible max/min warning thresholds but
> what I really want is dynamic thresholds. If some quantity suddenly
> doubles or halves, I'd like an alert.
>
> For example, if I usually serve 10 DNS lookups per second, and suddenly
> it is doing 20 per second, that isn't a "fault" but I would like to know
> about it, because it might mean there is a problem with the network in
> general.
>
> Is there a way of doing this?

There's always a way.  :)

However, in this case, you're probably going to have to write a
plugin to do it.  You're asking to alert on a rate of change, and
I can't think of any of the stock plugins that do that.  Keeping state
between polling runs is something that can get a bit ugly.

Do some rooting around the plugin community (the Nagios Exchange
and/or the Monitoring Exchange) to see if you can find some
examples of rate-aware plugins.  While it's not rate that it's
tracking, I know the check_iptraf*.pl plugins will at least keep
state between polling cycles, so that might be somewhere to start.
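
To make the "keeping state" part concrete, here is a minimal Python
sketch of a rate-aware check, not an existing plugin. It assumes the
value being checked is already a rate (say, DNS lookups per second)
and compares it only with the previous sample stored in a JSON state
file; the function names, the state file layout, and the factor of
2.0 are all illustrative assumptions.

```python
import json
import os
import tempfile


def load_state(path):
    """Return the previously stored value, or None on the first run."""
    try:
        with open(path) as handle:
            return json.load(handle)["value"]
    except (OSError, ValueError, KeyError):
        return None


def save_state(path, value):
    """Write the sample via a temp file and rename so it is never half-written."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump({"value": value}, handle)
    os.replace(tmp, path)


def check_change(path, value, factor=2.0):
    """Return (exit_code, message); WARNING if value grew or shrank by `factor`."""
    previous = load_state(path)
    save_state(path, value)
    if previous is None:
        return 0, "OK - first sample, baseline is %.2f" % value
    grew = value > previous and value >= previous * factor
    shrank = value < previous and value <= previous / factor
    if grew or shrank:
        return 1, "WARNING - value changed from %.2f to %.2f" % (previous, value)
    return 0, "OK - value %.2f (previous %.2f)" % (value, previous)
```

Comparing against a single previous sample is the simplest possible
baseline; a noisy metric would probably need a rolling average
instead. The sketch uses one state file per service, which is exactly
the file sprawl to watch out for on a large installation.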

Benny


-- 
"Death rays, advanced technology or not, no creature wants to be
stabbed in their hoo-hoo."-- Seen on zombiehunters.org



Re: [Nagios-users] check_procs returning wrong data

2012-06-19 Thread C. Bensend


> I finally got it working but it was not that easy. As I am using CentOS 5,
> by default the requiretty value in the /etc/sudoers file is activated, so
> I had to edit it like this:
> #Defaults    requiretty
> nagios ALL=(ALL) NOPASSWD:/usr/local/nagios/libexec/check_procs
>
> And the command in the .cfg file would be like this:
> command[check_total_procs]=sudo /usr/local/nagios/libexec/check_procs -w 150 -c 200

It's a bit safer to use this right before the user and command
definition:

Defaults:nagios !requiretty

That way, you're leaving the restriction in place for *other* users,
you're just overriding it for the nagios user.

Benny


-- 
"Death rays, advanced technology or not, no creature wants to be
stabbed in their hoo-hoo."-- Seen on zombiehunters.org



Re: [Nagios-users] High Service Check Latency

2012-05-22 Thread C. Bensend


> I've some broker modules to handle sql logging and distributed setup.

I bet you're using NDOUtils.  I wouldn't recommend that.  I couldn't
keep a Nagios server with under 6000 services limping along when
NDOUtils was running.  Eventually, the check latencies would go
through the roof and the entire server would get farther and farther
behind.

I went to Livestatus.  It took me all of 20 minutes to adjust my
reports to use the new interface, and I haven't restarted my Nagios
daemon since (other than normal maintenance).


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] does monitoring stop while nagios is flushing queued items?

2012-05-18 Thread C. Bensend


> My installation is centreon + nagios.
> Sometimes I need to do maintenance on mysql so I stop ndo2db and let
> nagios cache the result first.
> And when I start ndo2db, nagios will start flushing the items.
> I notice from the service perf data file, the data stops coming (or nagios
> not polling new data) while nagios is flushing queued item.
>
> I just want to confirm whether it is the behavior of nagios? If yes, any
> workaround for this?

This is one of the big reasons I stopped using NDOUtils - the broker
would regularly block the Nagios process.  So yes, you're correct -
your NDOUtils broker is blocking, and nothing is happening during
these periods of maintenance.

That, and the check latency.  With NDOUtils, I couldn't let my
Nagios daemons run a full week without restarting them or the
check latencies would shoot through the roof (that's a full restart,
not a reload).


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] How many hosts and services are you monitoring with Nagios?

2012-05-17 Thread C. Bensend


> Nothing bad about using a VM, they just fall over horribly (generally
> speaking) when you try to push the virtual machine's virtual CPU cores
> and disk hard :p - kudos to you for making that work and pretty
> interesting setup!
>
> Thanks for sharing.

Happily, I haven't had the [dis]pleasure of hitting that particular
scenario.

A more intelligent design would be to have physical server pairs,
as I really don't like the reliance on our VM infrastructure.
However, the ESX team has their own monitoring, and with the
stability they've shown (knock on wood, heh) it wasn't deemed
worth it to increase our physical host count for this purpose.

Soo close to getting rid of all of our physical boxes...


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] How many hosts and services are you monitoring with Nagios?

2012-05-17 Thread C. Bensend


> What kinds of numbers of hosts and services are you all monitoring?
> Which add-ons / distributed frameworks are you using?

At my ${CURRENT_JOB}, I'm monitoring around 600 hosts with just under
6000 services on a single VM running RHEL 5.  I do process perfdata
on the same node, and replicate all config data and state data to
a warm standby (also a VM).  Replication is done via MySQL replication
(for the config data) and NSCA (for the state data).  A custom perl
program dumps the extended state data (disabled notifications,
acknowledgements, etc) for import if needed.

Yes, I know, VM bad.  :)  Just not bad enough to spend real dollars on
more physical hosts.

This year, I will be bringing up a second pair of monitoring hosts at
a secondary data center, with much the same architecture.

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] Querying nagios object information through command line

2012-05-16 Thread C. Bensend


> Is there a script or a module that can be called through a command line
> and can retrieve nagios object definition , host , service?
>
> I am thinking of calling from php program.
>
> I found that config.cgi has an ability of fetching the object definition
> but it seems that it returns html info.
>
> It would be helpful if someone can share their thoughts.

Livestatus can do this, and it's MUCH quicker/more lightweight/better
(IMHO) than NDOUtils.

http://mathias-kettner.de/checkmk_livestatus.html
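
For a feel of what a Livestatus query looks like, here is a minimal
Python sketch. It assumes the Livestatus broker module is loaded and
listening on a UNIX socket; the socket path in the usage note is only
a placeholder, and the helper names are invented for this example.
The request itself is plain text: a GET line, optional header lines,
and a blank line to end it.

```python
import json
import socket


def build_query(table, columns, filters=()):
    """Build a Livestatus request that asks for JSON output."""
    lines = ["GET %s" % table, "Columns: %s" % " ".join(columns)]
    lines += ["Filter: %s" % item for item in filters]
    lines.append("OutputFormat: json")
    return "\n".join(lines) + "\n\n"  # the blank line ends the request


def query_livestatus(socket_path, query):
    """Send one query to the Livestatus UNIX socket and decode the JSON reply."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    finally:
        sock.close()
    return json.loads(b"".join(chunks).decode("utf-8"))
```

With a hypothetical socket at /usr/local/nagios/var/rw/live, a call
such as `query_livestatus(path, build_query("hosts", ["name",
"address", "check_command"]))` should return one list per host. The
same text protocol can be spoken from PHP over a UNIX socket, which
is what the original question was after.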

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] nagios backup

2012-05-12 Thread C. Bensend


> Backup done successfully. All hosts are imported and being monitored, but
> a few things are not working, like SMS messages and mail messages.
>
> Does it need to be backed up from some directory?

You need to examine your notification commands.  You may have used
some third party software to send SMS messages, and that may not
be installed on the new system.  Also, your email configuration
may not be the same or may be incomplete on the new system.


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] Performance data not being returned

2012-05-10 Thread C. Bensend


> I've narrowed it down to a stage where running the plugin directly
> returns the right results, but running the plugin through check_nrpe on
> localhost returns this:
>
> [jg4461@dhcp1 log]$ /usr/lib64/nagios/plugins/check_nrpe -H localhost -c
> check_dhcpd_pools
> OK - all pools less than 80% full |
>
> What could cause NRPE to truncate the results in such a way?

Too much data?

Are you using SSL?

I don't know that I've seen this behavior before - it's always
been *invalid* perfdata that have caused this issue for me.


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] Performance data not being returned

2012-05-10 Thread C. Bensend


> The plugin is being executed through NRPE. Executing the plugin by hand
> seems to return valid perfdata:
>
> [jg4461@dhcp1 ~]$ /usr/lib64/nagios/plugins/check_dhcpd_pools
> OK - all pools less than 80% full | 'resnet-wireless-652'=43.769%;80;90,
> 'resnet-wireless-653'=47.923%;80;90,
> 'resnet-wireless-654'=46.201%;80;90,
> 'resnet-wireless-655'=44.681%;80;90,
> 'resnet-wireless-656'=47.720%;80;90,
> 'resnet-wireless-657'=47.112%;80;90,
> 'resnet-wireless-658'=42.452%;80;90, 'resnet-wireless-659'=0.304%;80;90,
> 'resnet-wireless-ratelimited-660'=1.114%;80;90,
> 'resnet-wireless-onlinepayment-661'=0.405%;80;90,
> 'resnet-wireless-onlinepayment-662'=0.405%;80;90,
> 'resnet-wireless-onlinepayment-663'=0.304%;80;90,
> 'resnet-wireless-consoles-665'=1.114%;80;90,
> 'resnet-wireless-message-666'=0.000%;80;90,
> 'resnet-wireless-instructions-667'=8.056%;80;90

http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN201

I think you might try spaces, not commas.  I have developed a
number of plugins, and I've never used anything but spaces to
delimit the performance data.  If Nagios doesn't believe that's
valid data, it's going to ignore it.
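
As a small illustration of the space-delimited form, here is a Python
sketch of how a plugin might assemble its output line. The function
names are made up for this example, and the thresholds of 80 and 90
are simply taken from the output quoted above.

```python
def format_perfdata(metrics, warn, crit):
    """Join 'label'=value%;warn;crit items with spaces, not commas."""
    return " ".join(
        "'%s'=%.3f%%;%s;%s" % (label, value, warn, crit)
        for label, value in metrics
    )


def plugin_line(status_text, metrics, warn=80, crit=90):
    """Return the status text, a pipe, and the perfdata on a single line."""
    return "%s | %s" % (status_text, format_perfdata(metrics, warn, crit))


pools = [("resnet-wireless-652", 43.769), ("resnet-wireless-659", 0.304)]
print(plugin_line("OK - all pools less than 80% full", pools))
```

The only change relative to the quoted plugin output is the
separator: each item keeps the same 'label'=value%;warn;crit shape.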


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] 2 Nagios boxes running together in different locations

2012-05-09 Thread C. Bensend


> Interesting - How does it work though - I mean if the firewall plays up
> at Site A, it thinks everything in Site B is down - so Nagios GUI marks
> everything as down - what happens then if say a server in Site B does
> actually go down - we will not get alerted to that?

That's correct.  But, your proposed configuration wouldn't solve
this problem - if the firewall fails, the Nagios servers can't
contact each other anyway, so they could never agree on what's up
and what's down.

> I made a slight error in my original description - when the firewall "goes
> down" it can't contact anything at both locations, not just Site A, due to
> the fact that the protected interface stays up but just denies all
> traffic.
>
> We are currently working on this with GTA but I'm losing the will to live
> with 300 texts virtually every night!!

I've dealt with this situation before, and I've ended up
implementing two mostly standalone Nagios systems.  They each
check their own site, so if their external network goes away they
are still able to monitor and alert for the things they're
responsible for (you have to use out-of-band notifications of
course).  They also each check each other's *site*, i.e. the other
site's firewall, so the Nagios server at site A can alert and let
you know if site B goes away, but it *doesn't* try to alert you
for all of the hosts and services at site B.


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] 2 Nagios boxes running together in different locations

2012-05-09 Thread C. Bensend


> We have a bit of a temperamental firewall at the moment that keeps "going
> down" thus resulting in everything appearing down to Nagios in Location A
> and it alerting like a lunatic for all hosts/services (88/156)

You could monitor the firewall, and configure it to be the parent of
the hosts behind it.  That way, when it "goes down", you only get the
alert for the firewall crapping out, and not all of the hosts that
depend on it.
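
A toy model of why the parent relationship cuts down the noise.  This
is only a sketch of the idea behind the "parents" directive, not how
Nagios implements it, and the host names are invented:

```python
def classify(hosts, parents, failed_checks):
    """Return {host: state}, where state is UP, DOWN, or UNREACHABLE.

    parents maps a host to the list of hosts it sits behind.
    failed_checks is the set of hosts whose own check failed.
    The parent graph is assumed to contain no cycles.
    """
    states = {}

    def state_of(host):
        if host in states:
            return states[host]
        if host not in failed_checks:
            states[host] = "UP"
        else:
            host_parents = parents.get(host, [])
            # With no parents, or with at least one parent still UP,
            # the failure belongs to this host itself.
            if not host_parents or any(state_of(p) == "UP" for p in host_parents):
                states[host] = "DOWN"
            else:
                states[host] = "UNREACHABLE"
        return states[host]

    for host in hosts:
        state_of(host)
    return states


if __name__ == "__main__":
    parents = {"web1": ["firewall"], "db1": ["firewall"]}
    failed = {"firewall", "web1", "db1"}
    print(classify(["firewall", "web1", "db1"], parents, failed))
```

In this model only the firewall ends up DOWN; the hosts behind it are
UNREACHABLE, and whether those notify depends on the notification
options you have set for them.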


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] notifications

2012-05-01 Thread C. Bensend


> I am using Nagios 3.3.1
>
> I have got notifications by SMS working now
>
> Is there a way of defining what notifications go to email, what go to SMS
> and what can go to both.
>
> I would like this to apply to escalations as well if possible

I create two Nagios contacts for each person at my site, one for
email alerts and one for SMS alerts.  I then place the appropriate
contacts in each contactgroup, according to which type of alert
should be sent.

   Then, for each host/service, I include the appropriate
contactgroups.  For example, my Exchange servers' CPU services
get the exchange-admins-email contactgroup, which only sends
email to their contacts.  The Exchange servers' database
services, however, get the exchange-admins-pagers group, so they
get SMS'ed for database problems.
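
A rough sketch of the two-contacts-per-person layout, written as a
small script that prints object definitions.  Everything named here is
an assumption for illustration: the generic-contact template and the
notify-*-by-email commands come from the sample configs, while the
notify-*-by-sms commands and the "-email"/"-pager" naming are
hypothetical and would need to match your own setup:

```python
def render_object(kind, directives):
    """Render one Nagios object definition as text."""
    lines = [f"define {kind} {{"]
    for key, value in directives.items():
        lines.append(f"    {key:<32}{value}")
    lines.append("}")
    return "\n".join(lines)


def contacts_for(person, email, pager):
    """Return (email_contact, sms_contact) definitions for one person."""
    email_contact = render_object("contact", {
        "contact_name": f"{person}-email",
        "use": "generic-contact",  # assumed template name
        "email": email,
        "host_notification_commands": "notify-host-by-email",
        "service_notification_commands": "notify-service-by-email",
    })
    sms_contact = render_object("contact", {
        "contact_name": f"{person}-pager",
        "use": "generic-contact",  # assumed template name
        "pager": pager,
        # Hypothetical command names; use whatever your SMS setup defines.
        "host_notification_commands": "notify-host-by-sms",
        "service_notification_commands": "notify-service-by-sms",
    })
    return email_contact, sms_contact


if __name__ == "__main__":
    for definition in contacts_for("alice", "alice@example.com", "5551234567"):
        print(definition)
        print()
    print(render_object("contactgroup", {
        "contactgroup_name": "exchange-admins-email",
        "members": "alice-email",
    }))
```

The same contactgroups can then be referenced from escalations, so
the email/SMS split carries over there as well.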

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] use_large_installation_tweaks

2012-04-04 Thread C. Bensend


> Has anyone configured it?
>
> Is it really a good way to reduce memory usage?

For me, it was a good way to reduce memory and CPU, and it helped
with check latencies.

Although, the absolute best way to reduce check latencies for me
has been to dump NDOUtils.  Good lord, that was awful, had to restart
Nagios three times a week.

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] Building a reliable uptime monitoring model

2012-03-20 Thread C. Bensend


> So I was wondering how is everyone reliably checking and notifying the
> intended audience of server reboots with high rate of success.

I use check_logfiles from the Consol.de guys to watch for the actual
event or log entry specifying a reboot.  I don't count on the server
being down long enough to trigger a host down/host up alert.

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] Can someone at Nagios Enterprises please take a look at old.nagios.org?

2012-03-16 Thread C. Bensend


Nope, the actual list of commands linked at the bottom of that page.

Benny


> This one?
>
> http://nagios.sourceforge.net/docs/3_0/extcommands.html
>
> On Fri, Mar 16, 2012 at 7:40 PM, C. Bensend  wrote:
>
>>
>> I've been trying to get to the external commands reference for several
>> hours, keep getting "Error connecting to MySQL server"...
>>
>> Thanks!
>>
>> Benny
>>
>>
>> --
>> "The problem with quotes on the internet is that it's very hard to
>> verify their authenticity."   -- Abraham Lincoln
>>
>>
>>


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




[Nagios-users] Can someone at Nagios Enterprises please take a look at old.nagios.org?

2012-03-16 Thread C. Bensend


I've been trying to get to the external commands reference for several
hours, keep getting "Error connecting to MySQL server"...

Thanks!

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




[Nagios-users] Performance data not being written to file with 3.3.1

2012-03-05 Thread C. Bensend


Hey folks,

   So, I have the following setup after some re-architecting this past
weekend:

* Primary Nagios server running 3.2.3
* Secondary Nagios server running 3.3.1, receiving all check results
  via NSCA

   Everything should be identical between the primary and secondary
servers, other than the secondary system not running active checks
and having notifications disabled.  Each system should process its
own performance data.

   However, I'm wrestling with the new secondary server...  I have
it configured to write host and service perfdata to a file, and then
npcd processes that perfdata from there.  Unfortunately, the host and
service perfdata files are being written with no data in them (0
bytes).

   I'm *getting* the perfdata from the primary host - I can view it
in the secondary host's web interface.  It's there.  But for some
reason, the secondary Nagios daemon isn't writing that data into the
file, but it *is* creating the file.

   I don't know of any reason this shouldn't work...  Does anyone
with more knowledge of the nuts-n-bolts know why a passive Nagios
daemon (no active checks, all data received via NSCA) wouldn't
write the perfdata it receives?  It thinks it has data - the host
and service perfdata files are created and removed as the Nagios
daemon creates them, and the process_* commands process them.

   I'll provide whatever details are necessary, I just want to
verify the basic premise of my setup before flooding you with
information.  :)

Thanks much!

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] Disk I/O monitoring in Nagios

2012-03-02 Thread C. Bensend


> Would like to ask how to add Disk I/O monitoring on Nagios? We are using
> NSClient++ agent. Do we still have to use specific "check_io" or
> something like that to monitor it?
>
> If there is a documentation, we would be glad to look into it.

I use the CheckCounter functionality built in to NSClient++ to
monitor the performance counter for the particular volume.  It
works very well, and I get performance data to graph it with
PNP4Nagios.

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] Nagios 3.2.3 -> 3.3.1 upgrade path

2012-03-02 Thread C. Bensend


>> I just want to make sure my 3.2.3 system and my 3.3.1 system
>> will be able to talk.  :)
>>
>
> They will, so no worries there.

Fantastic.  Thanks, Andreas!

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




[Nagios-users] Nagios 3.2.3 -> 3.3.1 upgrade path

2012-03-01 Thread C. Bensend


Hey folks,

   I'm planning a migration to 3.3.1, and I had a quick question
for those of you that have done it.

   I have a manual failover setup, with one monitoring node that
sends all results to another warm standby system via NSCA.  If I
rebuild one system to 3.3.1 and the active monitoring node remains
on 3.2.3 for a week or two, are there going to be any issues?  I
want to be sure they're compatible enough to run for a short time,
so I'm not rebuilding my entire environment in an afternoon.

   Normally, I'd just upgrade the software and go, but I'm taking
this opportunity to make some other adjustments to my system, so
I'll be doing bare-metal installs from the OS up.

   I just want to make sure my 3.2.3 system and my 3.3.1 system
will be able to talk.  :)

Thanks much!

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] NRPE allowed_hosts directive

2012-02-29 Thread C. Bensend


> tried putting the IP addresses of all the hosts in the network. However,
> when I assign this variable to all the IP addresses (which is very long),

U...  Just how many Nagios servers do you HAVE?

That configuration option is to list the Nagios servers that will
be polling your NRPE daemon, not "all the hosts in the network".
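
A toy model of what that option is for.  This is not NRPE's own
matching code, just a sketch that handles plain IP addresses and CIDR
ranges; the addresses are documentation examples:

```python
import ipaddress


def parse_allowed_hosts(value):
    """Split a comma-separated allowed_hosts value into networks."""
    return [
        ipaddress.ip_network(entry.strip(), strict=False)
        for entry in value.split(",")
        if entry.strip()
    ]


def is_allowed(source_ip, allowed):
    """True if the connecting address falls inside any allowed entry."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowed)


if __name__ == "__main__":
    # Only the pollers are listed, not every host on the network.
    allowed = parse_allowed_hosts("127.0.0.1, 192.0.2.10")
    print(is_allowed("192.0.2.10", allowed))
    print(is_allowed("192.0.2.55", allowed))
```

With one or two Nagios servers, the list stays a handful of entries
no matter how many hosts you monitor.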

Just wanted to make sure you're understanding the option correctly...

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] Monitoring log files of oracle DB on windows server using nagios

2012-02-27 Thread C. Bensend


> Need the best solution to monitor log files of the DB server actually
> oracle DB log files on windows server. Please suggest what can be the best
> way to achieve this.

check_logfiles from the Consol.de guys is your friend.  It even
groks the Oracle log file formats.

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] Eventlog monitoring through NSClient++

2012-02-24 Thread C. Bensend


> We actually use check_logfiles with NSClient so haven't seen this, and we
> have tons of rules.  Might be worth looking at.  Not that anything is
> wrong with NSClient :)  just check_logfiles also has more regex and
> options.

+1

The consol.de guys are awesome, and check_logfiles is another
example of their excellent contributions to the community.

Benny


-- 
"The problem with quotes on the internet is that it's very hard to
verify their authenticity."   -- Abraham Lincoln




Re: [Nagios-users] monitoring dhcp

2012-01-24 Thread C. Bensend


> I don't think you'll have much trouble getting this via SNMP. It is
> defined in the MIB on a per-scope basis; I suggest you do an snmpwalk on
> the OID I gave earlier and see what you get.
>
> MIB excerpt:
>
>    scopeTable OBJECT-TYPE
>        SYNTAX  SEQUENCE OF ScopeTableEntry
>        ACCESS  read-only
>        STATUS  mandatory
>        DESCRIPTION
>            "A list of subnets maintained by the server"
>        ::= { dhcpScope 1 }
>
>    scopeTableEntry OBJECT-TYPE
>        SYNTAX  ScopeTableEntry
>        ACCESS  read-only
>        STATUS  mandatory
>        DESCRIPTION
>            "This is the row corresponding to a subnet"
>        INDEX   { subnetAdd }
>        ::= { scopeTable 1 }
>
>    ScopeTableEntry ::= SEQUENCE {
>        subnetAdd
>            IpAddress,
>
>        noAddInUse
>            Counter,
>
>        noAddFree
>            Counter,
>
>        noPendingOffers
>            Counter
>    }
>
>    subnetAdd OBJECT-TYPE
>        SYNTAX  IpAddress
>        ACCESS  read-only
>        STATUS  mandatory
>        DESCRIPTION
>            "This is the subnet address "
>        ::= { scopeTableEntry 1 }
>
>    noAddInUse OBJECT-TYPE
>        SYNTAX  Counter
>        ACCESS  read-only
>        STATUS  mandatory
>        DESCRIPTION
>            "This is the no. of addresses in use"
>        ::= { scopeTableEntry 2 }
>
>    noAddFree OBJECT-TYPE
>        SYNTAX  Counter
>        ACCESS  read-only
>        STATUS  mandatory
>        DESCRIPTION
>            "This is the no. of addresses that are free "
>        ::= { scopeTableEntry 3 }
>
>    noPendingOffers OBJECT-TYPE
>        SYNTAX  Counter
>        ACCESS  read-only
>        STATUS  mandatory
>        DESCRIPTION
>            "This is the no. of addresses that are currently in the offer
>            state"
>        ::= { scopeTableEntry 4 }
>
>    END

Thank you, Giles!  This doesn't help me (I don't have SNMP enabled
on my hosts and don't plan on doing so), but it's good to know
for the future...

That's certainly better than what is exposed elsewhere...
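
For anyone who does have SNMP enabled, a sketch of how the two
counters in the excerpt (noAddInUse and noAddFree) could be turned
into a utilization check.  Fetching the values is left out, and the
function names and thresholds are assumptions for illustration:

```python
def scope_utilization(in_use, free):
    """Percentage of a scope's addresses in use, from the two counters."""
    total = in_use + free
    if total == 0:
        return 0.0
    return 100.0 * in_use / total


def evaluate(in_use, free, warn=80.0, crit=90.0):
    """Return (exit_code, text) in the usual plugin convention."""
    pct = scope_utilization(in_use, free)
    if pct >= crit:
        code, label = 2, "CRITICAL"
    elif pct >= warn:
        code, label = 1, "WARNING"
    else:
        code, label = 0, "OK"
    return code, f"{label} - scope {pct:.1f}% full ({free} free)"


if __name__ == "__main__":
    print(evaluate(in_use=180, free=20))
```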

Benny


-- 
"Cats land on their feet. Toast lands peanut butter side down. A cat
with toast strapped to its back will hover above the ground in a state
of quantum indecision."   -- Unknown



Re: [Nagios-users] monitoring dhcp

2012-01-24 Thread C. Bensend


> We have a Windows 2008 server with the DHCP role. There is an option to
> display the statistics of the scope:
> Total Addresses ..
> In Use .. %
> Available ..%
>
> Is it possible to get this information into Nagios?

I haven't had much luck with this.  Microsoft exposes hardly
any of this data via WMI or any other interface that I've found.
I haven't looked at Powershell yet, mostly because many of my
servers do not have it installed (2003 -vs- 2008).

The best I've been able to do is watch the event log for DHCP
server complaints about a scope getting close to consumed.  Even
*that* has been problematic, as the DHCP server service seems to
arbitrarily decide when it wants to complain.  I ended up writing
a custom plugin that watches the event log for those events, parses
the output, and decides whether it's appropriate to alert.

Benny


-- 
"Cats land on their feet. Toast lands peanut butter side down. A cat
with toast strapped to its back will hover above the ground in a state
of quantum indecision."   -- Unknown



Re: [Nagios-users] GSM gateway - virtualization ...

2012-01-18 Thread C. Bensend


> I'm running Nagios on OpenSUSE. OpenSUSE is a virtual machine under the
> Windows 2008 HYPER-V. I'd like to send SMS messages instead of e-mails,
> because mails can't be delivered in case the SMTP server is down. I
> don't know of any specific hardware for a GSM gateway, and I have no idea
> how to use it in a virtual environment: how to connect it to an RS232 port
> on the host machine and forward it to Nagios on the SUSE virtual machine.
> Would anyone be so kind as to give me some advice and share their opinions?

We run Nagios servers as virtual machines on mid-sized ESX clusters,
so our VMs might end up on any given ESX host at any given time.  As
a result, I virtualized our serial modem using the IOLAN product from
Perle.  It is basically a network-to-serial adapter.  Our serial USR
modem plugs into it, and it plugs into the network.

Then, on the Nagios servers (RHEL), the Perle software is installed
and creates a "serial port" that Nagios talks to.  That virtual
port simply talks to the IOLAN device over the network, and acts as
a local serial modem.  Now, our Nagios servers can end up anywhere
they like, and the modem always stays "local" as far as Nagios is
concerned.

It's not infallible, every now and again the two components lose
communications, but I check for that via Nagios.  There certainly
exists the possibility that we could:

1) Lose communications with the virtual modem

AND

2) Have a widespread network outage

at the same time that would completely clobber notifications, but
the chances are pretty minor, and the costs are very reasonable.

Benny


-- 
"Cats land on their feet. Toast lands peanut butter side down. A cat
with toast strapped to its back will hover above the ground in a state
of quantum indecision."   -- Unknown



Re: [Nagios-users] Scheduled downtime for mass hosts?

2012-01-12 Thread C. Bensend


> Ah Benny, I didn't know you could schedule maintenance for each individual
> host group
>
> I have my 86 hosts arranged into 4 hostgroups so I will just do this.
>
> 4 clicks and job done, thought I was in for the long haul by clicking all
> 86 hosts 1 by 1.
>
> Thanks, you have saved me a load of time!

You can issue commands to both hostgroups and servicegroups...  It
makes it much easier to deal with outages and scheduled maintenance.
And because hosts and services can be in more than one group, you
can arrange them however you like according to roles, network
placement, or even physical site.  :)

Benny


-- 
"Cats land on their feet. Toast lands peanut butter side down. A cat
with toast strapped to its back will hover above the ground in a state
of quantum indecision."   -- Unknown



Re: [Nagios-users] Scheduled downtime for mass hosts?

2012-01-12 Thread C. Bensend


> Tonight I will be performing maintenance on over 50 of my servers and will be
> taking firewall and routing links out.
>
> I have 86 hosts that this will affect.
>
> I'm going to put them into scheduled downtime in Nagios.
>
> I have my hosts divided into hostgroups.
>
> Is there a quick way to schedule all 86 hosts into Nagios downtime rather
> than having to click on each host in the web GUI and doing them
> individually?

This is precisely why I have a hostgroup that contains *all* my hosts.
I just issue a single downtime command for the host*group*, and voila.
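
The same thing can be scripted through the external command file
instead of the web GUI.  A minimal sketch, assuming a hostgroup named
all-hosts and the default source-install command file path (both are
assumptions; check command_file in your nagios.cfg):

```python
import time


def hostgroup_downtime_command(hostgroup, start, duration_s, author, comment, now=None):
    """Format a SCHEDULE_HOSTGROUP_HOST_DOWNTIME external command line."""
    now = int(time.time()) if now is None else int(now)
    end = start + duration_s
    # fixed=1 (runs exactly from start to end), trigger_id=0 (not triggered)
    return (
        f"[{now}] SCHEDULE_HOSTGROUP_HOST_DOWNTIME;{hostgroup};"
        f"{start};{end};1;0;{duration_s};{author};{comment}\n"
    )


def submit(command, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the command to the Nagios command pipe (path is an assumption)."""
    with open(command_file, "w") as fh:
        fh.write(command)


if __name__ == "__main__":
    start = int(time.time())
    # Print rather than submit, so the line can be inspected first.
    print(hostgroup_downtime_command("all-hosts", start, 4 * 3600,
                                     "benny", "Scheduled maintenance"), end="")
```

This covers the hosts in the group; the services on those hosts have
their own SCHEDULE_HOSTGROUP_SVC_DOWNTIME command with the same
arguments.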

Benny


-- 
"Cats land on their feet. Toast lands peanut butter side down. A cat
with toast strapped to its back will hover above the ground in a state
of quantum indecision."   -- Unknown



Re: [Nagios-users] Changing the version of nagios that appears in the e-mail notifications

2011-11-30 Thread C. Bensend


> I would like to change the version that is displayed to reflect that of
> the release that is currently on the server. What file(s) do I
> need to modify in order to accomplish this?

Your email notifications are just another command, so you need to
update the definition of that command.  It may be in misccommands.cfg,
but really, you're the only one that can answer that one as it could
be in *any* of the config files.  :)

'grep -r' would be your friend here.

Benny


-- 
"Cats land on their feet. Toast lands peanut butter side down. A cat
with toast strapped to its back will hover above the ground in a state
of quantum indecision."   -- Unknown



Re: [Nagios-users] Check

2011-10-06 Thread C. Bensend


> It works fine, but I would prefer another method, much lighter than
> check_by_ssh.
> Do you know another way to do that, via SNMP for example?

I run NRPE on my Linux systems...  It is much lighter than using
check_by_ssh.

Benny


-- 
"Open your door, or I open your wall."
 -- Seen on an image on fukung.net



Re: [Nagios-users] Suggestions for checking DHCP Scopes

2011-09-23 Thread C. Bensend


> Does anyone have any suggestions for checking DHCP scopes on Windows
> servers.  I saw one util on Nagios Exchange that uses a vbs script but I
> have no idea how I would set that up.

Define "checking DHCP scopes"?  Do you need to make sure DHCP is running?
Do you need to make sure DHCP clients can get addresses?  Do you need to
check how many IPs are free in the scopes?  Each of the above could be
a different failure mode.

Personally, I gave up on checking for lease availability, we have way
too many scopes, VLANs, etc.  I now do the following:

1) Check to make sure the DHCP Server service is running
2) Via a custom setup between consol.de's excellent check_logfiles
   plugin and a perl wrapper I wrote, check for an event 1020 in the
   system event log and parse the output

#2 was a pain, as Windows apparently has no hard-and-fast way to
check on IP availability in the scopes, and randomly logs a 1020
whenever it feels you just don't have enough addresses left.  With
the wrapper program, I parse the 1020 events and apply my own
thresholds to determine good, bad, or ugly.
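
A minimal sketch of that threshold step, in shell rather than the Perl
wrapper mentioned above (which is not shown in this message).  It assumes
the text of a 1020 event contains the phrase "NN percent full"; the
function name and the thresholds are made up for illustration.

```sh
#!/bin/sh
# classify_scope_event LINE WARN_PCT CRIT_PCT
# Pulls the "NN percent full" figure out of one event line and maps it to
# the standard plugin exit codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
classify_scope_event() {
    line=$1 warn=$2 crit=$3
    # Keep only the integer that precedes " percent full" (assumed wording).
    pct=$(printf '%s\n' "$line" |
        sed -n 's/.* \([0-9][0-9]*\) percent full.*/\1/p')
    if [ -z "$pct" ]; then
        echo "UNKNOWN - could not parse event text"
        return 3
    fi
    if [ "$pct" -ge "$crit" ]; then
        echo "CRITICAL - scope is ${pct}% full"
        return 2
    elif [ "$pct" -ge "$warn" ]; then
        echo "WARNING - scope is ${pct}% full"
        return 1
    fi
    echo "OK - scope is ${pct}% full"
    return 0
}
```

Something like `classify_scope_event "$event_text" 80 95` would then be
called once per parsed event, with the worst result reported to Nagios.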

Microsoft:  why oh why do you not expose the IP address utilization
via performance counters or WMI?

Benny


-- 
"Open your door, or I open your wall."
 -- Seen on an image on fukung.net



Re: [Nagios-users] muti-sessions in Nagios

2011-08-05 Thread C. Bensend


> is there a possibility to create multi-sessions in nagios
>
> my aim is to create many sessions for administrators: I want one
> (central) administrator to see all the maps, while the others see just
> their own maps
>
> for example :
>
> admin central : in site 0 : supervise all sites
> admin Number1: in site 1 ; supervise the equipment of site 1
>
> admin Number2: in site 2 ; supervise the equipment of that site
>
> and so on

If I understand your question correctly, Nagios pretty much does
that out-of-the-box.

By default, if you use authentication, authenticated users will only
be able to see/issue commands to hosts and services for which they
are contacts.

So, if admin1 is a contact for hosts A, B, and C, while admin2 is
a contact for hosts D and E, then admin1 will only see his/her three
hosts, and admin2 will only see his/her two.

Benny


-- 
"Open your door, or I open your wall."
 -- Seen on an image on fukung.net



Re: [Nagios-users] Nagios force host/service check

2011-07-12 Thread C. Bensend


> Thanks for the response. Below are entries that are made in
> ssl_access_log:
>
> When I click on "Re-schedule the next check of this service", it creates
> the following entry:
>
> 139.222.121.213 - xca10...@uea.ac.uk [12/Jul/2011:11:03:10 +0100] "GET
> /nagios/cgi-bin/cmd.cgi?cmd_typ=7&host=cn001&service=cpu HTTP/1.1" 200
> 3143
>
> When I click on the "commit" button after clicking the "force check" tick
> box, it then creates the following two entries:
>
> 139.222.121.213 - - [12/Jul/2011:11:03:14 +0100] "POST
> /nagios/cgi-bin/cmd.cgi HTTP/1.1" 401 490
> 139.222.121.213 - xca10...@uea.ac.uk [12/Jul/2011:11:03:14 +0100] "POST
> /nagios/cgi-bin/cmd.cgi HTTP/1.1" 200 1314
>
> It's not clear what the URL is.

OK, so, it's not a GET, it's a POST.  I'll let someone more familiar
with that comment...

Doing this via the web interface seems cumbersome, but it might be
possible to do it via the command file somehow.  Not like that's
much better...

http://old.nagios.org/developerinfo/externalcommands/commandlist.php
http://nagios.sourceforge.net/docs/nagioscore/3/en/extcommands.html
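
For what it's worth, a rough sketch of the command file route, using the
SCHEDULE_FORCED_SVC_CHECK external command from the list above.  The
function name is made up, the command file path in the example is only
the usual source-install default (check command_file in nagios.cfg), and
external commands have to be enabled for this to do anything.

```sh
#!/bin/sh
# force_service_check CMDFILE HOST SERVICE
# Writes a SCHEDULE_FORCED_SVC_CHECK external command, timestamped "now",
# to the Nagios command file (a named pipe on a live system).
force_service_check() {
    cmdfile=$1 host=$2 service=$3
    now=$(date +%s)   # epoch seconds; not strictly POSIX but widely available
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' \
        "$now" "$host" "$service" "$now" >> "$cmdfile"
}
```

For the host and service in the log above, that would be something like
`force_service_check /usr/local/nagios/var/rw/nagios.cmd cn001 cpu`, run
as a user that is allowed to write to the command file.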

Good luck!

Benny


-- 
"You were doing well until everyone died."
-- "God", Futurama




Re: [Nagios-users] Nagios force host/service check

2011-07-07 Thread C. Bensend


> Does anyone know the HTTP(S)-GET command to force check a host/service? I
> would like a host to execute the HTTP(S)-GET command to force Nagios to
> check the status as it is booting up.
>
> Any help will be greatly appreciated. Thanks in advance.

Watch your http access log, and execute the same command with a web
browser.

Take that URL and use wget with the --http-user and --http-password
options.
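
A minimal sketch of that second step.  The function name and the WGET
override are made up for illustration; the URL is whatever shows up in
your own access log.

```sh
#!/bin/sh
# replay_cgi_url URL USER PASSWORD
# Fetches URL with HTTP basic authentication and discards the page body.
# WGET may be pointed at another command for a dry run.
replay_cgi_url() {
    url=$1 user=$2 pass=$3
    "${WGET:-wget}" -q -O /dev/null \
        --http-user="$user" --http-password="$pass" "$url"
}
```

Quote the URL when calling it, since the query string contains `&`.
Note that a password given on the command line is visible in the process
list, so a dedicated low-privilege web user is a good idea here.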

Voila!

Benny


-- 
"You were doing well until everyone died."
-- "God", Futurama




Re: [Nagios-users] Nagios Issue of not detecting down/up status of server and delay of mails notifications

2011-06-17 Thread C. Bensend


> Is this a valid issue with nagios, or is there any way to scale it up?  How
> can a network/server admin rely on it if it works like this?

This is a configuration issue...  I monitor some 700 hosts and
6000+ services on a single host and my notifications go out
instantly (once max_check_attempts has been hit).

Choose a host that has shown this problem, and grep your Nagios
log for it...  Copy-n-paste the entries here.  That will show
what Nagios thinks about the situation.
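
Something along these lines would do, as a sketch.  The function name is
made up, and the log path in the example is only the usual source-install
default.

```sh
#!/bin/sh
# host_log_entries LOGFILE HOSTNAME
# Prints the alert and notification lines that mention HOSTNAME.
host_log_entries() {
    logfile=$1 host=$2
    # -F treats the host name as a fixed string, not a regular expression.
    grep -F -- "$host" "$logfile" | grep -E 'ALERT|NOTIFICATION'
}
```

For example: `host_log_entries /usr/local/nagios/var/nagios.log myhost`.
The bracketed number at the start of each line is the epoch timestamp,
which is what shows the gap between the alert and the notification.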

Benny


--
"You were doing well until everyone died."
-- "God", Futurama




Re: [Nagios-users] How to monitor specific windows services using nsclient++

2011-06-08 Thread C. Bensend


> Thanks Benny, but I still don't understand whether check_nrpe can be
> used for monitoring Windows servers, because as far as I know it's for
> monitoring remote Linux servers only.  If yes, do I need to install
> check_nrpe on my Nagios server?
>
> Also, I am already monitoring these basic things, but I want to monitor
> specific services, e.g. whether mssql is running or down.  Similarly for
> other important Windows services.

NSClient++ listens for NRPE requests as well, on TCP port 5666.  Hence,
if you have NSClient++ installed on your Windows systems, you can use
check_nrpe to talk to them.  And yes, then you'd need to install the
check_nrpe tool on your Nagios server.

I prefer using check_nrpe, I only use check_nt for a very small number
of services.

The command definition I gave you will check a service on a remote
Windows server to see if it's running or not.  So, open up your
Windows services snap-in, and you can check any of the services
listed the same way.


-- 
"You were doing well until everyone died."
-- "God", Futurama




Re: [Nagios-users] How to monitor specific windows services using nsclient++

2011-06-08 Thread C. Bensend


> Actually i want to monitor specific windows services using nagios and
> nsclient++ agent installed on Windows servers...

OK.

$USER1$/check_nrpe -H $HOSTADDRESS$ -u -c CheckServiceState -a
   ShowAll "$ARG1$"="$ARG2$"

Then, in your Nagios service command definition, call that command
with two arguments:

1) The service name from the Windows services snap-in
2) started or stopped, according to what state you want the service
   to be in during normal operation
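
Before wiring that into Nagios, it can help to run the same call by hand
from the Nagios server.  A small sketch, where the function name, the
default plugin path, and the example service name are all assumptions:

```sh
#!/bin/sh
# check_win_service HOST SERVICE STATE
# Runs the same check_nrpe call as the command definition above, by hand.
# CHECK_NRPE defaults to the usual source-install plugin path.
check_win_service() {
    host=$1 service=$2 state=$3
    "${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}" -H "$host" -u \
        -c CheckServiceState -a ShowAll "${service}=${state}"
}
```

For example, `check_win_service 192.0.2.10 MSSQLSERVER started` should
come back OK while that service is running.  NSClient++ has to be
configured to accept command arguments over NRPE for the -a part to work.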

> Also, I don't know exactly which critical Windows services to
> monitor, but my boss says it should be done...  Can you people give
> me some help regarding this?

On our Windows systems, we monitor a basic set of standard services:

* Disk volume space
* CPU utilization
* Memory utilization
* My NSClient++ version
* My NSClient++ configuration version
* My plugins version
* Windows version (for informational purposes only)

Benny


-- 
"You were doing well until everyone died."
-- "God", Futurama




Re: [Nagios-users] Getting Started with Nagios

2011-05-30 Thread C. Bensend


> I then restart the web server, expecting to see a new host inf1
>
> But the host count has not increased and I can't see any reference to the
> host.  I also can not see the new host group I defined.
>
> So obviously I am missing something fundamental.
>
> Thanks for any insight you care to share :)

"I then restart the web server"...  If you mean that literally, as
in you restarted Apache, that won't change anything for Nagios.

Apache only provides the web server for the UI, it has nothing to
do with Nagios.  The Nagios daemon is the one you need to restart
(or more accurately, you can send it a SIGHUP signal) to pick up
on your configuration file changes.

Now, if you *did* restart Nagios and your changes aren't appearing
in the web interface, do the following:

1) Stop the Nagios daemon
2) Now, go stop the *other* Nagios daemon(s)

It is a *very* common problem, especially when people are just
starting out, to accidentally start more than one Nagios daemon.
Changes are made, *one* of the Nagios daemons is restarted, while
the other continues to happily run the old configuration (and
show up in the web interface).
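
A quick way to spot that, as a sketch.  It assumes the binary is named
"nagios" and that each daemon has been reparented to PID 1, which may not
hold under every init system or inside containers; the function name is
made up.

```sh
#!/bin/sh
# list_nagios_daemons
# Reads "ps -eo pid,ppid,comm" output on stdin and prints the PID of every
# process named "nagios" whose parent is PID 1 (a daemon, as opposed to a
# short-lived child forked to run a check).
list_nagios_daemons() {
    awk '$2 == 1 && $3 == "nagios" { print $1 }'
}

# Typical use:
#   ps -eo pid,ppid,comm | list_nagios_daemons
# More than one PID printed means more than one daemon is running.
```

Once only one daemon is left, sending it a SIGHUP (for example with
`kill -HUP` and the PID printed above) makes it re-read its configuration.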

Benny


-- 
"You were doing well until everyone died."
-- "God", Futurama




Re: [Nagios-users] Services are dependent on the host they run on?

2011-05-27 Thread C. Bensend


> Yesterday for instance, a host went down because of a hd controller
> failure, and I received 22 sms..

I apologize if this has already been stated, I haven't been following
this thread too closely.

When this happened, was the host down *in a network sense*, or was
it just down in a user sense?  Ie, was it still pingable?

A situation I've dealt with in the past is that a host's network
stack might still be "alive enough" (ie, pingable), while the host
itself is sitting at a kernel panic or locked up.  In that case,
if you're using ping for the host check, Nagios would have no way
of knowing that the host is down, because it still responds.

In those [rare] cases, I've had to define a second command that
requires a more intelligent response from a host, and then used
that as the host check command.  Notable examples would be old
school Sun machines, which are still pingable when they're sitting
at the OK> prompt (ie, operating system is not running).

Just a thought.

Benny


-- 
"You were doing well until everyone died."
-- "God", Futurama




Re: [Nagios-users] Default Acknowledge Behavior

2011-05-13 Thread C. Bensend


> Can the default behavior for acknowledging an event be changed. As in can
> the default be changed from the "Sticky Acknowedgement" being always
> checked, to always unchecked?
>
> I have read a post on the internet that this is hard coded and you would
> have to change the source and recompile in order to accomplish this.
> http://article.gmane.org/gmane.network.nagios.user/54147
>
> This post is very old...maybe this has changed in 3.2.3?

This behavior cannot be changed without hacking cgi/cmd.c.  It's a
simple change:  on lines 951 and 977 (this is Nagios 3.2.3), remove
the CHECKED from each line.  That will make the checkboxes default
to *not* checked.
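
If you'd rather script the edit than do it by hand, here is a sketch.  It
assumes the attribute appears as " CHECKED" (with a leading space) on the
lines you name; the function name is made up, and the line numbers only
apply to the 3.2.3 source.

```sh
#!/bin/sh
# uncheck_defaults FILE LINE...
# Prints FILE to stdout with the first " CHECKED" removed from each of the
# listed line numbers.  The original file is left untouched.
uncheck_defaults() {
    file=$1
    shift
    script=
    for n in "$@"; do
        # Build one sed substitution per requested line number.
        script="${script}${n}s/ CHECKED//;"
    done
    sed "$script" "$file"
}
```

For example, `uncheck_defaults cgi/cmd.c 951 977 > cmd.c.new`, then
compare the two files with diff before replacing the original and
recompiling the CGIs.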

I believe this is a much more sane default than checked.

Benny


-- 
"You were doing well until everyone died."
-- "God", Futurama




Re: [Nagios-users] pnp4nagios?

2011-04-14 Thread C. Bensend


> I've configured pnp and it has been successfully graphing data since the
> actual installation. However, I have nagios perfdata logs from the past
> year (host_perfdata.log, service_perfdata.log) that are not being parsed
> that I want to be included.

...

> Is there a certain method, configuration I need to follow if I want to
> include this historical data?

Hmmm...  RRD databases expect to be updated in a sequential
fashion, on regular-ish intervals.  I'm not sure that you can "go
back" and add the past data.

I will defer to those on the list that are more familiar with
RRD - is that even possible?

Benny


-- 
"You were doing well until everyone died."
-- "God", Futurama




Re: [Nagios-users] pnp4nagios?

2011-04-14 Thread C. Bensend


> Anyone here using pnp4nagios for graphing? I'm having some configuration
> issues and wanted to see if there was someone who could assist?

Sure, I know several of us use it.  What issues are you having?

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?

2011-04-14 Thread C. Bensend


> Unfortunately OMSA has no info on when the charge cycle is expected to
> be finished, or how long it has been in its current learn/charge state:
>
>   # omreport storage battery controller=1
>   Battery 0 on Controller PERC 6/E Adapter (Slot 1)
>
>   Controller PERC 6/E Adapter (Slot 1)
>   ID                        : 0
>   Status                    : Non-Critical
>   Name                      : Battery 0
>   State                     : Charging
>   Recharge Count            : Not Applicable
>   Max Recharge Count        : Not Applicable
>   Predicted Capacity Status : Ready
>   Learn State               : Requested
>   Next Learn Time           : 0 hours
>   Maximum Learn Delay       : 7 days 0 hours
>   Learn Mode                : Auto
>
> I could make the plugin record it, but then I would violate my principle
> that the plugin should be stateless... Introducing state in the plugin
> complicates things.

Hmmm, that's unfortunate that they don't track a duration or start
time.  :(

And no, I fully agree - plugins should be stateless.  Keeping track
of state is an ugly, error-prone business.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?

2011-04-14 Thread C. Bensend


> At one time we had a battery that didn't finish charging for a week,
> called Dell and got a replacement battery. This was during a regular
> charge cycle. In your case I would give it a few more days.

...

> But, as we in fact did experience a case where the battery never
> finished charging I would advise against this. We just ignore the
> battery charge warnings unless they persist for days. It can be
> annoying, but we decided that we can live with it :)

Trond,

   Is there anything in OMSA that tells how *long* a battery has
been charging?  I simply got so tired of the charging warnings
that I blacklisted the bat_charge totally, but I'd still like to
detect that type of error - where the battery never finishes
charging.

   If OMSA has it, it would be great to have the option within
check_openmanage to specify a length of time threshold for battery
charging.  :)

Benny


--
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Web Hostgroups View / Screen ... hostgroups missing ...

2011-03-23 Thread C. Bensend


> I have a user defined for whom, when I log in and click on Host Groups,
> only one of the two hostgroups of which he is a member gets listed,
> although all of the hosts are listed if I go into the Hosts section.
>
> For example, if I go under Hosts -> HostA, it shows 'Member of GroupA,
> GroupB'.  If I click on either GroupA or GroupB, it gives me the error:
>
> "It appears as though you do not have permission to view information for
> any of the hosts you requested...
> If you believe this is an error, check the HTTP server authentication
> requirements for accessing this CGI
> and check the authorization options in your CGI configuration file."

I'll give you pretty good odds you're *not* a contact for one
of those hosts...  See the message I posted last month:

http://marc.info/?l=nagios-users&m=129788210317124&w=2

Could this be the same situation?

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] NSClient++. Monitoring the devices behind the Firewall.

2011-03-15 Thread C. Bensend


> If you're looking to do this without cooperation from the client
> and their security folks, you're going to run into problems.  If
> they want you to monitor their hosts, they have to provide some
> manner of accessing them.

Just to be thorough, passive monitoring is also a possibility.
In that case, each of the clients would be configured to send the
service check results to the Nagios server, and would probably
not require any changes to the firewall.

However, I choose to use active monitoring, so I cannot help
with that setup, nor would I necessarily recommend it.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] NSClient++. Monitoring the devices behind the Firewall.

2011-03-15 Thread C. Bensend


> The question I have is the same as the one already reported at
> http://nsclient.com/nscp/discussion/topic/466#-1. The diagram and
> scenario are the same as those at
> http://nsclient.com/nscp/wiki/doc/usage/nagios/nrpe but with a second
> remote firewall.
>
> Basically, I know how to configure a remote Windows computer with a fixed
> TCP/IP address, but I have no idea how to configure a remote Windows
> NSClient or an NRPE UNIX client installed behind a remote firewall. The
> remote subnet is behind NAT in this case, so how can the Nagios server
> reach a remote client in this scenario?
> Any idea?

Well, each of the clients behind the firewall needs to be
individually addressable somehow.  You can do this in several
ways, here are two:

1) Assign ports on the firewall to NAT to the individual clients
   behind it.  Ie, assign port 45000 to be NATed to client 1, port
   5666.  Assign port 45001 to be NATed to client 2, port 5666,
   etc.  Then, on your Nagios server, use the IP of your firewall
   and the individual ports to communicate with the clients.

2) Assign multiple IPs to the firewall, and NAT each IP and port
   X (by default, 5666) to the clients behind it.
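
For option 1, the Nagios side might look like the sketch below.  The
function name, the default plugin path, and the addresses and command
name in the example are assumptions for illustration only.

```sh
#!/bin/sh
# check_behind_nat FIREWALL_IP PORT COMMAND
# Queries one client through its forwarded port on the firewall.
# CHECK_NRPE defaults to the usual source-install plugin path.
check_behind_nat() {
    fw=$1 port=$2 cmd=$3
    "${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}" \
        -H "$fw" -p "$port" -c "$cmd"
}
```

So `check_behind_nat 203.0.113.1 45000 check_load` would reach client 1,
and the same call with port 45001 would reach client 2.  In the Nagios
configuration, each client's host object would then carry the firewall's
address, with the port passed as an argument to the check command.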

If you're looking to do this without cooperation from the client
and their security folks, you're going to run into problems.  If
they want you to monitor their hosts, they have to provide some
manner of accessing them.

In either of the examples above, I would strongly recommend that
they assign firewall rules to allow connections to the clients'
NSClient++ services *only* from your Nagios server.  Don't leave
those ports open to the unwashed masses.

A VPN between your sites is also an option.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Verifying the email delivery completion

2011-03-13 Thread C. Bensend


> I had been successfully using check_smtp to verify the SMTP service.
> A few days ago, one of our SMTP servers was still listening on port 25,
> but messages were all rejected with a 451 error (451 mail server
> temporarily rejected message (#4.3.0)).
> Is there any way to verify the email delivery completion?

Check out check_email_delivery:

http://exchange.nagios.org/directory/Plugins/Email-and-Groupware/check_email_delivery/details

It will check email *delivery*, not just listening on a port - from
email submission via SMTP through the reception of the same email
to a mailbox via POP3 or IMAP, and will alert upon problems at any
phase in the process.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] A question on nagios group nagcmd

2011-03-10 Thread C. Bensend


> My question is, the docs say to create the group nagcmd and add nagios &
> wwwrun to the group in order to allow external commands to be submitted
> thru the web interface.
>
> What external commands are we talking about here?  Are we talking about
> the service commands from the check screen (disable checks, schedule
> downtime, etc.)?
>
> Is it safe to assume that if wwwrun was not in the nagcmd group, bad
> things will happen in the web console?  Not anywhere near my system so I
> can't try to see what would happen.
>
> Any thoughts before I rewrite the central password system to put both a
> regular user & a daemon user in the same file  ?

What "regular" user?  nagios?

Because really, from your description above, both "nagios" and
"wwwrun" users should be daemon users, so you should be able to
have them in the same file and avoid the problem.

After all, the nagios user is the one that the Nagios daemon will be
running as...

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Fwd: Availability Report

2011-03-10 Thread C. Bensend


>  It's ignoring my timeperiod when generating the availability record. I
>  want to generate availability for the timeperiod 06:00 to 22:00, so I
>  created the timeperiod below:
>
>  define timeperiod {
>  timeperiod_name 16x7
>  alias           For availability report
>  sunday          06:00-22:00
>  monday          06:00-22:00
>  tuesday         06:00-22:00
>  wednesday       06:00-22:00
>  thursday        06:00-22:00
>  friday          06:00-22:00
>  saturday        06:00-22:00
>  }
>
>  When generating the availability report I select this timeperiod in the
>  Report time period option, but the availability report ignores it.

If indeed it's ignoring your timeperiod and calculating the
availability on a 24x7 basis, it sounds like a flaw in the
availability report.

If you're versed in C, you might check avail.c in the cgi/ directory
of the source distribution.  Perhaps you can find the flaw and
submit a patch!

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Fwd: Availability Report

2011-03-10 Thread C. Bensend


> I am selecting the custom time period, but the availability report is
> not honoring the Report time period option; it always goes from 00:00 to
> 24:00.

Ah, yes, I see that in your original message, sorry.

Hmmm, I don't use the custom reporting...  Is it just *displaying*
the 00:00 - 24:00 time and is actually generating availability
data taking your timeperiod into account, or is it *ignoring* your
timeperiod and generating availability for 00:00 - 24:00?

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Availability Report

2011-03-10 Thread C. Bensend


> I need to generate availability for a service in a
> particular timeperiod. I have created one timeperiod in nagios from
> 06:00 to 22:00 every day. When creating the availability report, I select
> that timeperiod under Report time period, but the report is always
> generated for 00:00 to 24:00. Can anyone please help me with this?

If you select a custom report period, you can select not only the
days you want included, but the timeperiod.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Ho to Deploy massively Nagios on more than 200 Windows servers?

2011-03-04 Thread C. Bensend


> Hi Gurus, I have to deploy Nagios plugins on more than 200 Windows
> servers.
> Configuration: Nagios Server runs Nagios 3.06 on Linux CentOS 5.5. So I
> would like to know how to do it massively instead of server by server?
> Thanks for your help.

The same way as you deploy any other software to your more than
200 Windows servers...  How do you apply your patches?  How do
you distribute other software?

I use SCCM to do it on my network.  You could also script it from
one of the Windows machines, or any number of other methods.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Error in performance-data-output

2011-03-02 Thread C. Bensend


> Could it be that this is a Windows issue, or perhaps NSClient++?
>
> Any NSClient++ users here who can confirm if this is the case? I'm
> thinking that perhaps the underscore character '_' is throwing off
> Windows or NSClient++.

I use NSClient on hundreds of hosts, and I haven't noticed any
issues with underscores yet...

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] check_hpasm and check_openmanage over nrpe in windows

2011-02-25 Thread C. Bensend


> I have encountered an issue with wanting to monitor HP and Dell servers
> running windows OS .
> The main issue is that due to security issue we can not use the
> NSClient++ internal functionality , but use the NRPE module option .
> I have not used the nrpe on windows boxes extensively before and wanted
> to know if anyone has deployed the check_hpasm and check_openmanage on
> windows boxes where the nagios server can only access  the NRPE port and
> not have SNMP access  direct to the server ?

Sure, that's the only way I monitor my Windows systems.

I don't use check_hpasm, but I use the hell out of check_openmanage
(thank you, Trond!):

[NRPE handlers]
command[check_openmanage]=check_openmanage.exe -e -p -b
bat_charge=ALL/ctrl_fw=ALL/ctrl_driver=ALL
--omreport=F:\dellopenmanage\oma\bin\omreport.exe

(sorry about line wrap)

OMSA is prone to timeouts as it's an "expensive" test to run, but it's
saved our bacon many times.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Which GUI to configure Nagios 3 ?

2011-02-24 Thread C. Bensend


> I know that there are nice GUIs to configure Nagios... which one do you
> know/use?

I'm a big, big fan of NConf.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Hostgroups: if not contact for one host, none are available

2011-02-17 Thread C. Bensend


> this isn't a bug.
> by reviewing the source codes, you can find that Nagios(more
> precisely, the CGIs) just do this way.
> i have no clue why Nagios won't show "partial" hostgroups if one has
> no access to all host members.
> maybe for performance issue?

If this behavior is intentional, I'd love to know why...  It seems
utterly broken to me, and while I have no vote to cast, I'd love
to see it changed.

While this seems like just a logic thing, I'd have to dig into
the code to see if I'm smart enough to come up with a diff.  :)

Any developers have an opinion/comment?

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




[Nagios-users] Hostgroups: if not contact for one host, none are available

2011-02-16 Thread C. Bensend


Hey folks,

   This has bitten me a few times now, so I figured I'd better report
it...

   If I have hostgroup "bob":

host1
host2
host3
host4
host5

and contact "frank" is a contact for hosts 1, 2, 3, and 4 (but NOT
5), frank will not be able to view the *hostgroup*.  It gives the
usual "It appears you do not have permissions ..." error.

   *Surely* this can't be intentional, can it?  Why the heck would
you _want_ that behavior?  I would expect it to display the hosts
in the group (viewing a host you're a contact for will show all
services, even if you're not a contact for all), or at worst just
the members of the group the user is a contact for, but not deny
access to the entire hostgroup.
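
To make the difference concrete, here is a minimal Python sketch of the
two behaviors.  It only models what is described above; it is not taken
from the Nagios CGI source, and the function names and data layout are
made up for illustration.

```python
# Hypothetical model of hostgroup visibility; not Nagios's actual code.

def can_view_strict(hostgroup, contact_hosts):
    """Reported behavior: the contact must be a contact for every member."""
    return all(host in contact_hosts for host in hostgroup)

def visible_members(hostgroup, contact_hosts):
    """Partial view: list only the members the contact is allowed to see."""
    return [host for host in hostgroup if host in contact_hosts]

bob = ["host1", "host2", "host3", "host4", "host5"]
frank = {"host1", "host2", "host3", "host4"}

print(can_view_strict(bob, frank))   # False: host5 blocks the whole group
print(visible_members(bob, frank))   # ['host1', 'host2', 'host3', 'host4']
```

With the strict rule, one misplaced host is enough to hide the group
from everyone who is not a contact for it; the partial rule would only
hide that one host.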

   In my environment, I have accidentally added a host to the
wrong hostgroup.  When I do this and the users of the hostgroup
aren't contacts for this new one that I misplaced, the users lose
access to the entire hostgroup.

   Am I being dense, or is this a bug?  3.2.3 on RHEL 5.5, BTW.

Thanks!

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Monitoring unmounted partition

2011-02-08 Thread C. Bensend


> I'm having a problem with check_nrpe. I'm monitoring a partition,
> /mnt/2 for example. If I don't have this partition mounted, it just
> returns the value of "/" without sending any error.
>
> How can I get an alert when the partition isn't mounted?

Oooof, plain text, please.

I don't know that what you want is possible - if the partition isn't
mounted, the OS can't read any information about it.

Is this a local partition or a remote filesystem (ala NFS)?  If it's
remote, you might use the -X flag to check_disk to exclude any of the
local filesystem types, so at least you'd get an error if it's not
mounted instead of returning the information for /...
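
If a small custom check is an option, another approach is to test
whether the path is a mount point at all before looking at usage.  A
minimal Python sketch, assuming a wrapper plugin that NRPE would run;
the name check_mounted is made up, and the codes follow the usual
Nagios plugin convention (0 = OK, 2 = CRITICAL):

```python
import os

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes

def check_mounted(path):
    """Return (exit_code, message) depending on whether path is a mount point."""
    if os.path.ismount(path):
        return OK, "OK - %s is mounted" % path
    return CRITICAL, "CRITICAL - %s is not mounted" % path

# Example call; a real plugin would take the path as an argument and
# pass the code to sys.exit() so Nagios sees the state.
code, message = check_mounted("/")
print(message)
```

An unmounted /mnt/2 is just a directory on the root filesystem, so a
check like this reports CRITICAL instead of quietly describing "/".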

Benny


--
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] monitoring Windows 2008 event log?

2011-02-04 Thread C. Bensend


> forgive my ignorance, but nsclient can check the event log?
>
> I wouldn't blame Steve, I think he had a baby not so long ago

NSClient++ can, yes.

*shrug*  This was like a year ago or so...  If he's busy, that's
fine and understandable.  Just *say* so, don't just ignore your
users, especially when they're trying to point out problems.
I'm not mad at the guy or anything, his software just wasn't
usable for me.

I have to correct myself - I use Consol.de's check_logfiles.exe
for my event log stuff.  My bad - I found the built-in NSClient++
eventlog stuff a bit cumbersome.

http://labs.consol.de/lang/en/nagios/check_logfiles/

Lausser has been great in helping as well as adding features and
fixing bugs.  Sorry for the confusion with NSClient++...  I use
NSClient++ to execute check_logfiles.exe on the remote clients.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] monitoring Windows 2008 event log?

2011-02-04 Thread C. Bensend


> Anybody know a good way to monitor Windows 2008 event logs?
>
> Steve Shipway's beta NagEventLog for win2k8 to run on my server
>
> http://www.steveshipway.org/software/nagevlog-setup-1.9.2.exe
>
> Any ideas would be most appreciated

I found NagEventLog to be unreliable, and Steve stopped answering
my questions.

NSClient++ is very reliable, and I haven't looked back.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Monitoring temperatures on Cisco equipment

2011-01-27 Thread C. Bensend


> I think you misunderstand.  Those two plugins return WARNING or CRITICAL
> if one of the two things occur:
>
> 1) If the ciscoEnvMonTemperatureState is not "normal".
> 2) If the passed -w and -c values are less
> than ciscoEnvMonTemperatureStatusValue.
>
> What I'm asking is why #2 is _required_.  I can understand it as an
> optional check if you want to override the device's defaults, but not as
> mandatory behavior.  Cisco devices are smart and know when they're warm
> or hot.  That's the purpose of the ciscoEnvMonTemperatureState.  I'm just
> trying to find out why folks feel that overriding Cisco's defaults is
> necessary behavior.

While I don't have any insider knowledge into *why* it is the way
it is, I'll take a guess - most third-party plugins come into
existence because they satisfied someone's specific needs.  Perhaps
the original author needed to further narrow the range of good-vs-bad,
who knows?

I'd say modify it to your needs.  :)
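
A rough Python sketch of what such a modification might look like, with
the -w and -c thresholds made optional so the device's own state is
enough by itself.  The function name, the string state values, and the
argument layout are assumptions for illustration, not code from either
plugin:

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes

def evaluate_temperature(state, value, warn=None, crit=None):
    """Combine the device-reported state with optional local thresholds.

    state: the device's own verdict, e.g. "normal", "warning", "critical"
    value: the temperature reading reported by the device
    warn, crit: optional overrides; None means "trust the device"
    """
    # Condition 1: the device itself says something is wrong.
    if state == "critical":
        result = CRITICAL
    elif state != "normal":
        result = WARNING
    else:
        result = OK
    # Condition 2: applied only when a threshold was actually passed.
    if crit is not None and value > crit:
        result = max(result, CRITICAL)
    elif warn is not None and value > warn:
        result = max(result, WARNING)
    return result

print(evaluate_temperature("normal", 38))            # 0, no thresholds given
print(evaluate_temperature("normal", 38, warn=35))   # 1, local override trips
print(evaluate_temperature("warning", 30, warn=35))  # 1, device state trips
```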

Benny  (yes, *that* Benny, hi Jeffrey)


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Writing nrpe commands

2011-01-24 Thread C. Bensend


>  I'm in need of using nrpe to get information from log files, and
> I'm stumped on where to find guidance on doing so. Any pointers to the
> right information or the right place to ask (if this isn't it)?

I would take a look at the sample nrpe.cfg that comes with NRPE for
examples of commands, and then check out consol.de's excellent
check_logfiles plugin to do the work with the logs.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] Problem with check_openmanage

2011-01-24 Thread C. Bensend


> $ check_openmanage -H myserver -C public
> Power Supply 0 [AC] needs attention: Presence detected, Failure detected,
> AC lost

You have a power supply #0, it is plugged in, but it has no AC
input.  Someone tripped over a cable.

> Voltage sensor 14 [PS 2 Voltage 2] is
> INTERNAL ERROR: Use of uninitialized value $reading in sprintf at
> /usr/lib/nagios/plugins/check_openmanage line 3565.

Whoopsie, that looks like a bug in check_openmanage.  Trond is
excellent about fixing issues, I'd expect to hear from him
shortly.

Benny


-- 
"Hairy ape nads."-- Colleen, playing Neverwinter Nights




Re: [Nagios-users] qpage - OT

2010-12-16 Thread C. Bensend


>>It would have been nice to see your qpage.cf file...  ;)
> That seems obvious, see below
>
>>Be sure you have 'parity=even' in your config.  When you run a test
> with verbose and interactive flags set, do you fail five or six times
> before you get that message?
>
> I've never tried the interactive flag, I will do so.  As far as the
> failures go, when I had the retry set to 20 it would fail 5 times in a
> row and then reset the modem or something (I can't fully interpret the
> logs), and then retry again, possibly 20 times, as in 20 sets of 5.
>
> The interactive (-i) option seems to require a page to be sent right
> now.  As of yet I have been unable to get a failure when sending a page
> manually but I think I've really only sent a small number 10-20 pages
> manually.  The only times it has failed so far is when it's running in
> daemon mode.  Do you guys use USB modems with qpage?  These problems got
> much worse after switching to a USB modem.

I didn't notice anything glaringly incorrect with your config...

The reason I asked about parity is because I got the exact same
error with Verizon and Sprint, except that qpage would decide
that the page was not sent (when it had been), so it would retry
five times (thereby sending five identical pages).

That issue went away when I had a palm-forehead moment and added
the 'parity=even' to my config.

Benny


-- 
"I'm no meteorologist, but I'm pretty sure it's rainin' bitches!"
 -- Cleveland, "Family Guy"




Re: [Nagios-users] qpage - OT

2010-12-13 Thread C. Bensend


> qpage error:
> <502 MESSAGE REJECTED - STX OR EOT EXPECTED>

It would have been nice to see your qpage.cf file...  ;)

Be sure you have 'parity=even' in your config.  When you run
a test with verbose and interactive flags set, do you fail
five or six times before you get that message?

Benny


-- 
"I'm no meteorologist, but I'm pretty sure it's rainin' bitches!"
 -- Cleveland, "Family Guy"




Re: [Nagios-users] high latency

2010-12-02 Thread C. Bensend


> Yeah, for giggles I went back further through the archives last night
> and found stuff back to 2.x series, and not much has seemed to help.  I
> killed some of my mis-behaving active checks, and that dropped to about
> 20 seconds, then went up to about 35-50.  So while that's better, I have
> A LOT more hosts and service checks to add, and am afraid it'll go nuts
> when I dump more on.  I think I've tried about all the config options I
> could find and some helped, some didn't seem to, but  there should be
> plenty of horsepower on the machine to run this much faster so not sure
> why it's not.

Hey Dan,

   I too have been wrestling alligators with service and host
check latencies averaging around 60s, and increasing to 100+
(sometimes to 300) after a few reloads during the day.

   This morning, I enabled the use_large_installation_tweaks
option.  As of a minute ago, my host check latency is now
averaging 2.116s, and service check latency is averaging 0.748s.

   I didn't see if you had tried this yet, it might be something
to consider.

Benny


-- 
"No matter how many shorts we have in the system, my guards will
be instructed to treat every surveillance camera malfunction as a
full-scale emergency."
   -- Peter Anspach's Evil Overlord List, #67




Re: [Nagios-users] Change "Procs Critical" threshold

2010-11-22 Thread C. Bensend


>> From the help for check_load (which I'm assuming you're using
>> in the command definition):
>>
>> Usage:check_load [-r] -w WLOAD1,WLOAD5,WLOAD15 -c CLOAD1,CLOAD5,CLOAD15
>>
>> So, in your service definition, you're telling check_load that you
>> want to trigger a critical condition if the 15 minute average is 4.0.
>>
>> Yours is 4.06.  So, yes, it's critical.  :)
>>
>> WLOAD1  = 1 minute average warning threshold
>> WLOAD5  = 5 minute average warning threshold
>> WLOAD15 = 15 minute average warning threshold
>>
>> CLOAD1  = 1 minute average critical threshold
>> CLOAD5  = 5 minute average critical threshold
>> CLOAD15 = 15 minute average critical threshold
>>
>> If you want your 15 minute average to *not* trigger a critical, you
>> need to adjust that last value (4.0) to something higher.
>>
>> Benny
>>
>>
>
> That sounds logical, and this is what I've adjusted:
> check_command   check_local_load!8.0,5.0,4.0!12.0,7.0,6.0
>
> but I've restarted the nagios process and the alert still persists. I dont
> see anything in nagios.log or /var/log/messages related to this either.
>
> what could I be missing?

Kill your Nagios daemon.

Now, kill the *other* Nagios daemon you have running.

If you make changes to your config file and send Nagios a SIGHUP (or
restart it) and the changes don't seem to "stick", you might have
multiple Nagios daemons running, one with an old config (that still
thinks 4.00 is a critical threshold), while the new daemon is
receiving the changes you mean to make.

This is a common issue, and it's easy to fix.  Shut down your daemon
via whatever method you have (service nagios stop, pkill, etc).  Then,
wait 30 seconds or so to allow outstanding service checks to wrap up,
and see if there are still Nagios processes hanging around.  If there
are, kill them too.  Wait another 30 seconds, rinse and repeat until
there are no more Nagios processes.

At that point, restart Nagios.  Do your changes take effect now?
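
For the "are there still Nagios processes hanging around" step, here is
a minimal Python sketch that picks them out of a process listing.  It
assumes the daemon's command name is exactly "nagios" (some packages
call it something else) and that ps accepts "-eo pid,comm"; the
function names are made up for illustration.

```python
import subprocess

def find_nagios_pids(ps_output):
    """Return PIDs whose command name is exactly 'nagios'.

    ps_output: text in the two-column form printed by `ps -eo pid,comm`.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[1].strip() == "nagios":
            pids.append(int(fields[0]))
    return pids

def running_nagios_pids():
    """Ask ps for the live process table and filter it."""
    result = subprocess.run(["ps", "-eo", "pid,comm"],
                            capture_output=True, text=True, check=True)
    return find_nagios_pids(result.stdout)

sample = """  PID COMMAND
  812 sshd
 4021 nagios
 4377 nagios
"""
print(find_nagios_pids(sample))  # [4021, 4377]
```

Anything running_nagios_pids() still returns after the service has been
stopped and given time to wind down is a leftover to kill by hand.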

Benny


-- 
"No matter how many shorts we have in the system, my guards will
be instructed to treat every surveillance camera malfunction as a
full-scale emergency."
   -- Peter Anspach's Evil Overlord List, #67




Re: [Nagios-users] Change "Procs Critical" threshold

2010-11-19 Thread C. Bensend


> I had another question regarding adjusting these thresholds, this time on
> localhost. It regards the Current Load parameter, which is giving me a
> Critical Load average of -- 2.47, 3.43, and 4.06
>
> in localhost.cfg, /usr/local/nagios/etc/objects/localhost.cfg, I have this
>
> define service{
>         use                     local-service   ; Name of service template to use
>         host_name               localhost
>         service_description     Current Load
>         check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
>         }
>
> which I actually went and adjusted to :
>
> check_command check_local_load!7.0,4.0,3.0!10.0,6.0,4.0
>
> I restarted the Nagios service..but this didn't have any effect -- the
> status information still reads the same -- Critical Load Average - 2.47,
> 3.43, 4.06

From the help for check_load (which I'm assuming you're using
in the command definition):

Usage:check_load [-r] -w WLOAD1,WLOAD5,WLOAD15 -c CLOAD1,CLOAD5,CLOAD15

So, in your service definition, you're telling check_load that you
want to trigger a critical condition if the 15 minute average is 4.0.

Yours is 4.06.  So, yes, it's critical.  :)

WLOAD1  = 1 minute average warning threshold
WLOAD5  = 5 minute average warning threshold
WLOAD15 = 15 minute average warning threshold

CLOAD1  = 1 minute average critical threshold
CLOAD5  = 5 minute average critical threshold
CLOAD15 = 15 minute average critical threshold

If you want your 15 minute average to *not* trigger a critical, you
need to adjust that last value (4.0) to something higher.
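
As a worked example, here is a small Python sketch of how the two
triplets line up against the three load averages.  It illustrates the
threshold layout only and is not the plugin's real code: the function
names are made up, the ">=" comparison is an assumption, and the -r
(per-CPU) option is ignored.

```python
def parse_triplet(text):
    """Turn '10.0,6.0,4.0' into (10.0, 6.0, 4.0): the 1, 5 and 15 minute values."""
    one, five, fifteen = (float(part) for part in text.split(","))
    return one, five, fifteen

def load_state(loads, warn, crit):
    """Return 'CRITICAL', 'WARNING' or 'OK' for a (1, 5, 15 minute) load triple."""
    if any(load >= limit for load, limit in zip(loads, crit)):
        return "CRITICAL"
    if any(load >= limit for load, limit in zip(loads, warn)):
        return "WARNING"
    return "OK"

loads = (2.47, 3.43, 4.06)
warn = parse_triplet("7.0,4.0,3.0")
crit = parse_triplet("10.0,6.0,4.0")
print(load_state(loads, warn, crit))                           # CRITICAL
print(load_state(loads, warn, parse_triplet("10.0,6.0,5.0")))  # WARNING
```

Changing the first warning value from 5.0 to 7.0 makes no difference
here, because it is the third critical value (4.0) that the 15 minute
average of 4.06 crosses.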

Benny


-- 
"No matter how many shorts we have in the system, my guards will
be instructed to treat every surveillance camera malfunction as a
full-scale emergency."
   -- Peter Anspach's Evil Overlord List, #67



