Re: [Nagios-users] Any numbers on sizing a nagios server?

2008-11-04 Thread Aaron Devey
We're using a similar hardware config to the one Jake mentioned below
and nagios 2.10.  We push nagios slave servers to anywhere between 6000
and 7000 services per slave (though service latency starts to climb
quickly around 7000.)  Most services are checked every 5 minutes.  The
master server handles all host checks.

We've been able to scale this config up to about 7800 hosts and 5
services in a single datacenter.  This includes 1 nagios master and 8
nagios slaves.

Your mileage will vary and as Jake mentioned, your environment may have
a large impact on expected performance.

-Aaron Devey
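
For anyone wanting to turn the figures above into a rough slave count, the arithmetic is just the total number of services divided by the per-slave ceiling, rounded up. The sketch below is a back-of-the-envelope helper only; the function name `slaves_needed` and the sample inputs are assumptions for illustration, not part of any Nagios tooling.

```shell
#!/bin/sh
# slaves_needed TOTAL_SERVICES SERVICES_PER_SLAVE
# Prints how many slaves are needed, rounding up (ceiling division).
slaves_needed() {
    total=$1
    per_slave=$2
    echo $(( (total + per_slave - 1) / per_slave ))
}

# Example: 8 slaves at 6000-7000 services each cover roughly
# 48000-56000 services in total.
# slaves_needed 48000 6000   # prints 8
# slaves_needed 50000 6500   # prints 8
```

Sizing against the lower per-slave figure (6000) leaves some headroom before latency starts to climb.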


Paulus, Jake wrote:
> Nagios performance is very much specific to your environment. Nagios 3.x
> is also MUCH faster than Nagios 2.x because of parallel host checks (and
> other features.) Our performance is summed up below but your mileage may
> vary:
> 
> Primary server:
> Nagios 3.0.3
> Dual, quad-core processors @  2.4 GHz and 4GB of RAM
> ~650 hosts, 1550 services
> System load averages 0.8, 0.6, 0.5 (CPUs are mostly idle with spikes
> when hundreds of checks get kicked off at once)
> 
> Average service check latency 0.3 seconds
> Average host check latency 2.5 seconds
> 
> All of our service checks are active, mostly snmpget and snmpbulkget and
> lots of pings - largely checking every 2-5 minutes. Most of our service
> checks are bash and perl scripts (we don't use the embedded Perl
> interpreter.) We also collect and parse perfdata for graphing and run
> Cacti and other very small MySQL-driven webapps on this same server. The
> server is definitely a little over-kill but the price was right and it
> was purchased with Nagios 2.x in mind - once again, Nagios 3.x is much
> faster. Our environment is also not "tuned" for performance other than
> to put in sane timeouts for service checks so they don't sit around
> waiting too long.
> 
> 
> Thanks, -Jake
> 
> 
> -Original Message-
> From: Edgar Matzinger [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, November 04, 2008 1:56 PM
> To: Nagios Mailinglist
> Subject: [Nagios-users] Any numbers on sizing a nagios server?
> 
> LS,
> 
>   I've searched the internet (maybe I look in the wrong places) but I
> can't find any numbers on sizing a nagios server. Are there any numbers
> out there amongst you and are you willing to share?
> 
> Thanks, regards, Edgar.
> --
> Edgar R. Matzinger, Valid Eindhoven BV
> Addr: Valid Eindhoven B.V.
>       t.a.v. E.R. Matzinger
>       Paradijslaan 36
>       5611 KN Eindhoven
> Disclaimer: Any comments, opinions made are mine, etc ...
> 
> 
> 
> -
> This SF.Net email is sponsored by the Moblin Your Move Developer's
> challenge
> Build the coolest Linux based applications with Moblin SDK & win great
> prizes
> Grand prize is a trip for two to an Open Source event anywhere in the
> world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> <http://moblin-contest.org/redirect.php?banner_id=100&url=/>
> ___
> Nagios-users mailing list
> Nagios-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
> 

Re: [Nagios-users] Event handler

2008-05-14 Thread Aaron Devey
Drew Weaver wrote:
> <?php
> $a = $argv[0];
> $b = $argv[1];
> $c = $argv[2];
> $d = $argv[3];
> $handle = fopen("output", "a+");
> $content = "$a - $b - $c - $d\n";
> $go = fwrite($handle, "$content");
> ?>

You'll want to specify the full path to the 'output' file.  Nagios won't
necessarily call it from the same working directory that you used from
the shell.

-Aaron
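
To illustrate the point about working directories, here is a minimal shell sketch of the same kind of logging handler, writing to an absolute path. The log location and the `log_event` name are assumptions made up for this example; the real handler is the PHP script quoted above.

```shell
#!/bin/sh
# Hypothetical log location; it must be an absolute path that the nagios
# user can write to. Override EVENT_LOG to try the function elsewhere.
EVENT_LOG=${EVENT_LOG:-/var/log/nagios/event_handler.log}

# log_event ARG1 ARG2 ARG3 ARG4
# Appends the four handler arguments to the log, one event per line.
log_event() {
    printf '%s - %s - %s - %s\n' "$1" "$2" "$3" "$4" >> "$EVENT_LOG"
}

# A relative name such as "output" would instead be resolved against
# whatever directory Nagios is running in when it calls the handler.
```

Because the path is absolute, the entry lands in the same file whether the handler is run from a shell or by Nagios.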


Re: [Nagios-users] NRPE vs NSCA benchmarking

2008-04-24 Thread Aaron Devey
Maurizio Pinotti wrote:
> NSCA PROS/CONS: the opposite

It's important to note that multiple NSCA results can be sent per
connection.  This makes it slightly more load/network friendly when you
have a lot of services.  However, taking advantage of this benefit will
increase the complexity of your check submission script.

-Aaron
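
As a rough sketch of what batching looks like: send_nsca reads tab-separated results from standard input, one per line, so several results can be built up first and piped through a single invocation. The host names, service names, server address and config path below are placeholders, and `nsca_line` is a helper name invented for this example.

```shell
#!/bin/sh
# nsca_line HOST SERVICE RETURN_CODE OUTPUT
# Prints one service check result in the tab-separated form that
# send_nsca reads on stdin.
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# Build several results, then submit them over a single connection:
# {
#     nsca_line web01 "Disk /" 0 "DISK OK - 40% used"
#     nsca_line web01 "Load"   1 "WARNING - load average: 5.2"
#     nsca_line web02 "Disk /" 2 "DISK CRITICAL - 97% used"
# } | /usr/sbin/send_nsca -H nagios.example.com -c /etc/send_nsca.cfg
```

The extra complexity mentioned above is mostly in collecting the results before sending, rather than firing send_nsca once per check.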



Re: [Nagios-users] Hosts reboots too fast for check_alive notification

2008-04-24 Thread Aaron Devey
You could have the server fire a script during reboots that submits a
check result to nagios via NSCA.  It might be a little more elaborate
than what you were looking for, but it will always catch a reboot even
when a host check misses it.

-Aaron


Rodrick Brown wrote:
> When one of my hosts reboots I’m never notified about the outage.
> Currently I’m using a custom script  S99bootnotify to alert me when a
> host comes online, is there any way to shorten the polling for
> check_alive? I find it strange that a host could reboot and nagios not
> detect that outage.
>
> Thanks.
>
> ---
> Rodrick R. Brown
> Director, Systems Engineering
> Ballista Securities, LLC
> 120 Wall St. Suite 2400
> P: 646 307 4709
> C: 347 702 0012
> F: 646 219-5872
> E: rbrown(at)ballistasec.com
>



Re: [Nagios-users] Host checks under Nagios 1.x

2008-04-21 Thread Aaron Devey
I had a similar problem to this.  I only wanted to know if a
not-so-important device had been down for an hour or more.

Here's what I ended up doing:
I disabled the host check (by having it call an "always-ok" checkcommand
that always returns 0.)  I then added a 'PING' service to the host with
a max_check_attempts of 7, and a retry_check_interval of 10 minutes.

The pitfall is that I no longer receive 'HOST DOWN' alerts for that
host; I receive alerts for a failing 'PING' service instead.

-Aaron
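
The timing works out as follows, assuming the default interval length of one minute per unit: the first failed check counts as attempt 1, and each later attempt waits one retry interval. The sketch below only does that arithmetic and shows what an "always-ok" command can look like; both function names are made up for illustration.

```shell
#!/bin/sh
# always_ok: stand-in for a host check command that always reports UP.
always_ok() {
    echo "OK - host check disabled"
    return 0
}

# minutes_to_hard_state MAX_CHECK_ATTEMPTS RETRY_INTERVAL_MINUTES
# The first failure is attempt 1, so the hard state is reached after
# (attempts - 1) retry intervals.
minutes_to_hard_state() {
    echo $(( ($1 - 1) * $2 ))
}

# minutes_to_hard_state 7 10   # prints 60
```

So with 7 attempts and a 10 minute retry interval, the 'PING' service alerts about an hour after the first failed check.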


Andrew Cruse wrote:
> I've got an interesting problem with a particular setup.  I'm monitoring a
> number of servers that the main Nagios installation doesn't have direct
> network access to, so I pass all of the host and service checks through an
> NRPE installation that can communicate with both Nagios and the servers
> being monitored.  A little tweaking with check timeouts and whatnot and this
> setup works pretty nicely.  I've run into a problem where for some reason,
> the NRPE server periodically stops responding to NRPE requests.  Haven't
> gotten to the bottom of that (Connection refused) yet.  Service checks are
> able to handle the problem fine as the duration of the NRPE outage is much
> shorter than the time it takes for the services to go into a hard critical
> state.  The problem is, once the first service check goes through and goes
> into a soft critical state, that triggers the host checks which also fail
> (host checks go through NRPE as well) and immediately generate a
> notification.  I'd like to find a way to make the host checks a little more
> forgiving as well.
> 
> A few things I've thought of or tried:
> 
> 1.  I tried bumping up the host check retries to 30, but since the checks
> immediately fail with "connection refused" it runs through all 30 tries
> within just a few seconds.  I also worry about this leading to unneeded load
> on the Nagios server since this is generally going to cause check_nrpe to be
> run 30 times, for each of the ~20 servers in this setup.
> 
> 2.  Extending the timeout on the check_nrpe commands doesn't help because
> "connection refused" is returned immediately.
> 
> 3.  Switching to a passive setup is probably the way to go, but for now am
> trying to avoid all the reconfiguration needed to move in that direction.
> 
> 
> Ideally what I'd like to be able to do is have the host checks retry on a
> particular interval (i.e. once per second) rather than instantly after the
> previous executed.  Is there a way to do this?
> 
> Incidentally, while typing up this email I was actually able to find the
> root problem with the NRPE setup.  NRPE was being called via Xinetd which
> wasn't configured to allow enough simultaneous connections for a single
> service.  Thus when it started getting hammered with NRPE requests as a
> result of the host check configuration it would stop allowing NRPE
> connections for 30 seconds.  A quick change to the Xinetd config file seems
> to have solved the problem.
> 
> I'm still interested to know how anyone handles the situation where a host
> may be unresponsive to host checks for a period of time yet you only wish to
> fire off a notification after a specific period of time.  Would a wrapper
> around the host check be the only way to handle it?
> 
> Andrew
> 
> 




Re: [Nagios-users] how to get the current temp in a warning messagesent

2007-11-19 Thread Aaron Devey
Instead of $DATETIME$, try $SHORTDATETIME$.

-Aaron

Randy Paries wrote:
>
> Andreas
> changing $OUTPUT$ to $SERVICEOUTPUT$ worked!!
> thanks
>
> currently it is set to Date/Time: $DATETIME
> and this always is blank
> Thanks
>
>
>
> On Nov 19, 2007 3:46 PM, Andreas Ericsson <[EMAIL PROTECTED]> wrote:
> >
> > Randy Paries wrote:
> > > Hello,
> > > I have the following service:
> > > 
> > > define service{
> > >         use                     generic-service
> > >         host_name               bart
> > >         service_description     Probe #1 Temperature
> > >         is_volatile             0
> > >         check_period            24x7
> > >         max_check_attempts      2
> > >         normal_check_interval   5
> > >         retry_check_interval    1
> > >         contact_groups          all_admins
> > >         notification_interval   120
> > >         notification_period     24x7
> > >         notification_options    w,u,c,r
> > >         check_command           check_temptraxf!/dev/ttyS0!1!74!78
> > > }
> > > 
> > >
> > > when i get a warning i get the message below. Is there a way to
> > > include the current temp in the warning?
> > > thanks for any help
> > >
> > > 
> > > * Nagios  *
> > >
> > > Notification Type: PROBLEM
> > >
> > > Service: Probe #1 Temperature
> > > Host: Bart #1
> > > Address: 192.168.0.214
> > > State: WARNING
> > >
> > > Date/Time: $
> > >
> > > Additional Info:
> > >
> > > $
> > > 
> > >
> >
> > You're using the wrong macros. Try changing $OUTPUT$ to $SERVICEOUTPUT$
> > in your service notification macro. I don't know which one you want for
> > date/time, but a glance at the documentation will tell you.
> >
> > --
> > Andreas Ericsson   [EMAIL PROTECTED]
> > OP5 AB www.op5.se
> > Tel: +46 8-230225  Fax: +46 8-230231
> >
>




Re: [Nagios-users] Notifications

2007-11-19 Thread Aaron Devey
Forgot to include the list in the CC.

Aaron Devey wrote:
> I don't see any obvious problems with your service definitions.  Did you
> find out TrendMicro was down for 6 hours by reviewing the nagios logs? 
> If so, that means nagios at least saw the service had a problem.  If you
> found out it was down by some other means, perhaps you can check the
> nagios logs to make sure nagios saw a "critical" or "warning" problem
> with the service.
>
> Also, if you have log_notifications turned on, try examining the logs for
> the time period during which it was down.  If you don't see any attempts
> to send a notification for TrendMicro on lg03, then it's likely a
> configuration problem somewhere.
>
> Finding it is the hard part. :)  The first places I would check are the
> service_notification_period, service_notification_options, and
> service_notification_commands for the contacts in the 'mis' group. 
> Follow the service_notification_commands to make sure the command it
> points to is set up correctly as well.  If there are no problems there,
> I'd make sure there are no service escalations for that service.
>
> If that doesn't help, I have no idea what the problem could be. :)
>
> Good luck,
>
> -Aaron
>
>
> Jerad Riggin wrote:
>   
>> define service{
>>         name                            generic-service ; Generic service name
>>         active_checks_enabled           1       ; Active service checks are enabled
>>         passive_checks_enabled          1       ; Passive service checks are enabled/accepted
>>         parallelize_check               1       ; Active service checks should be parallelized (Don't disable)
>>         obsess_over_service             1       ; We should obsess over this service (if necessary)
>>         check_freshness                 0       ; Default is to NOT check service 'freshness'
>>         notifications_enabled           1       ; Service notifications are enabled
>>         event_handler_enabled           1       ; Service event handler is enabled
>>         flap_detection_enabled          1       ; Flap detection is enabled
>>         process_perf_data               1       ; Process performance data
>>         retain_status_information       1       ; Retain status information across program restarts
>>         retain_nonstatus_information    1       ; Retain non-status information across program restarts
>>         register                        0       ; DONT REGISTER THIS DEFINITION - NOT A REAL SERVICE, JUST A TEMPLATE!
>> }
>>
>>
>> define service{
>>         use                             generic-service
>>         name                            windows-service
>>         is_volatile                     0
>>         check_period                    24x7
>>         max_check_attempts              5
>>         normal_check_interval           3
>>         retry_check_interval            1
>>         notification_interval           15
>>         notification_period             24x7
>>         register                        0
>> }
>>
>> define service{
>>         use                             windows-service
>>         name                            check-trend
>>         notification_options            w,u,c,r
>>         check_command                   check_nt!SERVICESTATE!-d SHOWALL -l ofcservice
>>         register                        0
>> }
>>
>> define service{
>>         use                             check-trend
>>         service_description             TrendMicro
>>         contact_groups                  mis
>> #       hostgroup_name                  windows-clients
>>         host_name                       lg03
>> }
>>
>>
>> 
>
>
>   
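
The log review suggested in the quoted reply can be sketched roughly as below. It assumes the usual semicolon-separated log entries (`SERVICE ALERT: host;service;state;...` and `SERVICE NOTIFICATION: contact;host;service;state;...`); the function names are made up for this example, and the log path in the usage comment is an assumption, so check `log_file` in nagios.cfg for the real one.

```shell
#!/bin/sh
# alerts_for LOGFILE HOST SERVICE
# Prints the SERVICE ALERT entries, i.e. the state changes Nagios itself saw.
alerts_for() {
    grep -F "SERVICE ALERT: $2;$3;" "$1"
}

# notifications_for LOGFILE HOST SERVICE
# Prints the SERVICE NOTIFICATION entries logged for one service
# (only present when log_notifications is enabled).
notifications_for() {
    grep -F "SERVICE NOTIFICATION:" "$1" | grep -F ";$2;$3;"
}

# Example:
# alerts_for /usr/local/nagios/var/nagios.log lg03 TrendMicro
# notifications_for /usr/local/nagios/var/nagios.log lg03 TrendMicro
```

Alerts without any matching notifications would point at the contact, notification command or escalation settings rather than at the check itself.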




Re: [Nagios-users] Notifications

2007-11-19 Thread Aaron Devey
What are your notification options set to?  In 2.9 the default is "none",
so if you didn't specify them for that service, it won't alert.  If
that's not the answer, perhaps you can paste the definitions for
your service, contact, and notification command?

-Aaron


Jerad Riggin wrote:
>
> I have a nagios 2.9 install.  I have one host with multiple services
> being monitored.  On the 16th the host didn't respond to a ping (the
> server rebooted), and recovered within 3 minutes.  I received an
> e-mail for both the failure and recovery.  I am also monitoring some
> windows services on the same box using NsClient++.  It shows on the
> same day that after it recovered the TrendMicro virus process was down
> for 6 hours.  I didn't receive an e-mail during this entire time.  It
> is set at 5 max attempts, 3 normal check, and 1 retry with a
> notification interval of 15 minutes.  It should have at least notified
> once but it didn't.  Any ideas?
>




Re: [Nagios-users] check_nrpe trouble

2007-10-17 Thread Aaron Devey
Hiamal Llanos wrote:
>
> But if I run the command on the terminal window it works happily:
> $ sudo -u nagios /usr/lib/nagios/plugins/check_nrpe -H otherhost -c
> check_load
> OK - load average: 0.00, 0.00, 0.00|load1=0.
>
What does your check_nrpe checkcommand look like?  You'll want to verify
that the syntax matches the syntax you used above, and that nagios is
running as the 'nagios' user you specified above.




Re: [Nagios-users] how to use servicedependency?

2007-09-26 Thread Aaron Devey
If I am reading your question right, the dependency works, but currently
you get alerts for sv1.dummy1 AND sj2.router1, and you only want alerts
for sj2.router1.  If this is the case, you could try setting up
sv1.dummy1 so that it doesn't alert.  Unfortunately, you might run into
problems with getting sj2.router1 to recognize a recovery if sv1.dummy1
recovers first.  You could try a circular dependency (and I'm not even
sure if you can do that in nagios) where sj2.router1 only runs if
sv1.dummy1 is failing, and sv1.dummy1 only runs if sj2.router1 is
passing.  But then you might get a problem where neither check runs
because sv1.dummy1 is passing, and sj2.router1 is failing.

This is a difficult problem to solve with service dependencies.
Basically you want sj2.router1 to go critical if both sv1.dummy1 AND
sj2.router1 fail, but recover if either sv1.dummy1 OR sj2.router1 passes.
Unfortunately, the way your service dependency works, the status of
sj2.router1 is directly tied to the status of sv1.dummy1, and sj2.router1
never updates if sv1.dummy1 passes.  So you really need a single service to
determine the status of both checks and alert accordingly, or you need
an event handler for sv1.dummy1 to submit an 'OK' status for sj2.router1
when it's passing.

The first of those two options is definitely the easiest.  It simply
consists of a small shell script that runs the first check and, if that
check fails, returns the status of the second check.  Consider a script
such as the following:

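#!/bin/bash
# Arguments: $1 = first host to check, $2 = second host to check
CHECK_ONE="/path-to-checks/check_ping -H $1 -t 2 -p 2 -w 500,50% -c 999,99%"
CHECK_TWO="/path-to-checks/check_ping -H $2 -t 2 -p 2 -w 500,50% -c 999,99%"

# Run the first check quietly; its exit status decides what happens next.
if $CHECK_ONE >/dev/null 2>&1; then
  echo "Check one OK."
  exit 0
else
  # First check failed: the second check's output and exit status
  # become the result of this service.
  exec $CHECK_TWO
fi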

Substitute your own check commands for CHECK_ONE and CHECK_TWO, of course.
The first one would be the equivalent of your "check-link" command.  The
second would be the equivalent of your "check_nrpe!check_router1"
command.  Note that in this case I used $1 and $2, so the first argument
to the script is the first host to check and the second argument
is the second hostname.  You don't have to use arguments and could
just hard-code the values into your script, but arguments make the script
easier to reuse if your installation grows.  The second check is ONLY
executed if the first one fails.

This way you only need one host, one service, and no dependencies.  If
you named your checkcommand "check_double" the service would be
something like:

define service {
use service-template
host_name sj2
service_description sj2.router1
check_command check_double!first_hostname!second_hostname
}

Good luck!

-Aaron
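
For completeness, the second option (an event handler that submits an 'OK' for the dependent service) could look roughly like the sketch below. It writes a PROCESS_SERVICE_CHECK_RESULT line to the Nagios external command file, so it assumes external commands are enabled and that sj2.router1 accepts passive checks. The command file path and both function names are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical location; use the command_file value from your nagios.cfg.
COMMAND_FILE=${COMMAND_FILE:-/usr/local/nagios/var/rw/nagios.cmd}

# passive_result_line EPOCH HOST SERVICE RETURN_CODE OUTPUT
# Formats a PROCESS_SERVICE_CHECK_RESULT external command.
passive_result_line() {
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# handle_master_state STATE
# Meant to run as the event handler of sv1.dummy1, called with
# $SERVICESTATE$. When the master check is OK, mark sj2.router1 OK too.
handle_master_state() {
    if [ "$1" = "OK" ]; then
        passive_result_line "$(date +%s)" sj2 sj2.router1 0 \
            "OK - sv1.dummy1 is passing" >> "$COMMAND_FILE"
    fi
}
```

This is more moving parts than the wrapper script, which is why the single-service approach is the easier of the two.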


Jeremy C. Reed wrote:
>
> (I posted a couple of weeks ago, but only got one response, which was
> different from what I think I want to do.)
>
> I am running Nagios 2.9.
>
> I want: if a check_ping fails then I don't want an alert sent to me
> unless a second test (check_nrpe to a remote system that does the same
> check_ping) fails.
>
> I am reading http://nagios.sourceforge.net/docs/2_0/dependencies.html
> (I was looking at 3_0 last time.) And I am looking at
> http://www.linickx.com/blog/archives/271/how-to-monitor-wordpress-with-nagios/
>
> Where is execution_failure_criteria and notification_failure_criteria
> documented for 2.9?
>
> Can someone please provide an example of only sending a problem alert if
> two different check_commands fail and the second check_command is not done
> if the first one is OK?
>
> This is what I have:
>
> define service {
>  use service-template
> host_name sj2
> service_description sj2.router1
> check_command check_nrpe!check_router1
> }
>
> # The "dependent" is the object that needs something.
> define servicedependency {
> dependent_host_name sj2
> dependent_service_description sj2.router1
> host_name sv1
> service_description sv1.dummy1
> # o = fail on an OK state, the dependent service will not be actively
> # checked if the master service is in OK
> execution_failure_criteria o
> #   notification_failure_criteria o
> }
>
> define service {
>  use service-template
>host_name sv1
>  service_description sv1.dummy1
>check_command check-link
> }
>
>
> But I am getting two alerts if both don't return OK. I only want one
> alert. Also I am unsure how to use the execution_failure_criteria and
> notification_failure_criteria.
>
> And I do not want my "sj2.router1" to even be checked if the first
> "sv1.dummy1" is successful. But if sv1.dummy1 fails, then I want the
> sj2.router1 check to happen. And if it fails then send my alert.
>
>
>
>   Jeremy C. Reed
>

Re: [Nagios-users] Regex Services?

2007-07-31 Thread Aaron Devey
I've never used regex in services.  Assuming it's possible in this case,
a workaround for the comma might be:

^(host|gost)\d\d?\.\w+\.\w+$


-Aaron
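
A quick way to convince yourself that the comma-free form accepts the same names is to try both patterns outside Nagios first. The sketch below uses `grep -E`, which has no `\d` or `\w`, so `[0-9]` and `[[:alnum:]_]` stand in for them; that substitution, the variable names and the `matches` helper are assumptions for this example, and it says nothing about whether Nagios itself will accept the pattern.

```shell
#!/bin/sh
# POSIX ERE translations of the two patterns discussed in this thread.
WITH_BRACES='^(host|gost)[0-9]{1,2}\.[[:alnum:]_]+\.[[:alnum:]_]+$'
WITHOUT_COMMA='^(host|gost)[0-9][0-9]?\.[[:alnum:]_]+\.[[:alnum:]_]+$'

# matches PATTERN NAME
# Succeeds if NAME matches PATTERN.
matches() {
    printf '%s\n' "$2" | grep -Eq "$1"
}

# Both patterns should accept and reject the same names, for example:
# matches "$WITHOUT_COMMA" host12.example.com    # succeeds
# matches "$WITHOUT_COMMA" host123.example.com   # fails
```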


Kerry Milestone wrote:
> When trying to verify with Nagios, it seems to stop reading the string 
> on the first comma it comes across.  The Nagios documentation is a 
> little light on how to use regex other than using wildcards.
>
> Is what I am trying to do, actually possible?
>
> Cheers.
>
>
> Error: Could not find any host matching '^(host|gost)\d{1'
>
>
>   
>> kerry,
>>
>> note
>>
>> ^(host|gost)\d{1,2}\.\w+\.$
>>
>> change it to 
>> ^(host|gost)\d{1,2}\.\w+\.\w+$
>>
>> and test again.
>>
>> Learner
>>   
>> 
>
>




Re: [Nagios-users] Notification on Stalk

2007-05-08 Thread Aaron Devey
The event handler might work, but doesn't it stop executing after the
service enters a hard state?  (At least, that's how I understood nagios
2.x to work; perhaps nagios 3.x differs in this regard.)

A clever workaround to this might be to use the performance processing
options built into nagios. Perhaps by using the
'service_perfdata_command' and 'process_performance_data' nagios
options, and the 'process_perf_data' service directive, you could call a
script to process the data, log to a database, send emails, etc.  This
is not 100% ideal, since nagios should be handling the notifications/emails.

Instead of sending the emails, that script could also submit a
'critical' passive check to a single volatile service if the status is
critical AND has changed... but at this point I think I'm making this
more complicated than it needs to be.

-Aaron
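
The "critical AND has changed" test is the only fiddly part of that idea. A minimal sketch, assuming the perfdata command passes the service state and plugin output to a script and that a small state file per service is acceptable, might look like this. The function names and the state file are hypothetical, and the actual submission of the passive check is left out.

```shell
#!/bin/sh
# output_changed STATE_FILE NEW_OUTPUT
# Succeeds if NEW_OUTPUT differs from what was stored last time,
# then stores NEW_OUTPUT for the next call.
output_changed() {
    previous=""
    if [ -f "$1" ]; then
        previous=$(cat "$1")
    fi
    printf '%s\n' "$2" > "$1"
    [ "$previous" != "$2" ]
}

# should_resubmit STATE_FILE SERVICE_STATE PLUGIN_OUTPUT
# Succeeds only when the output has changed and the state is CRITICAL.
should_resubmit() {
    output_changed "$1" "$3" && [ "$2" = "CRITICAL" ]
}
```

When `should_resubmit` succeeds, the script would go on to submit the passive result to the volatile service.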


Patrick Morris wrote:
> Why not use an eventhandler that parses the plugin output?
>
>




Re: [Nagios-users] Notification on Stalk

2007-05-08 Thread Aaron Devey
For what it's worth, I have been looking for a solution similar to this
as well.  What I'd really like to see is an "event_stalking_options"
parameter in nagios where the event handler is called based on the
stalking options.  In your case, the easiest (but probably most
annoying) solution might be to set the notification interval to the same
value as your check interval.  If your check is running every 5 minutes,
and your notification interval is set to fire off every 5 minutes, then
each notification sent out will have the latest check results.

-Aaron

Petersen, Mark wrote:
> I've searched high and low for the answer to this.  It seems that
> because nagios just checks exit status, it's not easy to create a
> notification on stalking.  I'm wondering if I can define additional
> exit codes as critical (without modifying the source), or if there is
> another solution to this.
>
> For instance, say I'm checking disk space.  Warn at 85%, Crit at 90%.  I
> also want a notification at 95,96,97,98,99,100%.  I could easily exit 95
> for 95%, 96 for 96%, etc.  I believe this creates an unknown message.
> If I exit at 96, since this is a different exit code (but still unknown)
> would I get another notification?  I know, I can test this, but it seems
> clunky and I don't like the unknown status issue for historical
> tracking.
>
> Volatile services with passive checks that only submit on change is
> another option, but this presents issues with needing to do freshness
> checking and wanting to have active checks as much as possible.
>
> Are there any other solutions to this problem?  I know from a few
> archive threads there isn't much demand for this, but it seems like
> anytime you turn on stalking this would be a nice option (why wouldn't
> you want to be notified as your array degrades as per the example for
> stalking.)  Looking at the docs I don't see anything in 3.0 that will
> help with this either.
>
> Thanks,
> Mark
>
>




Re: [Nagios-users] Using nagios for reporting on non-machine data such as employees?

2007-04-24 Thread Aaron Devey
Kelly,
I admire your determination to use nagios in many versatile ways. 
Unfortunately, nagios is probably not the best fit in this case.  This 
is especially true if you really intend to use passive checks instead of 
leveraging nagios' powerful scheduling or notification features.  Plus, 
adding a new employee could be a potential nightmare if many systems are 
involved. The effort spent in getting nagios services set up and 
retrofitted the way you need would be much better spent on a more 
conventional solution. Your situation is fairly unique, so I am unable 
to think of any ready-to-go solutions.  In the long run it's probably 
easiest to periodically upload all this data to a database and put some 
php scripts together so you can view reports on that data over a web 
interface.
-Aaron Devey

Kelly Jones wrote:

>We have various systems that keep track of employee data: when an
>employee was last paid, hours of sick/vacation leave accrued,
>employee's laptop's last IP address (from DHCP server), last time
>employee's laptop was backed up (from backup server), whether employee
>is on-leave/traveling, whether the employee has been receiving email
>(vs employee's mailbox being full, account not setup properly, etc),
>etc.
>
>I realized we could use nagios' "passive service checks" to have the
>various systems upload employee data to our nagios server, but was
>wondering if this was fitting a round peg into a square hole.
>
>Is nagios a good tool for monitoring things that aren't machines? If
>not, what would be a good tool?
>
>One concern: nagios tends to treat data as almost "binary"-- either
>something is good (green) or bad (red) [yes, yellow + "unknown" also
>exist, but it's still almost binary]. In some cases, we're just
>looking to create an "employee status report" page that has text data
>on the employee (pushed from various servers), without necessarily
>categorizing the data as "good" or "bad".
>
>  
>

