[Nagios-users] Query About NAGIOS PLUGIN

2009-03-30 Thread Amit Kumar
An application is running on a remote server. I want to execute a Perl
script on that remote server to check its availability and a few other
things locally, and have whatever output the script produces show up on
the local Nagios monitoring server. Please help me out. I have done the
following steps:

Modification on the local Nagios server:
1.  In /usr/local/nagios/nrpe.cfg I made an entry like:

  command[check_nrpe]=/usr/local/nagios/libexec/check_nrpe -H


2.  In /usr/local/nagios/etc/objects/services.cfg I made the following entry:
   define service{
       use                  generic-service
       service_description  Netcool status
       check_command        check_nrpe!check_impact
   }
I restarted the Nagios server and the NRPE server on the local Nagios server.



Modification done on the remote server:
1.  In /usr/local/nagios/nrpe.cfg I made an entry like:

 command[check_impact]=/root/amit/nagioscheck/nagios_impact.pl -p=
   (this Perl script resides on the remote server)
I restarted the NRPE server on the remote server.


Now my question is: how can I get this working? I have tried most of the
things available on the Internet, but nothing seems to work.
If I am missing a step or some local configuration, please point it out.
It is giving me a hard time.

The NRPE version is the same on both the remote and the local Nagios server,
i.e. 2.0, and the check_nrpe utility on the local Nagios server is also 2.0.
Please help me with this.
Thanks in advance,


Amit Kumar
--
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

[Nagios-users] Interesting problem while trying to monitor Oracle RAC services

2009-03-30 Thread Kumar, Ashish
Hello,

We are facing an interesting but strange issue while trying to monitor
Oracle RAC services.

Oracle RAC is running on AIX 5.3 and nagios is running on Fedora Core 9.

The scripts we are using to monitor Oracle RAC services on AIX are as follows

-
$ cat check_oracle_services.sh

#!/usr/bin/ksh
# found on the Internet
RSC_KEY=$1

/oracle/crs_home/bin/crs_stat -u | awk \
    'BEGIN { FS="="; state = 0; } \
    $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1}; \
    state == 0 {next;} \
    $1~/TARGET/ && state == 1 {apptarget = $2; state=2;} \
    $1~/STATE/ && state == 2 {appstate = $2; state=3;} \
    state == 3 {printf "%-45s %-18s\n", appname, appstate; state=0;}'
-

$ cat check_oracle_services.pl

#!/usr/bin/env perl

use strict;
use Getopt::Std;

my %return_value = (
    OK => 0,
    CRIT => 2,
    UNKNOWN => 3
);

my $message = "nagios";
my $exit_status;

my %opt=();
getopts("p:h", \%opt);

sub usage(){
    print "Usage: $0 -p service_name\n";
    exit $return_value{'UNKNOWN'};
}

usage() if defined $opt{'h'};

my $SERVICE = $opt{'p'} if defined $opt{'p'} || usage();

# the following code was added to make sure that nrpe was not getting confused
# with dotted argument
if ($SERVICE =~ "foo") {
    $SERVICE = "ora.foo.bar.inst";
}

my $PIPED = qx/ ksh check_oracle_services.sh $SERVICE/;
print $PIPED;

if ($PIPED =~ /OFFLINE/g) {
    $exit_status = $return_value{'CRIT'};
    $message = "Critical: $SERVICE is not running.";
} else {
    $exit_status = $return_value{'OK'};
    $message = "OK: $SERVICE is running.";
}

print "$message\n";
exit $exit_status;
-

When we try to run this script on AIX (local system) the output is as follows:

[srv01@/home/nagios/nrpe/libexec]$ perl check_oracle_services.pl -p foo
ora.foo.bar.inst OFFLINE
Critical: ora.foo.bar.inst is not running.

[srv01@/home/nagios/nrpe/libexec]$ perl check_oracle_services.pl -p ora.foo.bar.inst
ora.foo.bar.inst OFFLINE
Critical: ora.foo.bar.inst is not running.

The service is indeed offline.

[srv01@/home/nagios/nrpe/libexec]$ perl check_oracle_services.pl -p ora.foodb.bardb1.inst
ora.foodb.bardb1.inst ONLINE on srv01
OK: ora.foodb.bardb1.inst is running.


Now when we run the same check from the Nagios server, it reports the
services as online even when they are not:

[r...@nagios libexec]# ./check_nrpe -n -H 10.0.10.20 -c check_oracle_services -a ora.foo.bar.inst
OK: ora.foo.bar.inst is running.

[r...@nagios libexec]# ./check_nrpe -n -H 10.0.10.20 -c check_oracle_services -a foo
OK: ora.foo.bar.inst is running.

It is strange that we get the correct status when the scripts are
executed locally but the wrong status when they are executed
remotely.

Has anyone faced a similar issue? I would appreciate it if someone could
give some insight into this.

Thanks



Re: [Nagios-users] Query About NAGIOS PLUGIN

2009-03-30 Thread Kumar, Ashish
> Now my question is: how can I get this working? I have tried most of the
> things available on the Internet, but nothing seems to work.
> If I am missing a step or some local configuration, please point it out.
> It is giving me a hard time.

Have you tried printing the message at the end of the script?

It's hard to comment unless you show the contents of the script.



Re: [Nagios-users] Add users to contact group

2009-03-30 Thread Kumar, Ashish
> I would like to add my Active Directory users to nagios contact group.
> I have edited /etc/nagios/objects/contacts.cfg file and added:
>
> define contactgroup{
>        contactgroup_name       ad
>        alias                      ad
>        members                 user1, user2
>        }

First you have to define user1 and user2 in your
/etc/nagios/objects/contacts.cfg:

define contact{
   contact_name   user1
   use            generic-contact
   alias          user1
   email          us...@example.com
}

define contact{
   contact_name   user2
   use            generic-contact
   alias          user2
   email          us...@example.com
}


> On the same file I defined contact:
>
> define contact{
>        contact_name            ad
>        use                     generic-contact
>        alias                   ad
>        email                   email
> }


Then define the contactgroup:

define contactgroup{
   contactgroup_name   ad
   alias               ad
   members             user1, user2
   }
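
The ordering advice above boils down to one rule: every name listed in a
contactgroup's "members" must also exist as a contact_name. A minimal sketch
of that cross-check follows, assuming simple object definitions with no nested
braces and ignoring templates and inheritance. The helper names are made up
for this illustration; Nagios's own pre-flight check (nagios -v with the main
config file) remains the authoritative test.

```python
import re

# Matches "define <type>{ ... }" blocks; assumes no nested braces.
BLOCK = re.compile(r"define\s+(\w+)\s*\{(.*?)\}", re.DOTALL)


def parse_objects(text):
    """Return a list of (object_type, attributes) pairs from config text."""
    objects = []
    for kind, body in BLOCK.findall(text):
        attrs = {}
        for line in body.splitlines():
            line = line.split(";", 1)[0].strip()  # drop ';' inline comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            attrs[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        objects.append((kind, attrs))
    return objects


def undefined_members(text):
    """Map each contactgroup name to the members that have no contact definition."""
    objects = parse_objects(text)
    contacts = {
        attrs["contact_name"]
        for kind, attrs in objects
        if kind == "contact" and "contact_name" in attrs
    }
    missing = {}
    for kind, attrs in objects:
        if kind != "contactgroup":
            continue
        members = [m.strip() for m in attrs.get("members", "").split(",") if m.strip()]
        absent = [m for m in members if m not in contacts]
        if absent:
            missing[attrs.get("contactgroup_name", "?")] = absent
    return missing
```

Run against a contacts.cfg that defines only the "ad" group, it would report
both user1 and user2 as missing, which matches the situation in the question.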



Re: [Nagios-users] Query About NAGIOS PLUGIN

2009-03-30 Thread Amit Kumar
How can I run check_nrpe on the local server?
Please give me some information: will it run as a service on the local
server, like nrpe, or is it something else? Please clarify.
Here is the content of the script residing on the remote server. I am
just doing a simple check using the files currently in use.
#!/usr/bin/perl
use Getopt::Long;

### Impact ###
$PathOfBaselineFile = "/local/netcool/nagios/baseline";


my %ERRORS=('OK'=>0,'WARNING'=>1,'CRITICAL'=>2,'UNKNOWN'=>3,'DEPENDENT'=>4);
# Argument processing
my $args = GetOptions (
   "paname|p=s" => \$paname);
$pa = substr($paname,0,6);

if($paname ne '') {
    # parameter check: both the live and the baseline config must exist
    if (( -e "/usr1/netcool/omnibus/etc/$paname.conf") && ( -e "$PathOfBaselineFile/$paname.conf"))
    {
        # keep only the ncoadmin lines, ignoring lmgrd and DeackStuckEvents.sh
        $imp_cmd = `cat /usr1/netcool/omnibus/etc/$paname.conf | grep ncoadmin | egrep -v lmgrd | egrep -v DeackStuckEvents.sh`;
        $imp_cmd_base = `cat $PathOfBaselineFile/$paname.conf | grep ncoadmin | egrep -v lmgrd | egrep -v DeackStuckEvents.sh`;
        if($imp_cmd eq $imp_cmd_base)
        {
            print "OK: Impact Command line is same ! \n";
            exit $ERRORS{'OK'};
        }
        else
        {
            print "CRITICAL: Impact Command line is not same ! \n";
            exit $ERRORS{'CRITICAL'};
        }
    }
    else
    {
        print "$paname configuration file doesn't exist!","\n";
        exit $ERRORS{'CRITICAL'};
    }
}
else
{
    # no -p argument: report UNKNOWN instead of exiting silently with status 0
    print "UNKNOWN: no -p argument given\n";
    exit $ERRORS{'UNKNOWN'};
}

Thanks,
Amit


Re: [Nagios-users] Interesting problem while trying to monitor Oracle RAC services

2009-03-30 Thread Giorgio Zarrelli
Hi,

Check the environment of the users launching the script. Which user do you
use to launch the script locally? And which one remotely?

Giorgio
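
One detail in the original post may be worth checking alongside the
environment: in the local runs the script prints the crs_stat line (for
example "ora.foo.bar.inst OFFLINE") before its verdict, but in the check_nrpe
runs that line is missing and only the "OK:" line comes back. That would fit
the helper script producing no output at all under NRPE (for instance because
of a different working directory, PATH or Oracle environment), in which case
the /OFFLINE/ test cannot match and the plugin falls through to OK. Below is a
minimal sketch of a more defensive classification, written in Python rather
than the original Perl purely for illustration. The function name and the
message wording are assumptions, not part of the posted script.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def classify(service, helper_output):
    """Map the helper script's output to an (exit_code, message) pair."""
    text = helper_output.strip()
    if not text:
        # No output at all: the helper probably did not run, so do not
        # report OK just because the word OFFLINE is absent.
        return UNKNOWN, f"UNKNOWN: no output for {service}"
    if "OFFLINE" in text:
        return CRITICAL, f"CRITICAL: {service} is not running."
    if "ONLINE" in text:
        return OK, f"OK: {service} is running."
    return UNKNOWN, f"UNKNOWN: unexpected output for {service}: {text}"
```

With this shape, an empty result from the helper shows up in Nagios as
UNKNOWN, which points at the plugin environment rather than at the service.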


Re: [Nagios-users] nagios 3.0.3 on FreeBSD defunct process

2009-03-30 Thread Gian Paolo Buono
Hi,
dmesg doesn't show any log entries :( ...
In top, nagios accounts for about 50% of the I/O. Is that normal, or is it too high?

[r...@nagios ~]# top -m io -d1
last pid:   944;  load averages:  3.25,  4.45,  4.78    up 2+22:45:38  11:17:11
104 processes: 4 running, 100 sleeping
CPU: % user, % nice, % system, % interrupt, % idle
Mem: 128M Active, 1891M Inact, 296M Wired, 896K Cache, 112M Buf, 693M Free
Swap: 4096M Total, 1936K Used, 4094M Free

  PID USERNAME     VCSW   IVCSW  READ   WRITE   FAULT   TOTAL PERCENT COMMAND
  937 www             0      10     0       0       0       0   0.00% extinfo.cgi
 1316 nagios   25461482 4236551    89 1661948 3626424 5288461  49.63% nagios
  697 www            19       1     0       0       0       0   0.00% httpd
97859 www            30       2     0       0       0       0   0.00% httpd
93467 www           102      15     0       0       0       0   0.00% httpd
97887 www            33       0     0       0       0       0   0.00% httpd
97891 www            20       1     0       0       0       0   0.00% httpd

Thank you very much... bye...


On Fri, Mar 27, 2009 at 9:18 PM, Roy Sigurd Karlsbakk wrote:

> On 27. mars. 2009, at 16.43, Gian Paolo Buono wrote:
>
> > 2 251 0   2935M  2180M 28764   0   0   0 12098   0 137  385 64794 30784 34 24 41
>
>
> That is 251 processes blocking for some reason, and that is quite bad.
> Check kernel logs (dmesg) and so on. Perhaps you have a faulty drive
> or something? Memory doesn't seem to be an issue.
>
> roy
> --
> Roy Sigurd Karlsbakk
> (+47) 97542685 / 98013356
> r...@karlsbakk.net
> --
> In all pedagogy it is essential that the curriculum be presented
> intelligibly. It is an elementary imperative for all pedagogues to
> avoid excessive use of idioms of foreign origin. In most cases
> adequate and relevant synonyms exist in Norwegian.

Re: [Nagios-users] Interesting problem while trying to monitor Oracle RAC services

2009-03-30 Thread Kumar, Ashish
> check the environment of the users launching the script. Which user do you
> "use" to launch the script locally? And which one from remote?

Thank you for the response.

On the remote system (AIX) I am using the nagios user to execute the script.
Since the nagios user cannot execute crs_stat, we have added the group
oinstall as its secondary group.

$ id
uid=207(nagios) gid=1(staff) groups=300(oinstall)

On the Nagios server I have tried executing it as root as well as the
nagios user, but the problem remains.

Thanks



Re: [Nagios-users] check_ping with maximum MTU

2009-03-30 Thread Andreas Ericsson
Vian Vian wrote:
> Hello,
> 
> I wonder what the default MTU in check_ping is. How can I set the
> check_ping command to the maximum of 1472?
> 

Why would you want an MTU of 1472 for something that in 99.9% of its
uses transfers only 64 bytes per packet?
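
For context, the two numbers in this exchange can be related with simple
arithmetic. The sketch below assumes IPv4 without header options and a
standard 1500-byte Ethernet MTU; under those assumptions 1472 is the largest
ICMP echo payload that fits in one unfragmented packet, and 64 bytes is the
common ping default of 56 data bytes plus the ICMP header. The function names
are made up for this illustration.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request/reply header


def max_icmp_payload(mtu=1500):
    """Largest ICMP echo payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def icmp_message_size(payload):
    """Size of the ICMP message (header plus data) for a given payload."""
    return payload + ICMP_HEADER


print(max_icmp_payload(1500))  # 1472
print(icmp_message_size(56))   # 64
```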

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



[Nagios-users] MySQL.SOCK Location in NDO2DB

2009-03-30 Thread Ken Netzorg
Hopefully this question can be posted here, as I haven't found a group
specific to NDO2DB  =)
I just installed NDO2DB and am looking to put the data into a MySQL DB on
the same server using UNIX sockets. MySQL has been configured to put the
socket file in /tmp/mysql.sock, but when I start NDO2DB it looks for the file
in /var/lib/mysql/mysql.sock.

I have looked through the ndo2db.cfg file and cannot find a place to
specify where ndo2db should look for the socket file. Is there an
undocumented line I can add to override the default location? I know I can
always fall back to adding a link myself in the /var/lib/mysql directory to
/tmp, but I'd rather not go that route and have soft links here and there;
I would prefer to configure it through the central config file. (I've already
specified the override location through the mysql.default_socket directive
in the PHP.INI file and PHP is working fine.)

I looked at the ./configure --help output and it doesn't look like I can
configure it by re-compiling.

Thoughts?

Thanks!
Ken

Re: [Nagios-users] NRPE vs. check_by_ssh

2009-03-30 Thread Andreas Ericsson
Kevin Keane wrote:
> Andreas Ericsson wrote:
>> Kevin Keane wrote:
>>> Christopher McAtackney wrote:
>>>> 2009/3/25 Kevin Keane:
>>>>> I think you are comparing apples and oranges here, because in most
>>>>> situations that I can think of, the decision is dictated by the
>>>>> network topology. If you are exclusively on a trusted private network,
>>>>> check_by_ssh really doesn't offer any benefits. Conversely, if your
>>>>> topology involves the Internet or some other untrusted network (WiFi),
>>>>> then you wouldn't want NRPE in the first place.
>>>>>
>>>>> The only exception to the above that I can think of is when it comes
>>>>> to deciding between using check_by_ssh over an untrusted network, vs.
>>>>> NRPE through some other kind of tunnel or VPN. But in that case, you'd
>>>>> incur encryption overhead either way, and the comparison is very
>>>>> different from the question you asked.
>>>>>
>>>>> All that said: I don't have any first-hand experience, but I suspect
>>>>> that the impact of establishing 2200 ssh connections in a five-minute
>>>>> span (assuming that you are using a five-minute check interval) is
>>>>> pretty substantial. The main impact actually lies in establishing and
>>>>> tearing down the connections, key negotiations etc.; the encryption
>>>>> during the data phase probably has only limited impact because most
>>>>> checks only transmit a few bytes back and forth.
>>>>>
>>>>> SSH does much better with longer-duration connections when the keys
>>>>> are already exchanged. This is even more true if you have a
>>>>> router-based VPN, because in that case the overhead is offloaded to a
>>>>> different machine.
>>>>>
>>>>> So if you have the option of sending the checks as NRPE through one or
>>>>> a few long-term VPNs: you are probably going to be better off. Of
>>>>> course, in the big picture, your mileage may vary.
>>>>>
>>>> Firstly, thanks for the detailed explanation of the issues involved in
>>>> this choice Kevin, it's been very helpful.
>>>>
>>>> I'm curious though, could you elaborate on why NRPE is unsuitable if
>>>> communication with my remote hosts is going to go via the Internet? Is
>>>> it not sufficient that NRPE uses SSL? This may be more of a network
>>>> security question than a Nagios one, but I've no real experience in
>>>> either area unfortunately, so I appreciate any info you can give here.
>>>>
>>> No, you are right. I wasn't aware that NRPE could use SSL. In that
>>> case, NRPE would be pretty much the same in terms of performance as SSH.
>>>
>>> That said, I am generally concerned from a security standpoint about
>>> any kind of active checks going over the Internet. This is because if
>>> you are monitoring, in your example, 200 hosts, you have to poke
>>> holes into 200 firewalls (or into one firewall, and then set up SSL
>>> or SSH keys on 200 hosts). That's 200 potential security holes all
>>> over the place with little or no control, and on machines that may
>>> not necessarily be hardened for access from the outside world. Worse:
>>> active checks, by nature, cause a program to be launched and
>>> executed on the monitored client, and usually with very high
>>> permissions. You said that you check 2000 services, so that's 2000
>>> plugins (give or take a few). What if a hacker found a way to
>>> compromise one of your 2000 plugins? You'd have a privilege
>>> escalation issue along with remote-launch capability. On 200 clients.
>>>
>> Very high permissions are normally not needed.
> Depends on the plugin, but I'm not sure that this is generally true. For
> instance, something as simple as log file analysis either requires root
> permission on Linux (log files aren't readable by anybody else), or it
> requires that you relax file permissions or security somewhere else.

If you do the insane version of log analysis, yes. A sane setup is to
have filters trigger on certain patterns and have the filtering program
log its results somewhere that Nagios can read. The actual logs need
never (and should never) be readable by the Nagios user.
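
As a rough illustration of that split (not Andreas's actual setup): a
privileged filter writes matching lines to a small results file, and an
unprivileged plugin only ever reads that file. The file path, the thresholds
and the one-match-per-line format are all assumptions made up for this sketch.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def check_filter_results(path, warn=1, crit=5):
    """Turn a filter's results file into a Nagios-style (code, message) pair.

    `path` is a hypothetical file written by a privileged log filter, one
    matched line per row; the plugin itself never opens the real logs.
    """
    try:
        with open(path, encoding="utf-8") as handle:
            hits = [line.rstrip("\n") for line in handle if line.strip()]
    except OSError as exc:
        return UNKNOWN, f"UNKNOWN: cannot read {path}: {exc.strerror}"
    if len(hits) >= crit:
        return CRITICAL, f"CRITICAL: {len(hits)} matches, last: {hits[-1]}"
    if len(hits) >= warn:
        return WARNING, f"WARNING: {len(hits)} matches, last: {hits[-1]}"
    return OK, "OK: no matches reported"
```

A real plugin would print the message and exit with the code. The point of the
design is that only the results file needs to be readable by the Nagios user.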

> On Windows, I'm running my monitoring agent (by default) as the Local
> System account (most Windows services do that anyway). That has
> basically full access to everything, but nothing on the network.
> 

Well, Windows is an aberration wrt privilege separation and that's
not going to change in the near future because privilege separation
makes things hard for home users. I'm sure you can create limited
accounts under Windows too though. Otherwise I doubt any security-
minded organization would use it.

> Of course check_ping, check_tcp etc. don't usually need such high
> permissions.

check_ping actually requires root permissions on most systems. Or
rather, the program doing the actual pinging does, since it has to
open a raw socket.

>> I prefer using NRPE because
>> of two reasons:
>> 1. It provides a rather simple way of specifyin

Re: [Nagios-users] Best reporting tools

2009-03-30 Thread Andreas Ericsson
Andrew Davis wrote:
> I'm looking to add some reporting functionality to Nagios... something 
> that can report on hostgroups and servicegroups, among others. 
> NagiosExchange lists NagiosSLA and NaReTo. NagiosSLA looks good but is 
> flagged as being for 2.x only and we're running 3.x. NaReTo looks like 
> something to look at as well, but is somewhat old.
> 
> Can you guys share what you're doing via addons or customizations (or 
> even the built in, stock tools) for reporting (daily, weekly, monthly, 
> etc)? Any options preferred over others?
> 

We're using our own home-brewed version of Availability and SLA reporting.
We had a look at what was out there and a lot of testing determined that
none of them scaled as well as what we'd hoped. Instead we had to write
our own, which uses quite a lot of tricks to get 100% accurate numbers
while at the same time supporting huge installations with rather modest
disk space, memory and cpu power requirements.

It uses an API to get the figures, so if you have some other tool where
you can feed raw numbers and get reports formatted specifically to your
needs (instead of barcharts, piecharts and fairly boring numbers), that
would work too.

You can download the sources from http://www.op5.org. There's a sandbox
set up at www.op5.com/sandbox where you can create reports to see how
it works and looks. Contact me off-list if you want a password without
registering.

Let me know if you need help getting started.

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



Re: [Nagios-users] check_ntp on blade systems

2009-03-30 Thread Andreas Ericsson
Chris wrote:
> I'm using check_ntp for some of my blade systems and it says: "NTP
> CRITICAL: No response from NTP server" By using the verbose mode I can
> see it's sending:
> 
> "sending request to peer 0
> re-sending request to peer 0
> re-sending request to peer 0
> re-sending request to peer 0
> re-sending request to peer 0
> re-sending request to peer 0
> re-sending request to peer 0
> re-sending request to peer 0
> re-sending request to peer 0
> 
> NTP CRITICAL: No response from NTP server"
> 
> I used tcpdump from the blade server and can see packets coming in
> from the nagios server.
> 
> I use check_ntp to monitor many of my other systems without any problem.
> 
> I also do lots of other checks (including various SNMP checks) on the
> blade system via Nagios without any problem.
> 

It seems your blade server isn't running an NTP daemon in listening
mode.
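
One way to test that independently of check_ntp is to send a bare client-mode
NTP request to UDP port 123 and see whether anything comes back. The sketch
below is an illustration under assumptions, not what check_ntp does
internally, and the function names are invented here. A timeout only shows
that no reply arrived; it cannot tell a missing daemon from a daemon or
firewall that refuses to answer.

```python
import socket

NTP_PORT = 123


def build_request():
    """Build a minimal 48-byte NTP client request."""
    # First byte: leap indicator 0, version 3, mode 3 (client) = 0b00011011.
    return b"\x1b" + 47 * b"\x00"


def ntp_responds(host, timeout=5.0):
    """Return True if the host answers a client-mode NTP request in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_request(), (host, NTP_PORT))
            data, _ = sock.recvfrom(512)
        except OSError:  # includes timeouts and unreachable-port errors
            return False
    # A server reply is at least 48 bytes and carries mode 4 in the low 3 bits.
    return len(data) >= 48 and (data[0] & 0x07) == 4
```

Calling ntp_responds() with the blade's address from the Nagios server should
return True for a host that answers time queries and False otherwise.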

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



Re: [Nagios-users] Uniquely Identifiable Events in Nagios.log?

2009-03-30 Thread Andreas Ericsson
Christopher McAtackney wrote:
> Hi all,
> 
> I was wondering if it is possible to configure Nagios to produce
> uniquely identifiable entries in the nagios/var/nagios.log file?
> 
> The reason I ask, is that I would like to parse this log file for
> service check results and perform further processing based on the
> values discovered there. The trouble is, that as far as I can see,
> Nagios uses a time-stamp which is only accurate to the second, and so
> my log files have lines which all have the same time-stamp. Is there a
> way to increase the accuracy of this time-stamp perhaps? Or any other
> suitable solution to the general problem of identifying log entries?
> 

Since you almost certainly want to maintain a single process to handle
the logged lines, why not just write a tail-like program that parses
them one by one as they're written? After all, you'd hardly want to
slog through all the log-entries multiple times anyway.
With this solution, I find it hard to see a need for uniqueness. Should
it happen that you still want to be able to uniquely identify lines, I
believe a hash over the last 15 or so lines should suffice, assuming
it's sufficiently strong (say, SHA1 or something). The chance of running
into a collision should be very slim, and if it happens you can just
increase the number of hashed lines.
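
A minimal sketch of that idea, assuming the log is read sequentially and that
an identifier derived from a line plus the lines just before it is good
enough. The function name and the default window of 15 lines are taken from
the suggestion above or made up for illustration; nothing here is a Nagios
feature.

```python
import hashlib
from collections import deque


def line_ids(lines, window=15):
    """Yield (sha1_hex, line) pairs for each log line.

    The digest covers the current line together with up to window - 1
    preceding lines, so two identical lines get different identifiers
    as long as their recent context differs.
    """
    recent = deque(maxlen=window)
    for raw in lines:
        line = raw.rstrip("\n")
        recent.append(line)
        digest = hashlib.sha1("\n".join(recent).encode("utf-8")).hexdigest()
        yield digest, line
```

Passing an open nagios.log file object as "lines" processes entries one by one
as they are read. Note that a run of more than "window" identical consecutive
lines would still collide, which is the case where the window needs to grow.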

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



Re: [Nagios-users] Max concurrent service checks

2009-03-30 Thread Andreas Ericsson
Gian Paolo Buono wrote:
> Hi,
> I don't have any nfs mount on this server, and can not find the problem..
> I think that the problem is the raid controller...
> 

That would indeed put processes in uninterruptible I/O wait, since the
kernel will refuse to let processes run while it's waiting for a response
from the hardware.
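
To see which processes are stuck that way, the state column of ps can be
filtered for "D" (disk or other short-term uninterruptible wait in FreeBSD's
ps). Below is a minimal sketch that parses text captured from something like
"ps -axo pid,state,comm". The exact command and the three-column layout are
assumptions for this illustration, and the helper name is made up.

```python
def blocked_processes(ps_output):
    """Return (pid, command) for every row whose state starts with 'D'.

    Expects three whitespace-separated columns (pid, state, command)
    preceded by one header row.
    """
    blocked = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) == 3 and fields[1].startswith("D"):
            blocked.append((int(fields[0]), fields[2]))
    return blocked
```

If the same few commands keep appearing in that list across several samples,
that narrows down what is waiting on the controller.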

> [r...@server /usr/local/etc/nagios]# dmesg | grep -i raid
> aac0:  port 0x5000-0x50ff mem
> 0xc9e0-0xc9ff,0xc7fe-0xc7ff irq 17 at device 0.0 on pci4
> aac0: ServeRAID 8k-l  , aac driver 2.0.0-1
> aacd0:  on aac0
> 
> but I don't have any log entries about this. Any suggestions?
> 

Try using the same hardware but with a different kernel (Windows or Linux)
that has another driver. If the driver *or* the controller is broken,
you'll get unkillable processes.

If the raid hardware is broken, you need to replace the hardware.
If the BSD raid driver is buggy, you need to either get new (and better
supported) hardware, or change the OS.


-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-30 Thread Andreas Ericsson
Jarrod Moore wrote:
> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>> Jarrod Moore wrote:
>>> Hello everyone,
>>>
>>> I have a couple of related questions regarding service dependencies in
>>> Nagios and their limitations. I have two service checks (let's call
>>> them A and B) and service A depends on service B to function
>>> correctly. I want to set Nagios up so that if service B crashes then
>>> both services A and B are put into the critical state in Nagios. I've
>>> tried using service dependencies in Nagios to represent this behaviour
>>> but have yet to be successful. I can only get it to suppress
>>> notifications of service A if both services go down.
>>>
>> This is expected behaviour. If A is truly dependent on B, then A will
>> turn into a non-ok state of its own volition rather than as a result
>> of any dependency magic. Dependencies are designed as a means of
>> suppressing notifications. Otherwise, you would *always* get a
>> notification for B first, and a minute or so later from A (actually,
>> without the dependency you could get from A first).
>>
>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>> would be logical that if a service depends on another service and the
>>> service depended on dies then all services depending on it would fail
>>> their checks as well, but there's probably some scenario where it
>>> doesn't work so well. I've had a look through the mailing list
>>> archives and found someone had asked a similar question to the
>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>> answer, so I thought I might ask whether solutions to this type of
>>> problem had been developed since then.
>>>
>> They haven't. You're using dependencies the wrong way, really. If
>> A is truly dependent on B and doesn't go into a non-ok state after
>> B has crashed, then your check isn't doing what it's supposed to do,
>> or you've misunderstood the relationship somehow.
>>
>> If you were to explain what the two services actually are, it would
>> be easier to point you to a solution that works.
>>
> 
> Well basically I have a map (similar to Google Maps) embedded in a
> website, which hits a URL to retrieve maps. So I have one check using
> check_http to check that the website itself is up and another check on
> that URL to make sure that the map service is available. Now if the
> map service goes down, the website is still up but the maps won't
> appear, which means the website's functionality is significantly
> affected. However, it is still up and viewable so doing a check on the
> website URL still passes.
> 

It sounds to me like you'd want to make the map-check dependent on
the webserver-check. That would suppress notifications from the
map-check when it's the webserver that's bombing out. Do you really
need two notifications when the map-service goes offline?

> Now of course I could just write a script or something to check both
> URLs and set that as the check command. There is a problem for me with
> this approach, however, because I have some other instances where a
> web service depends on other web services.

Define "depend". As I understand the definition, carbon-based lifeforms
on our fine planet depend on water and sunlight; life cannot function
properly without them.
It sounds like you want to make sunlight depend on carbon-based lifeforms,
because without the life, the sun is rather pointless.

Instead of trying to coerce dependencies to work backwards, I'd sit
down and think what you want your Nagios installation to do for you,
and why you would want two services to go critical when one of them
does. Isn't one notification and one red blob in the UI enough? If
it isn't, what do you hope to gain from having two notifications and
two red blobs?

> When I want to use these
> services in websites, I'd then have to write a check for each script,
> each containing every service in the chain that is needed to display
> the website correctly. This way of doing things just seems a bit
> repetitive to me, especially when I have a check for these web
> services already.

I'm sorry, but I still fail to see the point. Perhaps you'd be better
off defining each website as a servicegroup containing all of the
services that make up the visitor experience for that site. That would
give you some sort of visualization of which (Nagios-)services affect
which customer-services, while at the same time keeping configuration
work to a minimum.

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  

Re: [Nagios-users] Nagios and MySQL

2009-03-30 Thread Ricardo Maraschini
Hi Mark,

Always use the list for your questions.

- "Mark Weaver"  wrote:
> Now the nagios configuration exists both on disk in the flat files and
> within the database that was created. Which then does nagios respond to
> when new hosts are created and put into play? The ones in the database
> or the ones on disk or both?

Nagios still reads only the plain configuration files.


OpCfg is a configuration utility. After the first installation you must run
an import, which loads all the original configuration files into the MySQL
database. From then on, use OpCfg to make your configuration changes, so
you can take advantage of its features.

Once you have finished a change in OpCfg, run an "export" from the OpCfg
interface; the plain files are then updated from the MySQL configuration
and the nagios process is restarted.

You can find more details on our mailing list at
http://www.opmon.org/get-involved


-rm



Re: [Nagios-users] Nagios and MySQL

2009-03-30 Thread Ricardo Maraschini

- "Mark Weaver"  wrote:
> I've got OpCfg installed and I "think" it's working, and if I understand
> things so far it's imported all my nagios settings including objects and
> templates and such into the db. I can make changes, adjustments, etc...
> with OpCfg and then I've got to export those changes or additions back
> out to the Nagios config on disk. However it doesn't appear to be doing
> that and I'm not sure why.

Exactly.


> All Nagios configuration files for everything on the Nagios server are in
> /etc/nagios and OpCfg appears to be aware of that. I've made /etc/nagios
> and everything under that directory writable for nagios.apache, but when
> I make changes such as create a workstation template and then add a host
> made from that template the files aren't being written back to disk.

Are you doing an "export" at the end of the process?
Are there any error messages related to the export process in the Apache log files?

I think we should move this discussion to the opmon.org mailing list.
There we have a group of developers who can help you with this configuration.

http://www.opmon.org/get-involved

-rm



[Nagios-users] Monitoring applications through NSCA

2009-03-30 Thread Kumar, Ashish
Hello,

I am planning to move our application/process monitoring setup from active
to passive monitoring.

I read
http://nagios.sourceforge.net/download/contrib/documentation/misc/NSCA_Setup.pdf
but it doesn't really explain how to set up the scripts or how monitoring
is done with NSCA, for example what the configuration entries for them
would look like.

I tried searching online and in the nagios-users archive without luck.  I
would appreciate it if someone could share their configuration or maybe
provide some pointers on how to set it up.

Thanks.



[Nagios-users] Bug, Feature or Layer 8 problem?

2009-03-30 Thread Andre Timmermann
Hello List,

we are generating the timeperiods for our oncall-support via script.
Sometimes, one line of the configfile seems to be ignored:

define timeperiod{
timeperiod_name andre_2pikett_week
alias   Andre 2nd Pikett

2009-03-29  09:00-24:00
2009-03-30  00:00-09:00
2009-03-30  18:00-24:00
2009-03-31  00:00-09:00
}

In this case there is a problem with the second definition of
2009-03-30, it will be ignored. So I will not get any messages between
18:00 and 24:00.

Of course, if I use
2009-03-30  00:00-09:00,18:00-24:00 
it will work.

Unfortunately there is no message during startup of nagios, stating that
it will ignore one line of my configfile.

Is this a bug or a feature? ;)

I am using Version 3.0.6 from Debian lenny.

-- 
Mit freundlichen Grüssen

Andre Timmermann
Nine Internet Solutions AG, Albisriederstr. 243c, CH-8047 Zuerich



[Nagios-users] Forwarding through intermediate nodes?

2009-03-30 Thread Chris Pepper
We have a couple small HPC compute clusters, and would like to monitor 
our nodes. They aren't large enough to justify their own Nagios 
installations on the head nodes, and the heads aren't particularly 
trusted in our network topology.

But we would like to monitor health of the compute nodes, on a Nagios 
server which *cannot* connect to the nodes directly.

I didn't find this in the wiki, SF, or Exchange, and it doesn't look 
like something I could do with a single NSCA.

Is anyone doing this, or forwarding/tunneling Nagios traffic for 
another reason? I'm considering running a couple ssh tunnels on the head 
node, pointing back to the monitoring servers, but not sure how well 
this will work.

Suggestions welcomed.

Thanks,

Chris

-- 
Chris Pepper:
  



Re: [Nagios-users] MySQL.SOCK Location in NDO2DB

2009-03-30 Thread Andreas Ericsson
Ken Netzorg wrote:
> Hopefully this question can be posted here as I haven't found a group
> specific to NDO2DB  =)
> I just installed NDO2DB and am looking to put the data into a MySQL DB on
> the same server using UNIX socks. MySQL has been configured to put the sock
> file in /tmp/mysql.sock but when I start NDO2DB, it is looking for the file
> in /var/lib/mysql/mysql.sock.
> 
> I have looked through the ndo2db.cfg file and cannot find a location to
> specify where ndo2db should look to find the sock file. Is there an
> undocumented line I can add to override the default location? I know I can
> always fall back to adding a link myself in the /var/lib/mysql directory to
> /tmp, but I'd rather not go that route and have soft links here and there
> and just configure it through the central config file. (I've already
> specified the override location through the mysql.default_socket directive
> in the PHP.INI file and PHP is working fine.)
> 
> I looked at the ./configure --help output and it doesn't look like I can
> configure it through re-compiling
> 

NDO2DB is using the MySQL client libraries (libmysqlclient) on your
system. The default location of the socket is configured there.

> Thoughts?
> 

Does the regular mysql client work on the server where you've set this up?
If it doesn't, I assume you need to add an entry defining the socket
location in the [client] part of the my.cnf file. Also, if you're
using a non-standard location for your my.cnf file, libmysqlclient
won't find it unless you pass it the same environment variables as
you pass to the mysql daemon.
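
As a rough illustration of the [client] entry mentioned above, here is a
minimal Python sketch that parses my.cnf-style text and reports the socket
path a client section would provide. The sample contents, the helper name
client_socket and the use of Python's configparser are assumptions for
illustration only (the /tmp/mysql.sock value is taken from the question);
a real my.cnf may contain !include directives that this sketch does not
handle.

```python
import configparser

# Hypothetical my.cnf contents; the socket path is the one from the question.
SAMPLE_MY_CNF = """
[mysqld]
socket = /tmp/mysql.sock

[client]
socket = /tmp/mysql.sock
"""


def client_socket(my_cnf_text):
    """Return the socket path from the [client] section, or None if absent."""
    parser = configparser.ConfigParser(
        allow_no_value=True,   # my.cnf allows bare options such as "quick"
        strict=False,          # tolerate repeated sections or options
        interpolation=None,    # do not treat "%" in values specially
    )
    parser.read_string(my_cnf_text)
    return parser.get("client", "socket", fallback=None)


if __name__ == "__main__":
    print(client_socket(SAMPLE_MY_CNF))
```

If the function returns None for your file, the client library has no
[client] socket entry to read there and falls back to its default location.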

In general, ndoutils work just fine if you install MySQL and all
its client libraries from your distribution's package repositories.
If you start fiddling with compiling on your own, you may well end
up shooting yourself in the foot which, to me, appears to be exactly
what has happened here.

> Thanks!
> Ken
> 


-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



Re: [Nagios-users] Bug, Feature or Layer 8 problem?

2009-03-30 Thread Andreas Ericsson
Andre Timmermann wrote:
> Hello List,
> 
> we are generating the timeperiods for our oncall-support via script.
> Sometimes, one line of the configfile seems to be ignored:
> 
> define timeperiod{
> timeperiod_name andre_2pikett_week
> alias   Andre 2nd Pikett
> 
> 2009-03-29  09:00-24:00
> 2009-03-30  00:00-09:00
> 2009-03-30  18:00-24:00
> 2009-03-31  00:00-09:00
> }
> 
> In this case there is a problem with the second definition of
> 2009-03-30, it will be ignored. So I will not get any messages between
> 18:00 and 24:00.
> 
> Of course, if I use
> 2009-03-30  00:00-09:00,18:00-24:00 
> it will work.
> 
> Unfortunately there is no message during startup of nagios, stating that
> it will ignore one line of my configfile.
> 
> Is this a bug or a feature? ;)
> 

It's a feature, of sorts. Exceptions are parsed as they are encountered,
and timerange definitions are absolute. This means you actually have
to put all timeranges for a specific date on the same line.

Nagios not warning about it could possibly be a bug, but then again,
it's pretty nifty to be able to temporarily override a value by simply
specifying the variable again later in the object, so I'm not so sure.

Can't you make your script generate timeranges in the proper format?
It seems to me like that would by far be the simplest solution.
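
To make that concrete, here is a minimal Python sketch of how a generator
script could collect all timeranges for a date before writing a single
line per date. The function name merge_exceptions and the input layout are
assumptions for illustration; the dates and ranges are the ones from the
original post.

```python
from collections import OrderedDict


def merge_exceptions(entries):
    """Group (date, timerange) pairs so each date appears on exactly one line."""
    merged = OrderedDict()
    for date, timerange in entries:
        merged.setdefault(date, []).append(timerange)
    # One timeperiod line per date, ranges joined by commas.
    return ["{}  {}".format(date, ",".join(ranges))
            for date, ranges in merged.items()]


# Entries as the script currently emits them, one range per line.
ENTRIES = [
    ("2009-03-29", "09:00-24:00"),
    ("2009-03-30", "00:00-09:00"),
    ("2009-03-30", "18:00-24:00"),
    ("2009-03-31", "00:00-09:00"),
]

if __name__ == "__main__":
    for line in merge_exceptions(ENTRIES):
        print(line)
```

The two ranges for 2009-03-30 end up on one line, in the comma-separated
form that was reported to work.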

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



Re: [Nagios-users] Monitoring applications through NSCA

2009-03-30 Thread Curtis LaMasters
Will you be doing Windows or Linux monitoring?  If Windows, I have had
good luck with NSClient++.  They also have a good amount of how-to
material on their website.

Curtis LaMasters
http://www.curtis-lamasters.com
http://www.builtnetworks.com



On Mon, Mar 30, 2009 at 8:10 AM, Kumar, Ashish  wrote:
> Hello,
>
> I am planning to move application/process monitoring setup from active
> to passive monitoring.
>
> I read 
> http://nagios.sourceforge.net/download/contrib/documentation/misc/NSCA_Setup.pdf
> but it doesn't really help on how to setup scripts and how monitoring
> is done with NSCA, for example, how to setup scripts, what would be
> configuration entry for that.
>
> I tried searching online and nagios-users archive without luck.  I
> will appreciate if someone could share their configuration or may be
> provide some pointers on how to set it up.
>
> Thanks.
>



Re: [Nagios-users] Forwarding through intermediate nodes?

2009-03-30 Thread Andreas Ericsson
Chris Pepper wrote:
>   We have a couple small HPC compute clusters, and would like to monitor 
> our nodes. They aren't large enough to justify their own Nagios 
> installations on the head nodes, and the heads aren't particularly 
> trusted in our network topology.
> 
>   But we would like to monitor health of the compute nodes, on a Nagios 
> server which *cannot* connect to the nodes directly.
> 
>   I didn't find this in the wiki, SF, or Exchange, and it doesn't look 
> like something I could do with a single NSCA.
> 
>   Is anyone doing this, or forwarding/tunneling Nagios traffic for 
> another reason? I'm considering running a couple ssh tunnels on the head 
> node, pointing back to the monitoring servers, but not sure how well 
> this will work.
> 
>   Suggestions welcomed.
> 

I think I'd solve this using a small custom script that runs all the checks
you want against the nodes (I suppose all nodes require more or less identical
checks) and sends the results back to the Nagios server as passive checks.

If the head nodes aren't allowed to talk to Nagios, they could publish the
checkresults (along with a timestamp) through some other means, like http,
ftp or even just a simple netcat session where a polling script on the
Nagios server can fetch them later. Make sure to include a timestamp in the
results-file if you do that, so you can verify that the checks are actually
being run.
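
A minimal sketch of the results-file idea, in Python: the head node writes
each result as a timestamped line in the Nagios external command format for
passive service checks (PROCESS_SERVICE_CHECK_RESULT), and the polling side
rejects lines that are too old. The function names, the host and service
names and the 600-second limit are assumptions for illustration, not an
existing tool.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one passive service check line with a leading [timestamp]."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return "[{}] PROCESS_SERVICE_CHECK_RESULT;{};{};{};{}".format(
        ts, host, service, return_code, output)


def is_stale(result_line, max_age_seconds, now=None):
    """Return True if the line's timestamp is older than max_age_seconds."""
    now = time.time() if now is None else now
    ts = int(result_line[1:result_line.index("]")])
    return (now - ts) > max_age_seconds


if __name__ == "__main__":
    # Hypothetical node and service names.
    line = format_passive_result("node01", "load", 0, "OK - load 0.42")
    print(line)
    print("stale:", is_stale(line, 600))
```

The polling script would write the non-stale lines to the Nagios external
command file; stale lines are a sign that the checks have stopped running.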

Interesting problem. I'd take it kindly if you keep us posted :)

-- 
Andreas Ericsson   andreas.erics...@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



[Nagios-users] Internet speed

2009-03-30 Thread Idriss ARABBAJ
Hi all, 
I want to measure the Internet speed on my network. Is there already a
plugin for that?

Regards,
Idriss



Re: [Nagios-users] Forwarding through intermediate nodes?

2009-03-30 Thread Roman Fiedler
Andreas Ericsson wrote:
> I think I'd solve this using a small custom script that runs all the checks
> you want against the nodes (I suppose all nodes require more or less identical
> checks) and sends the results back to the Nagios server as passive checks.
>
> If the head nodes aren't allowed to talk to Nagios, they could publish the
> checkresults (along with a timestamp) through some other means, like http,
> ftp or even just a simple netcat session where a polling script on the
> Nagios server can fetch them later. Make sure to include a timestamp in the
> results-file if you do that, so you can verify that the checks are actually
> being run.
>
> Interesting problem. I'd take it kindly if you keep us posted :)

I'm using stunnel to forward the messages via intermediate nodes and I'm 
quite happy with it: Each intermediate node does a namespace 
transformation for the hostname (most of them just prepending the zone 
name), so that I can use the same minimal monitoring script on all 
leaf-nodes (which are sending the same "node name" for redundant and 
nearly identical nodes)

Since the name space transformation happens on the stunnel side closer to 
the nagios+apache server, no node can send an invalid nagios service 
identifier to fake messages for other nodes, and each connection is 
secured with its own client/server key pair to prevent message injection.

The tunnel will also do an additional input validation for the forwarded 
messages and output of "invalid" messages (for services/hosts just new 
to the tree) can be used to create nagios configuration automatically.
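
A minimal Python sketch of what such a per-zone transformation and
validation step might look like. This is not the actual setup described
above: the tab-separated host/service/return-code/output layout (as fed to
send_nsca), the "zone-host" naming and the function name transform are all
assumptions for illustration.

```python
VALID_RETURN_CODES = {"0", "1", "2", "3"}  # OK, WARNING, CRITICAL, UNKNOWN


def transform(line, zone, allowed_nodes):
    """Prefix the host name with the zone; return None for invalid messages."""
    fields = line.rstrip("\n").split("\t", 3)
    if len(fields) != 4:
        return None
    host, service, return_code, output = fields
    if host not in allowed_nodes or return_code not in VALID_RETURN_CODES:
        return None
    # The zone prefix is added on the trusted side, so a leaf node cannot
    # claim a name that belongs to another zone.
    return "\t".join(["{}-{}".format(zone, host), service, return_code, output])


if __name__ == "__main__":
    # Hypothetical zone and node names.
    print(transform("node01\tload\t0\tOK - load 0.42\n", "hpc1", {"node01"}))
```

Messages for which the function returns None correspond to the "invalid"
messages mentioned above, which could be logged and reviewed separately.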



[Nagios-users] Integrate "top" into output

2009-03-30 Thread Andrew Davis
I migrated from BB to Nagios. One of the things I used to do in BB was 
to show the output of "top" in the HTML page for that host. I don't 
see how to do something similar in Nagios. Is this possible, and does 
anyone have any suggestions on how I would go about it?


--


 A. Davis
 Email: ncc...@gmail.com

 "There is no limit to what a man can accomplish
  if he doesn't care who gets the credit." - Ronald Reagan


[Nagios-users] Assigning Services to Hosts/Hostgroups

2009-03-30 Thread Chris Pepper
We'd like to assign services to hostgroups or hosts (even host 
templates would be useful) rather than assigning them in the service.

The problem with assigning them in the service definition is that we 
have to touch 2 places each time we add a new host, which is error-prone.

Is this a FAQ? Is there a way to do it which I just haven't found yet?

Thanks,

Chris Pepper
-- 
Chris Pepper:
  



Re: [Nagios-users] Assigning Services to Hosts/Hostgroups

2009-03-30 Thread Marc Powell

On Mar 30, 2009, at 2:21 PM, Chris Pepper wrote:

>   We'd like to assign services to hostgroups or hosts (even host
> templates would be useful) rather than assigning them in the service.

http://nagios.sourceforge.net/docs/3_0/objecttricks.html#service

--
Marc




Re: [Nagios-users] Internet speed

2009-03-30 Thread Corey Chandler
Idriss ARABBAJ wrote:
> Hi all,   
> I want to measure the Internet speed on my network. Is there already a
> plugin for that?
>
> Regards,
> Idriss
>   

I tend to monitor bandwidth via Cacti; Nagios isn't really the best tool 
for trending...

-- 
Corey Chandler / KB1JWQ
Living Legend / Systems Exorcist
Today's Excuse: Me no internet, only janitor, me just wax floors




[Nagios-users] Children "unreachable" on soft down?

2009-03-30 Thread Israel Brewster
Does nagios (3.0.3) mark a child host as unreachable when its parent  
enters a soft down state? I am finding myself getting repeated down  
messages for a host (which is, in fact, down), even though I have  
notifications set to only send a single message. Looking at the logs,  
it would appear that what is happening is that the host is flipping  
between "down" (which notifies me) and "unreachable" (which does not).  
The parent host, however, never enters a hard down state. Looking at  
the logs, what I see is that one ICMP check fails, throwing the host  
into a soft down state, but the next one works just fine, bringing it  
back to an up state.

The logic works fine for the parent host- since it never hits a hard  
down state, it doesn't alert, and everyone is happy. But apparently  
with the child host every time this happens, it switches from critical  
to unreachable and back again, triggering a notification. Is there any  
way to keep this from happening? Thanks.

---
Israel Brewster
Computer Support Technician II
Frontier Flying Service Inc.
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7250 x293
---






Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-30 Thread Jarrod Moore
On Fri, Mar 27, 2009 at 5:43 PM, Matthias Flacke  wrote:
>
> Jarrod Moore wrote:
>> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>>> Jarrod Moore wrote:
>>>> Hello everyone,
>>>>
>>>> I have a couple of related questions regarding service dependencies in
>>>> Nagios and their limitations. I have two service checks (let's call
>>>> them A and B) and service A depends on service B to function
>>>> correctly. I want to set Nagios up so that if service B crashes then
>>>> both services A and B are put into the critical state in Nagios. I've
>>>> tried using service dependencies in Nagios to represent this behaviour
>>>> but have yet to be successful. I can only get it to suppress
>>>> notifications of service A if both services go down.
>>>>
>>> This is expected behaviour. If A is truly dependent on B, then A will
>>> turn into a non-ok state of its own volition rather than as a result
>>> of any dependency magic. Dependencies are designed as a means of
>>> suppressing notifications. Otherwise, you would *always* get a
>>> notification for B first, and a minute or so later from A (actually,
>>> without the dependency you could get from A first).
>>>
>>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>>> would be logical that if a service depends on another service and the
>>>> service depended on dies then all services depending on it would fail
>>>> their checks as well, but there's probably some scenario where it
>>>> doesn't work so well. I've had a look through the mailing list
>>>> archives and found someone had asked a similar question to the
>>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>>> answer, so I thought I might ask whether solutions to this type of
>>>> problem had been developed since then.
>>>>
>>> They haven't. You're using dependencies the wrong way, really. If
>>> A is truly dependent on B and doesn't go into a non-ok state after
>>> B has crashed, then your check isn't doing what it's supposed to do,
>>> or you've misunderstood the relationship somehow.
>>>
>>> If you were to explain what the two services actually are, it would
>>> be easier to point you to a solution that works.
>>>
>>
>> Well basically I have a map (similar to Google Maps) embedded in a
>> website, which hits a URL to retrieve maps. So I have one check using
>> check_http to check that the website itself is up and another check on
>> that URL to make sure that the map service is available. Now if the
>> map service goes down, the website is still up but the maps won't
>> appear, which means the website's functionality is significantly
>> affected. However, it is still up and viewable so doing a check on the
>> website URL still passes.
>>
>> Now of course I could just write a script or something to check both
>> URLs and set that as the check command. There is a problem for me with
>> this approach, however, because I have some other instances where a
>> web service depends on other web services. When I want to use these
>> services in websites, I'd then have to write a check for each script,
>> each containing every service in the chain that is needed to display
>> the website correctly. This way of doing things just seems a bit
>> repetitive to me, especially when I have a check for these web
>> services already.
>
> You can give check_multi a try (http://my-plugin.de/check_multi).
>
> It allows to combine multiple checks on plugin level and has a
> builtin state logic to evaluate the results of these checks.
> You can reuse the command files by implementing macros.
>
> If I understood your setup correctly the whole result should return
> CRITICAL if either the main website or the map are not accessible.
> This is the standard behaviour of check_multi and could be
> implemented like this:
>
> # foo.cmd
> # call: check_multi -f  -s URLWEB= -s
> URLMAP=
> command [ website ] = check_http ... -u $URLWEB$ ...
> command [ map     ] = check_http ... -u $URLMAP$ ...
>
> It should work already with these two statements like you expect it
> with simple check_http, only combined. If one of the child checks
> fails, the whole construct returns WARNING or CRITICAL.
>
> If you need more sophisticated RC determination, you can define
> it in perl syntax like this:
> state [ WARNING ] = website != OK || $website$=~/some evil output/
> state [ CRITICAL] = website >= WARNING && map != OK
>
> Cheers,
> -Matthias
>

Hi Matthias,

Thanks for the link. I've been checking (no pun intended) out
check_multi over the last day or two and I like it. My main concern
with this, though, is that if I had 10 websites that were dependent on
the m

Re: [Nagios-users] Monitoring applications through NSCA

2009-03-30 Thread Kumar, Ashish
> Will you be doing Windows or Linux monitoring.  If Windows, I have had
> good luck with NSClient++.  They also have a good amount of how-to on
> their website.

I will be monitoring Linux and AIX servers only.



Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-30 Thread Jarrod Moore
On Mon, Mar 30, 2009 at 10:13 PM, Andreas Ericsson  wrote:
> Jarrod Moore wrote:
>>
>> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>>>
>>> Jarrod Moore wrote:

>>>> Hello everyone,
>>>>
>>>> I have a couple of related questions regarding service dependencies in
>>>> Nagios and their limitations. I have two service checks (let's call
>>>> them A and B) and service A depends on service B to function
>>>> correctly. I want to set Nagios up so that if service B crashes then
>>>> both services A and B are put into the critical state in Nagios. I've
>>>> tried using service dependencies in Nagios to represent this behaviour
>>>> but have yet to be successful. I can only get it to suppress
>>>> notifications of service A if both services go down.

>>> This is expected behaviour. If A is truly dependent on B, then A will
>>> turn into a non-ok state of its own volition rather than as a result
>>> of any dependency magic. Dependencies are designed as a means of
>>> suppressing notifications. Otherwise, you would *always* get a
>>> notification for B first, and a minute or so later from A (actually,
>>> without the dependency you could get from A first).
>>>
>>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>>> would be logical that if a service depends on another service and the
>>>> service depended on dies then all services depending on it would fail
>>>> their checks as well, but there's probably some scenario where it
>>>> doesn't work so well. I've had a look through the mailing list
>>>> archives and found someone had asked a similar question to the
>>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>>> answer, so I thought I might ask whether solutions to this type of
>>>> problem had been developed since then.

>>> They haven't. You're using dependencies the wrong way, really. If
>>> A is truly dependent on B and doesn't go into a non-ok state after
>>> B has crashed, then your check isn't doing what it's supposed to do,
>>> or you've misunderstood the relationship somehow.
>>>
>>> If you were to explain what the two services actually are, it would
>>> be easier to point you to a solution that works.
>>>
>>> --
>>> Andreas Ericsson                   andreas.erics...@op5.se
>>> OP5 AB                             www.op5.se
>>> Tel: +46 8-230225                  Fax: +46 8-230231
>>>
>>> Considering the successes of the wars on alcohol, poverty, drugs and
>>> terror, I think we should give some serious thought to declaring war
>>> on peace.
>>>
>>
>> Well basically I have a map (similar to Google Maps) embedded in a
>> website, which hits a URL to retrieve maps. So I have one check using
>> check_http to check that the website itself is up and another check on
>> that URL to make sure that the map service is available. Now if the
>> map service goes down, the website is still up but the maps won't
>> appear, which means the website's functionality is significantly
>> affected. However, it is still up and viewable so doing a check on the
>> website URL still passes.
>>
>
> It sounds to me like you'd want to make the map-check dependent on
> the webserver-check. That would suppress notifications from the
> map-check when it's the webserver that's bombing out. Do you really
> need two notifications when the map-service goes offline?

Sorry, I didn't explain that very well. I have a website check that I
want to have depend on the result of a map service check. The thing is
that I would like two notifications to be sent to my email - one for
the service check that is failing and one for each site that is
affected by the crashed service. That way I would know what is
affected and what needs fixing. Now I should mention at this point (if
it wasn't already blindingly obvious) that I'm by no means a Nagios
master. However, my idea was to have a chain of service dependencies
and then not send notifications for service dependencies in between
that I don't want emails about. There's probably a better way of doing
what I want and in that case, I'm all ... eyes.
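
To make the chain idea concrete, here is a minimal Python sketch of
working out which services are affected when one service fails. It is
not a Nagios feature or API; the service names and the `affected_by`
helper are hypothetical, and the dependency map would have to be
maintained by hand.

```python
from collections import deque


def affected_by(failed, depends_on):
    """Return every service that directly or indirectly depends on `failed`.

    `depends_on` maps a service name to the list of services it needs.
    """
    # Invert the mapping: for each service, which services depend on it?
    dependents = {}
    for service, needs in depends_on.items():
        for needed in needs:
            dependents.setdefault(needed, []).append(service)

    # Breadth-first walk up the chain, starting from the failed service.
    seen = {failed}
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, []):
            if service not in seen:
                seen.add(service)
                queue.append(service)
    # The failed service itself is not reported as "affected".
    return sorted(seen - {failed})


# Hypothetical chain: two sites need the map service, which needs a backend.
DEPENDS_ON = {
    "site-a": ["map-service"],
    "site-b": ["map-service"],
    "map-service": ["tile-backend"],
}
```

Calling `affected_by("map-service", DEPENDS_ON)` lists both sites, which
is the "what is affected" half of the two notifications I described.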

>> Now of course I could just write a script or something to check both
>> URLs and set that as the check command. There is a problem for me with
>> this approach, however, because I have some other instances where a
>> web service depends on other web services.
>
> Define "depend". As I understand the definition, carbon-based lifeforms
> on our fine planet depend on water and sunlight; life cannot function
> properly without them.
> It sounds like you want to make sunlight depend on carbon-based lifeforms,
> because without the life, the sun is rather pointless.
>
> Instead of trying to coerce dependencies to work backwards, I'd sit
> down and think what you want your Nagios installation to do for you,
> and why you would want two services to go critical when one of them
> does. Isn't one notification and one red blob in the UI enough? If
> it isn't, what do you hope to gain from having two noti

Re: [Nagios-users] Dependent service checks don't fail when depended-on service check fails

2009-03-30 Thread Matthias Flacke


Jarrod Moore wrote:
> On Fri, Mar 27, 2009 at 5:43 PM, Matthias Flacke wrote:
>> Jarrod Moore wrote:
>>> On Thu, Mar 26, 2009 at 7:57 PM, Andreas Ericsson  wrote:
>>>> Jarrod Moore wrote:
>>>>> Hello everyone,
>>>>>
>>>>> I have a couple of related questions regarding service dependencies in
>>>>> Nagios and their limitations. I have two service checks (let's call
>>>>> them A and B) and service A depends on service B to function
>>>>> correctly. I want to set Nagios up so that if service B crashes then
>>>>> both services A and B are put into the critical state in Nagios. I've
>>>>> tried using service dependencies in Nagios to represent this behaviour
>>>>> but have yet to be successful. I can only get it to suppress
>>>>> notifications of service A if both services go down.
>>>>>
>>>> This is expected behaviour. If A is truly dependent on B, then A will
>>>> turn into a non-ok state of its own volition rather than as a result
>>>> of any dependency magic. Dependencies are designed as a means of
>>>> suppressing notifications. Otherwise, you would *always* get a
>>>> notification for B first, and a minute or so later from A (actually,
>>>> without the dependency you could get from A first).

>>>>> Is there a way to do what I'm trying to do here? I'd have thought it
>>>>> would be logical that if a service depends on another service and the
>>>>> service depended on dies then all services depending on it would fail
>>>>> their checks as well, but there's probably some scenario where it
>>>>> doesn't work so well. I've had a look through the mailing list
>>>>> archives and found someone had asked a similar question to the
>>>>> nagios-devel list about 2.5 years ago and didn't end up getting an
>>>>> answer, so I thought I might ask whether solutions to this type of
>>>>> problem had been developed since then.
>>>>>
>>>> They haven't. You're using dependencies the wrong way, really. If
>>>> A is truly dependent on B and doesn't go into a non-ok state after
>>>> B has crashed, then your check isn't doing what it's supposed to do,
>>>> or you've misunderstood the relationship somehow.
>>>>
>>>> If you were to explain what the two services actually are, it would
>>>> be easier to point you to a solution that works.
>>>>
>>>> --
>>>> Andreas Ericsson   andreas.erics...@op5.se
>>>> OP5 AB www.op5.se
>>>> Tel: +46 8-230225  Fax: +46 8-230231
>>>>
>>>> Considering the successes of the wars on alcohol, poverty, drugs and
>>>> terror, I think we should give some serious thought to declaring war
>>>> on peace.

>>> Well basically I have a map (similar to Google Maps) embedded in a
>>> website, which hits a URL to retrieve maps. So I have one check using
>>> check_http to check that the website itself is up and another check on
>>> that URL to make sure that the map service is available. Now if the
>>> map service goes down, the website is still up but the maps won't
>>> appear, which means the website's functionality is significantly
>>> affected. However, it is still up and viewable so doing a check on the
>>> website URL still passes.
>>>
>>> Now of course I could just write a script or something to check both
>>> URLs and set that as the check command. There is a problem for me with
>>> this approach, however, because I have some other instances where a
>>> web service depends on other web services. When I want to use these
>>> services in websites, I'd then have to write a check for each script,
>>> each containing every service in the chain that is needed to display
>>> the website correctly. This way of doing things just seems a bit
>>> repetitive to me, especially when I have a check for these web
>>> services already.
>> You can give check_multi a try (http://my-plugin.de/check_multi).
>>
>> It lets you combine multiple checks at the plugin level and has
>> built-in state logic to evaluate the results of these checks.
>> You can reuse the command files by implementing macros.
>>
>> If I understood your setup correctly, the whole result should return
>> CRITICAL if either the main website or the map is not accessible.
>> This is the standard behaviour of check_multi and could be
>> implemented like this:
>>
>> # foo.cmd
>> # call: check_multi -f  -s URLWEB= -s URLMAP=
>> command [ website ] = check_http ... -u $URLWEB$ ...
>> command [ map ] = check_http ... -u $URLMAP$ ...
>>
>> These two statements alone should already behave the way you expect
>> from a simple check_http, only combined. If one of the child checks
>> fails, the whole construct returns WARNING or CRITICAL.
>>
>> If you need more sophisticated return code (RC) determination, you can
>> define it in Perl syntax like this:
>> state [ WARNING ] = website != OK || $website$=~/some evil output/
>> state [ CRITICAL] = website >= WARNING && map != OK
>>
>> Cheers,
>> -Matthias
>>
> 
> Hi Matthias,
> 
> Thanks for the link. I've been checking (no pun intended) out
> check_multi over th