date:20060823

[Nagios-users] GUI front in for nagios.

2006-08-23 Thread Malcolm Frazier

Hi,

Does a client exist that runs on Linux, Windows, and Macs that can poll 
a Nagios server and display information about the servers, rather than 
using the web interface?


-
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

[Nagios-users] NSCA only accepts first check

2006-08-23 Thread Andrew M. Lyons

Hi all,

I'm trying to configure nsca, and I've run into a problem:  NSCA only 
accepts the first passive check that is sent to it, and all subsequent 
checks fail.

first check:
[18:05:02]$ echo -e "vmp161.vampire\tMyrinet Connectivity\t0\tBLAH\n" | 
./src/send_nsca -H monitor -c nsca.cfg
1 data packet(s) sent to host successfully.

second try:
[18:11:29]$ echo -e "vmp161.vampire\tMyrinet Connectivity\t0\tBLAH\n" | 
./src/send_nsca -H monitor -c nsca.cfg
Error: Timeout after 10 seconds

After restarting the nsca daemon on the monitoring host, it will again 
accept a single check, and all subsequent checks will time out.

The nsca.cfg on the remote host has only a password entry.  Here are the 
relevant bits from the nsca.cfg on the nagios server (monitor):

pid_file=/var/run/nsca.pid
server_port=
server_address=
nsca_user=nagios
nsca_group=nagios
debug=1
command_file=/var/log/nagios/rw/nagios.cmd
alternate_dump_file=/var/log/nagios/rw/nsca.dump
aggregate_writes=0
append_to_file=0
max_packet_age=30
password=
decryption_method=1

Any help would be appreciated.  Thanks!

Best,

Andrew


Re: [Nagios-users] Timeouts

2006-08-23 Thread Marc Powell



> -Original Message-
> From: [EMAIL PROTECTED] [mailto:nagios-users-
> [EMAIL PROTECTED] On Behalf Of Dirk H. Schulz
> Sent: Wednesday, August 23, 2006 4:43 AM
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] Timeouts
> 
> Hi folks,
> 
> I have a problem concerning timeouts.
> 
> First the basics: I run Nagios 2.3.1 on Debian Sarge stable.
> 
> I have configured "service_check_timeout=60", but in certain
> circumstances (e.g. slow dns) I get the error: "Plugin timed out after
> 10 seconds" or "Socket timed out after 10 seconds".
> 
> Is there another timeout value I have to configure to get rid of this
> 10 seconds threshold?

Yes. The service_check_timeout in nagios.cfg is a last-resort timeout. If
a plugin hasn't terminated itself in that period of time then nagios
will kill it. All of the standard plugins (I believe) can be passed a
timeout value on their command line, usually via -t. If none is passed
they'll use whatever value is hard coded (usually 10 seconds). You can
use '--help' for the plugins you use to see the timeout parameters.
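To illustrate the two layers of timeout described above, here is a rough sketch, not how Nagios itself is implemented. The helper names are hypothetical; the only things taken from the thread are the -t option and the idea of an outer limit that kills a plugin which has not stopped on its own.

```python
import subprocess


def with_plugin_timeout(command, seconds):
    """Return a copy of the plugin command line with '-t <seconds>' appended."""
    return list(command) + ["-t", str(seconds)]


def run_with_hard_limit(command, hard_limit):
    """Run a command and kill it if it exceeds hard_limit seconds.

    Returns (exit_code, output); exit_code is None if the command was killed.
    """
    try:
        done = subprocess.run(command, capture_output=True, text=True,
                              timeout=hard_limit)
    except subprocess.TimeoutExpired:
        return None, "killed after %s seconds" % hard_limit
    return done.returncode, done.stdout.strip()
```

With this layering, a plugin given `-t 30` would normally report its own timeout well before an outer limit of 60 seconds is reached; the outer limit only matters if the plugin hangs.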
 
> I know that I should work on my dns first, but I want to understand
> what decisions Nagios makes there.

Or use IPs instead of names so you don't rely on an external service
that can possibly fail. ;)

--
marc


[Nagios-users] servicegroup shortcut in servicedependency doesn't quite work that way in the code?

2006-08-23 Thread Michael Durket

If one attempts to specify the following in Nagios:

define servicedependency {
    servicegroup_name               X
    dependent_host_name             Y
    dependent_service_description   Z
    execution_failure_criteria      w,u,c
    notification_failure_criteria   w,u,c
}

it generates the error:

Error: Could not expand master hostgroups and/or hosts specified in service dependency (config file 'config.cfg', starting on line 

If I'm reading the code correctly in xodtemplate.c, it looks like the code
starting at line 4119 should consider the possibility that no hosts/hostgroups
are specified even though a dependent host name is, and allow a fall-through
to the code at line 4179, which appears to be designed to handle just this case.


Re: [Nagios-users] ocsp slows nagios a great deal

2006-08-23 Thread Fred

Not sure if this was discussed (I didn't find later threads), but I would
suggest that you batch your send_nsca requests.  Realize that *every*
transaction that nagios does invokes OCSP if it is defined.  This means
there is a fork, exec, and then whatever that app does.  If you have a perl
ocsp script, for example, perl has to compile that script, execute your
code, and most likely then fork/exec send_nsca.

Send_nsca has the ability to accept batch input.  I streamline my ocsp
script so that the data is batched up in a file that will be sent at some
later point using send_nsca.  Given that you have a good number of
checks, nagios is making the ocsp call very frequently.  You can use
that to your advantage.

Each time you run your script:

1) stat the queue/batch file (if it exists)
2) flock the batch file (if it exists)
3) If it is older than an acceptable amount of time (make this
a configurable parameter), set a flag to remember you will
be pushing the data on this iteration.
4) If the file is larger than an acceptable size, set your
flag again.

5) write your ocsp args to the end of your batch queue file.
6) if the flag is set, run send_nsca and pipe your batch queue file
into it.
7) if you sent the batch, truncate the file to zero length
8) unlock it
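The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not Fred's actual script: the function names, thresholds and queue path are invented, the `-H` and `-c` options are the ones shown for send_nsca elsewhere in this digest, and resetting the file's modification time so that it tracks the oldest queued line is a choice of this sketch. It relies on `fcntl`, so it is Unix-only.

```python
import fcntl
import os
import subprocess
import time


def send_batch(batch, host="monitor", config="nsca.cfg", binary="send_nsca"):
    """Pipe a whole batch of result lines into a single send_nsca run."""
    subprocess.run([binary, "-H", host, "-c", config], input=batch, check=True)


def queue_result(line, queue_path, sender=send_batch,
                 max_age=30.0, max_bytes=4096, now=None):
    """Append one result line; send the batch if it is old or large enough.

    Returns True if the batch was sent on this call.
    """
    now = time.time() if now is None else now
    with open(queue_path, "a+b") as queue:          # creates the file if missing
        fcntl.flock(queue, fcntl.LOCK_EX)           # step 2: lock (taken first here)
        info = os.fstat(queue.fileno())             # step 1: stat
        # mtime is held at the time of the oldest queued line (see os.utime).
        oldest = info.st_mtime if info.st_size > 0 else now
        flush_now = (now - oldest >= max_age        # step 3: too old
                     or info.st_size >= max_bytes)  # step 4: too large
        queue.write(line.encode() + b"\n")          # step 5: append
        queue.flush()
        if flush_now:
            queue.seek(0)
            sender(queue.read())                    # step 6: one send_nsca run
            queue.truncate(0)                       # step 7: empty the queue
        else:
            os.utime(queue_path, (oldest, oldest))  # keep the oldest-line time
    # step 8: the lock is released when the file is closed above
    return flush_now
```

If the sender raises, the file is not truncated, so the queued lines are retried on the next call.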

You will dramatically cut down on the send_nsca fork/exec's and
you will also cut down on the network traffic and system noise
that you create as a result of making so many connections.

Go back over your code and streamline it.  An alternate implementation
could be to start a perl daemon that does 1-8 and reads from a FIFO,
and simply make your OCSP routine an "echo $@ >>fifo".
You could also then have the perl program wake up more regularly and
flush the queue rather than having to rely on the next OCSP request
to come through.  (You could also use a cron job or a nagios plug-in
to periodically flush the queue.  To do that, make the ocsp command both
plug-in and ocsp compatible: call it with a zero timeout to force the
flush, and allow null args, which skip adding anything to the queue
when it is called as a plug-in.)

-FredC

 



- Original Message 
From: loren jan wilson <[EMAIL PROTECTED]>
To: nagios-users@lists.sourceforge.net
Sent: Friday, August 11, 2006 10:38:31 AM
Subject: [Nagios-users] ocsp slows nagios a great deal

dear nagios users,

I'm in the process of trying to set up a distributed nagios
environment monitoring about 9,000 services on 2,500 hosts.
i'm using Sunfire V210 servers running Solaris 10.

i've found that the distributed servers which monitor the active
services can run about 1700 checks every 5 minutes if ocsp isn't
enabled, but once I enable ocsp, the number of active checks I can do
goes WAY down. here's a breakdown:

- ocsp disabled: 1700 checks / 5 min.

- ocsp command set to /bin/true: 1200 checks / 5 min.

- ocsp command set to a perl program that forks, then pipes output to
  send_nsca: 800 checks / 5 min. 

- ocsp command set to a shell program that pipes output to
  send_nsca: 500 checks / 5 min.

What's the deal? I've followed the instructions in the "performance
tuning" place in the manual, but nothing seems to help much, and I
don't know what else to check. Resources on the machines are not being
fully utilized: there's about 30% free cpu at any given time, and
plenty of RAM (only 500 MB used of 2 GB). Any help would be much
appreciated!

Solaris 10 is fully patched with recommended updates from last week.
I'm running Nagios 2.5 and it's configured like this:

--with-perlcache \
--enable-embedded-perl \
--enable-nanosleep \
--with-gd-inc=$GD_INC_PATH \
--with-gd-lib=$GD_LIB_PATH



Thanks, 
Loren


-- 
loren jan wilson
network engineering, uchicago.edu
1155 rm. 327 ; 773/702-8189


Re: [Nagios-users] submitting external commands remotely

2006-08-23 Thread Peter Krüpl





Hi,

I doubt you can use sockets over NFS; someone correct me if I am wrong.
What you could do is execute the script on the nagios host via ssh.

If you normally do this on the nagios host:

    scheduledowntine.sh host ServerA 13:00 14:00

you could do this on a remote machine:

    ssh nagioshost "scheduledowntine.sh host ServerA 13:00 14:00"

You would be prompted for a password every time you do this, but you
could use rsa/dsa keys; then there would be no need to enter a password
each time.
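If the recurring task on the remote machine is itself a script, the ssh call can be assembled programmatically. A small sketch, assuming key-based login is already set up; the helper name `remote_command` is hypothetical, and `BatchMode=yes` is used so that ssh fails instead of prompting when no key is accepted.

```python
import shlex


def remote_command(host, script, *args):
    """Return the argv that runs `script args...` on `host` through ssh."""
    # Quote each argument so the remote shell sees it as a single word.
    remote = shlex.join([script, *args])
    return ["ssh", "-o", "BatchMode=yes", host, remote]


# The example from this message, as an argument list.
argv = remote_command("nagioshost", "scheduledowntine.sh",
                      "host", "ServerA", "13:00", "14:00")
```

The resulting list can be handed to `subprocess.run(argv, check=True)`; quoting matters once a service description such as "DISK /" contains a space.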

    Safe and simple :)

    Cheers,
    Peter

Iphtashu Fitz wrote:
> I have a handful of scripts that I've written that submit commands to our
> Nagios monitor remotely, so I'm pretty comfortable with working with
> external commands.  I now need to be able to submit external commands from
> an external host.  In a nutshell I want to schedule downtime of specific
> services based on recurring tasks that are run on those remote machines.
> I've written a script that will schedule service downtime and it works fine
> when I run it on the nagios server.  I tried exporting the nagios/var
> directory via NFS and mounted it from one of the remote machines so that
> they could write to nagios.cmd.  But when I run the script on these remote
> machines Nagios never responds to those external commands.  The script
> appears to successfully write to the NFS mounted nagios.cmd but Nagios
> never gets the message, and it never shows up in nagios.log.
>
> Is it possible for a remote machine to communicate with Nagios this way
> (NFS mounted nagios.cmd), or is there a better approach for submitting
> external commands from an external host?


[Nagios-users] submitting external commands remotely

2006-08-23 Thread Iphtashu Fitz

I have a handful of scripts that I've written that submit commands to our
Nagios monitor remotely, so I'm pretty comfortable with working with
external commands.  I now need to be able to submit external commands from
an external host.  In a nutshell I want to schedule downtime of specific
services based on recurring tasks that are run on those remote machines.
I've written a script that will schedule service downtime and it works fine
when I run it on the nagios server.  I tried exporting the nagios/var
directory via NFS and mounted it from one of the remote machines so that
they could write to nagios.cmd.  But when I run the script on these remote
machines Nagios never responds to those external commands.  The script
appears to successfully write to the NFS mounted nagios.cmd but Nagios
never gets the message, and it never shows up in nagios.log.

Is it possible for a remote machine to communicate with Nagios this way
(NFS mounted nagios.cmd), or is there a better approach for submitting
external commands from an external host?

Re: [Nagios-users] escalation/notification question

2006-08-23 Thread Jeff Williams

Do I have to explicitly say in the escalations that warnings should stop at 1 notification, rather than just say only critical and recovery can go from 2 to 5 like I had before? Do I have to put something like this?

define serviceescalation{
    host_name               viper
    service_description     DISK /
    first_notification      1
    last_notification       1
    notification_interval   0
    contact_groups          admins
    escalation_options      w
}

On 8/23/06, Michael Koprowski <[EMAIL PROTECTED]> wrote:

> I may be wrong, but my interpretation is that you will receive warning,
> unknown, critical, and recovery alerts if these changes are reported on
> viper's disk, and that you will be notified every 5 minutes.
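One way to reason about the two readings is to model how an escalation is matched against a notification. The sketch below is a simplified model, not Nagios code. It assumes that an escalation applies only when the notification number falls in its first/last range (with 0 as "no upper bound") and the state is listed in escalation_options, that the service's own notification_interval applies when nothing matches, and that the smallest interval wins when several match. All names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Escalation:
    first: int      # first_notification
    last: int       # last_notification, 0 means no upper bound
    interval: int   # notification_interval in minutes
    options: str    # escalation_options, e.g. "c,r"


def matching(escalations, number, state):
    """Escalations that apply to notification `number` in `state`."""
    return [e for e in escalations
            if e.first <= number
            and (e.last == 0 or number <= e.last)
            and state in e.options.split(",")]


def effective_interval(service_interval, escalations, number, state):
    """Interval used after notification `number` under this simplified model."""
    hits = matching(escalations, number, state)
    if not hits:
        return service_interval          # fall back to the service definition
    return min(e.interval for e in hits)


# The three escalations from the thread, all limited to critical and recovery.
thread_escalations = [
    Escalation(2, 2, 5, "c,r"),
    Escalation(3, 4, 5, "c,r"),
    Escalation(5, 0, 0, "c,r"),
]
```

Under these assumptions a warning never matches any of the three escalations, so the service's own 5-minute interval keeps applying, which would be consistent with the repeated warning notifications; an extra escalation with escalation_options w and interval 0 would match the first warning.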

Re: [Nagios-users] escalation/notification question

2006-08-23 Thread Jeff Williams

If I have a host called viper with the following definitions, I'm wondering why the group admins gets warning notifications for the service "DISK /" after the first notification. To me, these definitions say that admins should get no more than 5 notifications and no warning notifications after the first one. Am I wrong in thinking that? In testing this, the admins group got at least 8 warning notifications.

host definition

define host{
    use                     generic-host
    host_name               viper
    alias                   viper
    address                 127.0.0.1
    parents                 localhost
    check_command           check-host-alive
    notification_interval   0
    notification_options    d,r
    notification_period     never
    contact_groups          admins
}

service definition

define service{
    use                     generic-service
    host_name               viper
    service_description     DISK /
    is_volatile             0
    check_period            24x7
    retry_check_interval    1
    contact_groups          admins
    notification_interval   5
    notification_period     24x7
    notification_options    w,u,c,r
    check_command           check_nrpe!check_disk1
}

escalation definition

define serviceescalation{
    host_name               viper
    service_description     DISK /
    first_notification      2
    last_notification       2
    notification_interval   5
    contact_groups          admins
    escalation_options      c,r
}

define serviceescalation{
    host_name               viper
    service_description     DISK /
    first_notification      3
    last_notification       4
    notification_interval   5
    contact_groups          admins
    escalation_options      c,r
}

define serviceescalation{
    host_name               viper
    service_description     DISK /
    first_notification      5
    last_notification       0
    notification_interval   0
    contact_groups          admins
    escalation_options      c,r
}

Contactgroup:

define contactgroup{
    contactgroup_name       admins
    alias                   Nagios Administrators
    members                 nagios
}

Contacts:

define contact{
    contact_name                    nagios
    alias                           Nagios Admin
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,r
    service_notification_commands   notify-by-email
    host_notification_commands      host-notify-by-email
    email                           [EMAIL PROTECTED]
}

The group "admins" should get warnings, but since the escalation file says only go past original notification for critical and recovery, shouldn't I just get 1 warning notification or 2 at most?

Also, here are the generic-host and generic-service definitions. I forgot to include those before as well.

define host{
    name                            generic-host    ; The name of this host template
    check_interval                  0
    max_check_attempts              2
    notifications_enabled           1   ; Host notifications are enabled
    event_handler_enabled           1   ; Host event handler is enabled
    flap_detection_enabled          1   ; Flap detection is enabled
    failure_prediction_enabled      1   ; Failure prediction is enabled
    process_perf_data               1   ; Process performance data
    retain_status_information       1   ; Retain status information across program restarts
    retain_nonstatus_information    1   ; Retain non-status information across program restarts
    register                        0   ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}

define service{
    name                            generic-service ; The 'name' of this service template
    max_check_attempts              2
    normal_check_interval           5
    active_checks_enabled           1   ; Active service checks are enabled
    passive_checks_enabled          1   ; Passive service checks are enabled/accepted
    parallelize_check               1   ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service             1   ; We should obsess over this service (if necessary)
    check_freshness                 0   ; Default is to NOT check service 'freshness'
    notifications_enabled           1   ; Service notifications

Re: [Nagios-users] Web Event Log View can crash IE?

2006-08-23 Thread Carl Friend

   Giles Coochey writes:

> I just wondered whether I'm alone in having the Web "Event Log"
> sometimes appear to make IE stop responding (need to do forced
> close of IE).

   How big are your logs?  The ones I have here are fairly large
(about 3.6 megs/day) and sometimes cause IE to slow to an absolute
crawl.  I've never had to forcibly shut it down, but have seen
several instances where it went unresponsive for 30 seconds or so.

+--+---+
| Carl Richard Friend (UNIX Sysadmin)  | Natick, Massachusetts |
| Minicomputer Collector / Enthusiast  |01760-2098 |
| mailto:[EMAIL PROTECTED] +---+
| http://users.rcn.com/crfriend/museum |  ICBM: 42:18N 71:21W  |
+--+---+


[Nagios-users] Web Event Log View can crash IE?

2006-08-23 Thread Giles Coochey

Hello List,

I just wondered whether I'm alone in having the Web "Event Log"
sometimes appear to make IE stop responding (need to do forced close of
IE).

This is most evident when attempting to view an earlier log than the one
being displayed, but often occurs on the current event log too.

I don't get the problem with Firefox, but I do get it with IE (on
multiple client systems).

I've had the problem with Nagios 2.4 & now 2.5.

Thanks

Giles


Re: [Nagios-users] Problems with the nuvola theme

2006-08-23 Thread Stefano Mosconi

Got it!

I had an old installation of nagios (ver. 1.3) still active on the server, and cgipath in config.js was pointing to /nagios/cgi-bin/ instead of /nagios2/cgi-bin/. Thus I was using old cgi scripts that didn't have the link to common.css.

Sorry for bothering.

Stezz

On 8/23/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> hi
>
> what is in your index.html file
> in mine is like





Nagios