Re: [Nagios-users] How to suppress notifications if ping fails.
For some reason notifications are not being sent from Nagios when I unplug the network cable from one of the hosts being monitored. Nagios recognizes that the host is down, but no notification...

    [1246567497] HOST ALERT: psefilesrv;DOWN;SOFT;1;(No Information Returned From Host Check)
    [1246567528] Warning: Host check command '/usr/lib/nagios/plugins/check_ping -H 10.139.68.39 -w 3000.0,80% -c 5000.0,100% -p 5' for host 'psefilesrv' timed out after 30 seconds
    [1246567528] HOST ALERT: psefilesrv;DOWN;SOFT;2;(No Information Returned From Host Check)
    [1246567559] Warning: Host check command '/usr/lib/nagios/plugins/check_ping -H 10.139.68.39 -w 3000.0,80% -c 5000.0,100% -p 5' for host 'psefilesrv' timed out after 30 seconds
    [1246567559] HOST ALERT: psefilesrv;DOWN;HARD;3;(No Information Returned From Host Check)
    [1246567559] HOST NOTIFICATION: root;psefilesrv;DOWN;host-notify-by-email;(No Information Returned From Host Check)

Here is the host config...

    define host{
            host_name               psefilesrv
            alias                   misc
            address                 10.139.68.39
            use                     generic-host
            check_command           check_alive
            notification_options    d,r
            max_check_attempts      3
            check_interval          1
            notification_interval   1
            notification_period     24x7
            notifications_enabled   1
            parents                 switch-office
            }

And the command...

    define command {
            command_name    check_alive
            command_line    /usr/lib/nagios/plugins/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
            }

Any thoughts as to why this isn't working? Email notifications for other services are working fine. Maybe I need to add check_alive as a service?

Thanks,
JJ

On Wed, Jul 1, 2009 at 3:17 PM, Jeremiah Jester wrote:
> I removed it and notifications are still being sent for all services
> associated with this host. Thoughts?
>
> JJ
>
> On Wed, Jul 1, 2009 at 3:15 PM, Jon Angliss wrote:
>> Jeremiah Jester wrote:
>> > Hmmm, tried this... but not working. Also, I get an error when I specify
>> > 'retry_interval'. Maybe Nagios 3 only?
>> >
>> > define host{
>> >         host_name               psedev2
>> >         alias                   dev
>> >         check_command           check-host-alive
>> >         notification_options    d,r
>> >         max_check_attempts      3
>> >         check_interval          1
>> >         retry_interval          1
>> >         address                 10.139.10.42
>> >         use                     generic-host
>> >         parents                 switch-office
>> >         }
>> >
>> > Error log:
>> > [1246481763] Error: Invalid host object directive 'retry_interval'.
>> > [1246481763] Error: Could not add object property in file
>> > '/etc/nagios2/conf.d/generic-host_nagios2.cfg' on line 143.
>> > [1246481763] Bailing out due to one or more errors encountered in the
>> > configuration files. Run Nagios from the command line with the -v
>> > option to verify your config before restarting. (PID=27490)
>>
>> Yep, that'd be a Nagios 3 option. I'd not realized (or maybe
>> missed) you were using v2. Just remove that option.
>>
>> --
>> Jon Angliss
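The log excerpt in this message can be read mechanically. The following is a minimal Python sketch, not part of Nagios, that pulls the HOST ALERT and HOST NOTIFICATION entries for one host out of nagios.log-style lines. The function name `summarize` and the regular expressions are illustrative assumptions based only on the line layout shown above. Run against the lines posted here, it reports the SOFT, SOFT, HARD progression and a notification logged for contact root, which may suggest looking at the host-notify-by-email command or mail delivery rather than at the host check itself.

```python
import re

# Layout assumed from the excerpt: "[epoch] HOST ALERT: host;STATE;TYPE;attempt;output"
HOST_ALERT = re.compile(
    r"^\[(?P<ts>\d+)\] HOST ALERT: "
    r"(?P<host>[^;]+);(?P<state>[^;]+);(?P<type>SOFT|HARD);(?P<attempt>\d+);"
)
# Layout assumed from the excerpt: "[epoch] HOST NOTIFICATION: contact;host;..."
HOST_NOTIFICATION = re.compile(
    r"^\[(?P<ts>\d+)\] HOST NOTIFICATION: (?P<contact>[^;]+);(?P<host>[^;]+);"
)


def summarize(log_lines, host):
    """Return (alerts, notified_contacts) for one host (hypothetical helper)."""
    alerts, contacts = [], []
    for line in log_lines:
        m = HOST_ALERT.match(line)
        if m and m.group("host") == host:
            alerts.append((m.group("state"), m.group("type"), int(m.group("attempt"))))
            continue
        m = HOST_NOTIFICATION.match(line)
        if m and m.group("host") == host:
            contacts.append(m.group("contact"))
    return alerts, contacts


# Lines copied from the message above (timeout warnings omitted).
SAMPLE = [
    "[1246567497] HOST ALERT: psefilesrv;DOWN;SOFT;1;(No Information Returned From Host Check)",
    "[1246567528] HOST ALERT: psefilesrv;DOWN;SOFT;2;(No Information Returned From Host Check)",
    "[1246567559] HOST ALERT: psefilesrv;DOWN;HARD;3;(No Information Returned From Host Check)",
    "[1246567559] HOST NOTIFICATION: root;psefilesrv;DOWN;host-notify-by-email;(No Information Returned From Host Check)",
]

alerts, contacts = summarize(SAMPLE, "psefilesrv")
print(alerts)
print(contacts)
```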
Re: [Nagios-users] How to suppress notifications if ping fails.
Hmmm, tried this... but not working. Also, I get an error when I specify 'retry_interval'. Maybe Nagios 3 only?

    define host{
            host_name               psedev2
            alias                   dev
            check_command           check-host-alive
            notification_options    d,r
            max_check_attempts      3
            check_interval          1
            retry_interval          1
            address                 10.139.10.42
            use                     generic-host
            parents                 switch-office
            }

Error log:

    [1246481763] Error: Invalid host object directive 'retry_interval'.
    [1246481763] Error: Could not add object property in file '/etc/nagios2/conf.d/generic-host_nagios2.cfg' on line 143.
    [1246481763] Bailing out due to one or more errors encountered in the configuration files. Run Nagios from the command line with the -v option to verify your config before restarting. (PID=27490)

Thanks,
JJ

On Tue, Jun 30, 2009 at 9:08 PM, Jon Angliss wrote:
> Jeremiah Jester wrote:
> > Jon,
> >
> > Thanks for the reply. I've been struggling with this for some days. Can
> > you give me an example of how to define this and in what file? I
> > would appreciate your help.
>
> Files don't really matter. Nagios loads them all and processes them.
> It's up to you how you want to format. Sometimes it's easier to
> group by type (hosts, services, commands, etc), and others by
> location (server room, etc). How you format is up to you. If you
> want, you can even bundle it all in a single file.
>
> Lines ending in \ are wrapped and should appear on a single line in
> your config.
>
> define command {
>         command_name    check-host-alive
>         command_line    $USER1$/check_ping -H $HOSTADDRESS$ \
>                         -w 3000.0,80% -c 5000.0,100% \
>                         -p 5
> }
>
> define command {
>         command_name    check_http
>         command_line    $USER1$/check_http -H $HOSTNAME$
> }
>
> define host {
>         host_name               myhost
>         address                 1.1.1.1
>         check_command           check-host-alive
>         notification_options    d,r
>         check_period            All
>         max_check_attempts      3
>         check_interval          1
>         retry_interval          1
>         contact_groups          mycontacts
> }
>
> define service {
>         host_name               myhost
>         check_command           check_http
>         {.. other stuff here .. }
> }
>
> This will execute check_http against "myhost". check-host-alive
> will be executed every 1 minute. If check-host-alive fails 3 times, the
> host is considered down, and alerts for check_http will be
> suppressed. You should read up on host checks [1], service checks
> [2], and notifications [3].
>
> > Also, I've not seen v3 in the repository but maybe I need to change my
> > sources?
>
> You didn't mention which version of Ubuntu you were using, but
> jaunty has nagios3...
>
> http://packages.ubuntu.com/jaunty/nagios3
>
> [1]: http://nagios.sourceforge.net/docs/3_0/hostchecks.html
> [2]: http://nagios.sourceforge.net/docs/3_0/servicechecks.html
> [3]: http://nagios.sourceforge.net/docs/3_0/notifications.html
>
> --
> Jon Angliss

--
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
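The behaviour Jon describes (three failed host checks make the host count as down, after which service alerts are suppressed) can be sketched as a small model. This is a simplified Python illustration of that description only, not Nagios source code; the function names `host_state` and `service_alert_sent` are hypothetical, and real Nagios applies further conditions (notification periods, options, dependencies) that are ignored here.

```python
MAX_CHECK_ATTEMPTS = 3  # max_check_attempts from the host definition quoted above


def host_state(consecutive_failures, max_attempts=MAX_CHECK_ATTEMPTS):
    """Map a run of failed host checks to (state, state_type) in this simplified model."""
    if consecutive_failures == 0:
        return "UP", "HARD"
    if consecutive_failures < max_attempts:
        # Still retrying: the problem is not yet confirmed.
        return "DOWN", "SOFT"
    return "DOWN", "HARD"


def service_alert_sent(service_failed, consecutive_host_failures):
    """Return True if a service alert would go out under the behaviour described above."""
    host_is_down = host_state(consecutive_host_failures) == ("DOWN", "HARD")
    return service_failed and not host_is_down


for failures in range(4):
    print(failures, host_state(failures), service_alert_sent(True, failures))
```

In this model a failing check_http still alerts while the host is UP or only SOFT DOWN, and stops alerting once the third host check has failed.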
[Nagios-users] How to suppress notifications if ping fails.
Hello,

I have a working nagios2 installation under Ubuntu. I would like to configure Nagios so that when a host cannot be pinged by Nagios, all other service checks related to that server are stopped, so I don't get a slew of notifications for that host. I'm thinking I want a servicedependency directive in my dependencies.cfg file? Does this sound about right?

    define servicedependency{
            hostgroup_name                  servers
            service_description             ping
            dependent_hostgroup_name        servers
            dependent_service_description   *
            execution_failure_criteria      w,c
            notification_failure_criteria   w,u,c
            }

Thanks,
JJ
[Nagios-users] Fwd: "can't shift that many" error
I keep getting these error messages when I post to the list... any ideas why?

-- Forwarded message --
From: <>
Date: 2009/6/29
Subject: [Nagios-users] "can't shift that many" error
To: jeremiahjes...@gmail.com

Error sending the email to nagios-users@lists.sourceforge.net
The recipient's mailbox is full. The email was refused.

The mailbox is full. The email was rejected.
Re: [Nagios-users] "can't shift that many" error
Marc,

Thanks for your reply. I am using the nagios2 package that is part of Ubuntu. It appears to still work, but the errors are worrisome.

    $:/etc/nagios2/scripts$ sudo /etc/init.d/nagios2 stop
     * Stopping nagios2 monitoring daemon nagios2
    shift: 225: can't shift that many
                                                        [ OK ]
    $:/etc/nagios2/scripts$ sudo /etc/init.d/nagios2 start
     * Starting nagios2 monitoring daemon nagios2
    shift: 1: can't shift that many

I have not modified the init script.

Thanks,
JJ

On Mon, Jun 29, 2009 at 2:09 PM, Marc Powell wrote:
>
> On Jun 29, 2009, at 3:44 PM, Jeremiah Jester wrote:
>
> > While attempting to restart the nagios daemon I get the following
> > error. Any thoughts?
> >
> > $ sudo /etc/init.d/nagios2 restart
> >  * Restarting nagios2 monitoring daemon nagios2
> >
> > shift: 225: can't shift that many
> > shift: 1: can't shift that many
>
> shift doesn't appear to be used in the init script in the tarball. Are
> you using some OS or package version of nagios? If so, which? Have you
> modified the init script from its default?
>
> Is nagios running? Does a stop and then start work correctly?
>
> --
> Marc
[Nagios-users] "can't shift that many" error
While attempting to restart the nagios daemon I get the following error. Any thoughts?

    $ sudo /etc/init.d/nagios2 restart
     * Restarting nagios2 monitoring daemon nagios2
    shift: 225: can't shift that many
    shift: 1: can't shift that many

Thanks,
JJ
[Nagios-users] ping dependency
For some reason I'm having a hard time understanding exactly how dependencies work. What I'm trying to do is create a service dependency that will halt notifications for any of a number of servers if that host cannot be pinged. Can anyone assist?

Here is the basic required syntax of a servicedependency definition in Nagios 2.

    define servicedependency{
            host_name
            service_description
            dependent_host_name
            dependent_service_description
            }

Much appreciated,
Jeremiah
[Nagios-users] need help with check_sensors
Hello,

I'm trying to get check_sensors working for Nagios. I already have lm_sensors installed and ran 'sensors' to configure it for my machine. Next, I set up and defined the commands and services in Nagios. However, when I run the following command I get the following error.

    $ /usr/lib/nagios/plugins/check_nrpe -H 10.10.10.44 -c check_temp
    SENSOR CRITICAL - Sensor alarm detected!

    $ tail /var/log/nagios2/nagios.log
    [1245270237] SERVICE ALERT: pse06-back;CHECK TEMP;UNKNOWN;HARD;3;(No output returned from plugin)
    [1245270237] SERVICE NOTIFICATION: root;pse06-back;CHECK TEMP;UNKNOWN;notify-by-email;(No output returned from plugin)

There doesn't seem to be much information on the web or in my Pro Nagios 2.0 book about this plugin. Any help would be appreciated!

JJ
Re: [Nagios-users] nagios2: failover
Scratch that. The process was stuck. Killed it and restarted the Nagios server.

Thanks!
JJ

On Mon, Jun 15, 2009 at 3:22 PM, Jeremiah Jester wrote:
> Oops, looks like I had an error in my configuration file.
>
> Next question: Any idea why check_nagios can't find the running process?
>
> $ /usr/lib/nagios/plugins/check_nrpe -H 10.10.10.41 -c check_nagios
> NAGIOS CRITICAL: Could not locate a running Nagios process!
>
> Nothing in /var/log/nagios2/nagios.log about this message. I can verify that
> nrpe and nagios are working, as the following command returns successfully.
>
> $ /usr/lib/nagios/plugins/check_nrpe -H 10.10.10.41 -c check_load
> OK - load average: 1.00, 1.04, 1.05|load1=1.000;15.000;30.000;0;
> load5=1.040;10.000;25.000;0; load15=1.050;5.000;20.000;0;
>
> Thanks,
> JJ
Re: [Nagios-users] nagios2: failover
Oops, looks like I had an error in my configuration file.

Next question: Any idea why check_nagios can't find the running process?

    $ /usr/lib/nagios/plugins/check_nrpe -H 10.10.10.41 -c check_nagios
    NAGIOS CRITICAL: Could not locate a running Nagios process!

Nothing in /var/log/nagios2/nagios.log about this message. I can verify that nrpe and nagios are working, as the following command returns successfully.

    $ /usr/lib/nagios/plugins/check_nrpe -H 10.10.10.41 -c check_load
    OK - load average: 1.00, 1.04, 1.05|load1=1.000;15.000;30.000;0; load5=1.040;10.000;25.000;0; load15=1.050;5.000;20.000;0;

Thanks,
JJ

On Mon, Jun 15, 2009 at 1:10 PM, Jeremiah Jester wrote:
> Hello Nagios Users,
>
> I'm attempting to set up a second Nagios server to do failover monitoring
> in case my primary Nagios server goes down.
>
> Running into an odd problem in talking to the nrpe daemon on the master
> server.
>
> When I run the following command, nrpe does not allow me to connect from
> the new host even though it is specified in the master Nagios server's
> nrpe.cfg file.
>
> SLAVE$ /usr/lib/nagios/plugins/check_nrpe -H 10.10.10.41 -c check_load
> Connection refused by host
>
> However, when I run the same command for another host, it returns successfully.
>
> SLAVE:~$ /usr/lib/nagios/plugins/check_nrpe -H 10.10.10.43 -c check_load
> OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0;
> load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;
>
> **Note that the IP for the Nagios failover server IS listed in the nrpe.cfg
> allowed_hosts directive and the nrpe daemon has been restarted.
>
> Any insight on this issue? I am running Nagios 2.11.
>
> Thanks,
> JJ
[Nagios-users] nagios2: failover
Hello Nagios Users,

I'm attempting to set up a second Nagios server to do failover monitoring in case my primary Nagios server goes down.

Running into an odd problem in talking to the nrpe daemon on the master server.

When I run the following command, nrpe does not allow me to connect from the new host even though it is specified in the master Nagios server's nrpe.cfg file.

    SLAVE$ /usr/lib/nagios/plugins/check_nrpe -H 10.10.10.41 -c check_load
    Connection refused by host

However, when I run the same command for another host, it returns successfully.

    SLAVE:~$ /usr/lib/nagios/plugins/check_nrpe -H 10.10.10.43 -c check_load
    OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;

**Note that the IP for the Nagios failover server IS listed in the nrpe.cfg allowed_hosts directive and the nrpe daemon has been restarted.

Any insight on this issue? I am running Nagios 2.11.

Thanks,
JJ
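When an allowed_hosts entry is in doubt, it can help to check the file mechanically rather than by eye. Below is a minimal Python sketch that reads nrpe.cfg-style text and reports whether a given address appears in allowed_hosts. The helper name `allowed_hosts`, the sample file contents, and the failover address 10.10.10.40 are all hypothetical; the sketch only does exact string matching on a comma-separated list and simply keeps the last allowed_hosts line it sees, which is a simplification.

```python
def allowed_hosts(cfg_text):
    """Return the entries of the last allowed_hosts= line in nrpe.cfg-style text."""
    hosts = []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip comments and anything that is not a key=value pair.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "allowed_hosts":
            hosts = [h.strip() for h in value.split(",") if h.strip()]
    return hosts


# Hypothetical excerpt; the real file on the master server is not shown in the thread.
SAMPLE_CFG = """
# illustrative nrpe.cfg excerpt
server_port=5666
allowed_hosts=127.0.0.1, 10.10.10.40
"""

FAILOVER_IP = "10.10.10.40"  # assumed address of the failover server
print(FAILOVER_IP in allowed_hosts(SAMPLE_CFG))
```

A stray typo in the address or in the directive name would make the membership test fail, which is the kind of configuration error reported later in this thread.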
[Nagios-users] funny disk space message
Anyone know why I'm getting this weird disk space message?

> * Nagios *
>
> Notification Type: PROBLEM
>
> Service: DISK SPACE
> Host: prod
> Address: (ip)
> State: WARNING
>
> Date/Time: Mon Jun 8 23:52:12 UTC 2009
>
> Additional Info:
>
> DISK WARNING - free space: / 23146 MB (32 0node=99
[Nagios-users] truncated nagios email?
When receiving an email notification of disk space full, the 'Additional Info' line seems to be truncated near "(32 0node=99". Any ideas why this is happening?

Thanks,
JJ

    * Nagios *

    Notification Type: PROBLEM

    Service: DISK SPACE
    Host: prod
    Address: 10.10.10.47
    State: CRITICAL

    Date/Time: Wed Jun 3 22:26:34 UTC 2009

    Additional Info:

    DISK CRITICAL - free space: / 23155 MB (32 0node=99
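One plausible explanation, not confirmed anywhere in this thread: if the plugin output was originally something like "(32% inode=99%)" (an assumption; only the mangled form is shown) and the notification command passes it through printf as part of the format string, then "% i" is read as a conversion (space flag plus signed integer). The shell's printf substitutes 0 for a missing numeric argument, which would produce " 0". Python's % operator follows the same C-style conversion rules, so the effect can be reproduced in a short sketch:

```python
# Assumed original fragment of the check_disk output (an assumption; the
# thread only shows the mangled form).
fragment = "(32% inode=99"

# Used as a format string with a numeric argument of 0, "% i" is parsed as a
# conversion: space flag + signed integer.
mangled = fragment % 0
print(mangled)

# Doubling each "%" makes the text safe to embed in a format string.
escaped = "(32% inode=99%)".replace("%", "%%")
restored = escaped % ()
print(restored)
```

The first print reproduces the "(32 0node=99" seen in the email. If this is indeed the cause, keeping the plugin output out of the printf format string (or escaping its percent signs) would avoid the mangling.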
[Nagios-users] disk warning messages
Hello,

I'm trying to configure Nagios to include the disk full percentage in the subject line of the email notification. For example:

    Subject: [Nagios] CRITICAL 99% Full on Server1

My commands currently look something like this for checking disk space. Anyone have any suggestions for me?

-JJ

Check Disk Commands:

    define command{
            command_name    check_local_disk
            command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
            }

    define command{
            command_name    check_remote_disk
            command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_remote_disk
            }

Email Commands:

    # 'host-notify-by-email' command definition
    define command{
            command_name    host-notify-by-email
            command_line    /usr/bin/printf "Subject: [Nagios] $SERVICESTATE$: $SERVICEDESC$ $HOSTNAME$\nFrom: nag...@$hostname$.mascorp.com\n\n * Nagios *\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /usr/sbin/exim "Host $HOSTSTATE$ alert for $HOSTNAME$!" $CONTACTEMAIL$
            }

Service:

    define service{
            use                     local-service   ; Name of service template to use
            use                     generic-service ; Name of service template
            hostgroup_name          servers
            service_description     DISK SPACE
            is_volatile             0
            check_period            24x7
            max_check_attempts      3
            normal_check_interval   3
            retry_check_interval    1
            contact_groups          admins
            notification_interval   1440
            notification_period     24x7
            notification_options    w,u,c,r
            check_command           check_nrpe!check_remote_disk!4%!2%!
            }
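One way to get at the percentage is to derive it from the plugin output before the subject is built. The following is a minimal Python sketch of that idea only; the function name `disk_subject`, the assumed output shape "free space: / N MB (P% inode=Q%)", and the sample values are illustrative assumptions, and how such a helper would be hooked into the notification command is not shown here.

```python
import re

# Assumes check_disk-style output in which the first parenthesised number is
# the percentage of FREE space, e.g. "... free space: / 512 MB (1% inode=99%)".
FREE_PCT = re.compile(r"\((\d+)%")


def disk_subject(state, host, plugin_output):
    """Build a subject line with the percentage used, if it can be parsed."""
    m = FREE_PCT.search(plugin_output)
    if m is None:
        # Fall back to a subject without a percentage.
        return f"[Nagios] {state} on {host}"
    used = 100 - int(m.group(1))
    return f"[Nagios] {state} {used}% Full on {host}"


# Hypothetical sample output, chosen so that 1% free corresponds to 99% full.
subject = disk_subject(
    "CRITICAL", "Server1", "DISK CRITICAL - free space: / 512 MB (1% inode=99%)"
)
print(subject)
```

With the sample values this prints the subject format asked for above, "[Nagios] CRITICAL 99% Full on Server1".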