Re: [Nagios-users] check_mbean_collector cannot create socket
On what IP address is port 5566 listening? Localhost is usually defined as 127.0.0.1, and most programs do not listen on the loopback device. Also, for more accurate testing, always test as the nagios user, not as root.

On 07/10/12 19:15, Marek Grinberg wrote:

Hello,

I have installed check_mbean_collector into /usr/lib/nagios/libexec, and tried it:

    # ./check_mbean_collector -H localhost -p 5566 -m jboss.system:type=ServerInfo -a ActiveThreadCount -w 200 -c 400
    ERROR: could not create socket

What have I missed?

Marek

--
Don't let slow site performance ruin your business. Deploy New Relic APM
Deploy New Relic app performance management and know exactly what is happening inside your Ruby, Python, PHP, Java, and .NET app
Try New Relic at no cost today and get our sweet Data Nerd shirt too!
http://p.sf.net/sfu/newrelic-dev2dev
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
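The question of which address port 5566 is listening on can be checked with a quick TCP connect probe. The following is only a minimal sketch and is not part of check_mbean_collector. The function name `accepting_addresses` and the candidate addresses are illustrative assumptions. It shows whether anything accepts TCP connections on each address, not whether it is the expected service.

```python
import socket


def accepting_addresses(port, candidates, timeout=2.0):
    """Return the candidate addresses that accept a TCP connection on `port`."""
    reachable = []
    for host in candidates:
        try:
            # create_connection resolves the name and performs the TCP handshake
            with socket.create_connection((host, port), timeout=timeout):
                reachable.append(host)
        except OSError:
            # Refused, timed out, or unresolvable: treat as "not listening here"
            pass
    return reachable
```

A call such as `accepting_addresses(5566, ["127.0.0.1", "192.0.2.10"])`, with the placeholder 192.0.2.10 replaced by the server's real interface address, would show whether the collector answers on the loopback address, the external address, both, or neither. Running it as the nagios user keeps the test close to how the plugin is actually executed.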
Re: [Nagios-users] check_jboss_status errors off
If you don't specify the port, it will use 8080 by default. Is that correct in your case?

Regards,

On Sun, Oct 7, 2012 at 3:23 PM, Marek Grinberg mag...@telia.com wrote:

Hello,

After copying check_jboss_status.pl into /usr/local/nagios/libexec, I tried it:

    # ./check_jboss_status.pl -H tpstest.domain.com
    Unable to load status: Connection refused

What have I missed?

Marek
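A "Connection refused" error generally means the host answered but nothing accepted the connection on that port, which differs from a timeout or a name-resolution failure. The sketch below separates those cases. It is an illustration only, not part of check_jboss_status.pl; the function name `classify_connect` and the result labels are assumptions made for this example.

```python
import socket


def classify_connect(host, port=8080, timeout=3.0):
    """Classify a TCP connection attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on this address and port
        return "refused"
    except socket.timeout:
        # No answer at all, often a firewall silently dropping packets
        return "timeout"
    except OSError:
        # Name resolution failure, unreachable network, and similar problems
        return "error"
```

If `classify_connect("tpstest.domain.com")` reports "refused" while the same call with the port JBoss was actually configured for reports "open", the plugin simply needs the right port option.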
[Nagios-users] check services in the future
Hello,

I have Nagios 3.2.3. Sometimes Nagios schedules the next check of some of my services (not all) far in the future. For example:

    Next Scheduled Check: 04-09-2014 08:26:08

even though the check interval is:

    Normal Check Interval = 0h 5m 0s

This usually happens with services that have been down for more than one day. My workaround for this problem:

- stop Nagios;
- remove the files objects.cache and retention.dat;
- start Nagios.

The date on the server is correct, and the problem always occurs in different services. OS: Debian; Nagios was installed from source.

Why does this happen?

Thank you.
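One way to find the affected services is to scan the retention or status data for `next_check` timestamps that lie further ahead than expected. This is a rough sketch under stated assumptions: it assumes the file consists of `service { ... }` or `servicestatus { ... }` blocks of `key=value` lines containing `host_name`, `service_description` and an epoch-seconds `next_check`, which should be verified against the actual files. The function name and the one-hour threshold are illustrative.

```python
import time


def far_future_checks(text, now=None, max_ahead=3600):
    """Return (host, service, next_check) for service checks scheduled
    more than `max_ahead` seconds after `now`."""
    now = time.time() if now is None else now
    suspicious = []
    fields = None  # None while outside a service block
    for raw in text.splitlines():
        line = raw.strip()
        if line in ("service {", "servicestatus {"):
            fields = {}
        elif line == "}" and fields is not None:
            try:
                next_check = int(fields.get("next_check", "0"))
            except ValueError:
                next_check = 0  # unparsable value, ignore this block
            if next_check > now + max_ahead:
                suspicious.append((fields.get("host_name"),
                                   fields.get("service_description"),
                                   next_check))
            fields = None
        elif fields is not None and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value
    return suspicious
```

Feeding it the contents of retention.dat before deleting the file would at least record which services were affected and how far ahead they had been scheduled.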
[Nagios-users] Repeating event handler in hard service state....
Hi there list,

As I understand it, service-specific event handlers are triggered for every state change while a service is in a SOFT state, and once when a service enters a HARD state.

The problem is that I have a service (an IPSEC tunnel) which depends on an outside source. If the outside party fails (whenever they do updates, once a week), I actually kill the tunnel when attempting a restart. To solve this I would like to keep retrying the restart.

I could do this in a SOFT state by increasing the max check attempts to a higher number, but then I would never get a notification. Letting the service go to a HARD state (to get the notification) would limit the restart attempt to just the one event when the service enters the HARD state.

I think there are 2 possibilities:

- Keep the service in a SOFT state and send out a notification on attempt X.
- Let the service go to a HARD state but keep on trying the restart.

Is there any way I could achieve this? Or am I completely missing something?

Peter
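The first possibility (stay in a SOFT state and notify on attempt X) could be handled inside the event handler itself. The sketch below only shows the decision logic and assumes the handler is called with the Nagios macros `$SERVICESTATE$`, `$SERVICESTATETYPE$` and `$SERVICEATTEMPT$` as arguments. The function name `plan_actions`, the action labels and the default attempt number are hypothetical, and the actual restart and notification commands are left out.

```python
def plan_actions(state, state_type, attempt, notify_attempt=5):
    """Decide what the handler should do for one invocation.

    state       -- value of $SERVICESTATE$ (OK, WARNING, CRITICAL, UNKNOWN)
    state_type  -- value of $SERVICESTATETYPE$ (SOFT or HARD)
    attempt     -- value of $SERVICEATTEMPT$ as an integer
    """
    actions = []
    if state != "CRITICAL":
        return actions  # nothing to repair
    # Retry the restart on every CRITICAL result the handler is called for
    actions.append("restart")
    # Send a custom notification once, on the chosen SOFT attempt
    if state_type == "SOFT" and attempt == notify_attempt:
        actions.append("notify")
    return actions
```

With the maximum number of check attempts set well above `notify_attempt`, the service would stay SOFT while the handler keeps retrying, and the handler rather than Nagios would send the alert.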
Re: [Nagios-users] check_mbean_collector cannot create socket
Sorry for bothering the list. As of today a colleague of mine is assigned to work with Nagios.

Marek

Assaf Flatto wrote on 2012-10-08 12:39:

On what IP address is port 5566 listening? Localhost is usually defined as 127.0.0.1, and most programs do not listen on the loopback device. Also, for more accurate testing, always test as the nagios user, not as root.
Re: [Nagios-users] check_jboss_status errors off
Yes, it is port 8080. Sorry for bothering the list. As of today a colleague of mine is assigned to Nagios.

Marek

Leonardo Bacha Abrantes wrote on 2012-10-08 13:15:

If you don't specify the port, it will use 8080 by default. Is that correct in your case?

Regards,
[Nagios-users] NRPE: Unable to read output; but works when run under strace ...
Hello all,

given a fairly well-running monitoring setup with about 18k services, I thought I had understood the basics. However, the following leaves me clueless, and I hope I'm merely missing something obvious here.

On an up-to-date Debian Squeeze (i386) OpenVZ guest I have established that my monitoring user can execute a given command:

    root@vserv08:/# sudo -u monitor -i /usr/lib/nagios/plugins/check_dummy 0 success; echo Exitcode: $?
    OK: success
    Exitcode: 0

So far, so good. Now entering NRPE, using a stripped-down config to illustrate the point:

    root@vserv08:/# grep -v -e '^$' -e '^#' /etc/nagios/nrpe.cfg
    debug=1
    nrpe_user=monitor
    nrpe_group=monitor
    allowed_hosts=127.0.0.1
    command[dummy]=/usr/lib/nagios/plugins/check_dummy 0 success
    root@vserv08:/# ps auxww | grep '[/]usr/sbin/nrpe'
    monitor 7215 0.0 0.1 3704 892 ? Ss 15:20 0:00 /usr/sbin/nrpe -c /etc/nagios/nrpe.cfg -d

The process startup was logged as follows:

    Oct 8 15:20:22 vserv08 nrpe[7214]: Added command[dummy]=/usr/lib/nagios/plugins/check_dummy 0 success
    Oct 8 15:20:22 vserv08 nrpe[7214]: INFO: SSL/TLS initialized. All network traffic will be encrypted.
    Oct 8 15:20:22 vserv08 nrpe[7215]: Starting up daemon
    Oct 8 15:20:22 vserv08 nrpe[7215]: Listening for connections on port 5666
    Oct 8 15:20:22 vserv08 nrpe[7215]: Allowing connections from: 127.0.0.1

However, executing the dummy command won't work:

    root@vserv08:/# /usr/lib/nagios/plugins/check_nrpe -H 127.0.0.1 -c dummy
    NRPE: Unable to read output

This has been logged as:

    Oct 8 15:21:36 vserv08 nrpe[7234]: Connection from 127.0.0.1 port 48791
    Oct 8 15:21:36 vserv08 nrpe[7234]: Host address is in allowed_hosts
    Oct 8 15:21:36 vserv08 nrpe[7234]: Handling the connection...
    Oct 8 15:21:36 vserv08 nrpe[7234]: Host is asking for command 'dummy' to be run...
    Oct 8 15:21:36 vserv08 nrpe[7234]: Running command: /usr/lib/nagios/plugins/check_dummy 0 success
    Oct 8 15:21:36 vserv08 nrpe[7234]: Command completed with return code 2 and output:
    Oct 8 15:21:36 vserv08 nrpe[7234]: Return Code: 2, Output: NRPE: Unable to read output
    Oct 8 15:21:36 vserv08 nrpe[7234]: Connection from 127.0.0.1 closed.

This strikes me as weird: nrpe tries to execute the defined command, but somehow no output shows up. I know of the peculiarities that might arise once sudo joins the team or when permissions aren't set appropriately, but this doesn't apply here. Playing around with the dummy command (substituting a shell script, sprinkling '| tee -a logfile' into the code, ...) revealed that the desired text output is indeed generated but somehow gets discarded.

Perhaps the monitoring user or even the whole system is subtly broken, but given that there are ~400 similarly set-up systems (all using the same workflow/automation for deploying the monitoring infrastructure), I was starting to wonder how that might have happened ...

However, it got weirder: if I strace the nrpe process, everything works as desired:

    root@vserv08:/# strace -f -o /root/log -p 7215

And then in another terminal:

    root@vserv08:/# /usr/lib/nagios/plugins/check_nrpe -H 127.0.0.1 -c dummy
    OK: success

Logged as follows:

    Oct 8 15:21:57 vserv08 nrpe[7240]: Connection from 127.0.0.1 port 37275
    Oct 8 15:21:57 vserv08 nrpe[7240]: Host address is in allowed_hosts
    Oct 8 15:21:57 vserv08 nrpe[7240]: Handling the connection...
    Oct 8 15:21:57 vserv08 nrpe[7240]: Host is asking for command 'dummy' to be run...
    Oct 8 15:21:57 vserv08 nrpe[7240]: Running command: /usr/lib/nagios/plugins/check_dummy 0 success
    Oct 8 15:21:57 vserv08 nrpe[7240]: Command completed with return code 0 and output: OK: success
    Oct 8 15:21:57 vserv08 nrpe[7240]: Return Code: 0, Output: OK: success
    Oct 8 15:21:57 vserv08 nrpe[7240]: Connection from 127.0.0.1 closed.

I found no further hints in the strace log, but this led me to assume that there is some NRPE weirdness involved, and thus I'm writing here instead of digging further through the system.

Any ideas?

Cheers,
Flo
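Since the plugin works from an interactive shell but not under the daemon, it can help to run it from a process that has no terminal attached and compare the captured output and exit code. The sketch below is an approximation of a daemon-like environment, not a reproduction of how NRPE executes commands. The function name `run_detached` is made up for this example, and `start_new_session` is POSIX-only.

```python
import subprocess


def run_detached(argv, timeout=10):
    """Run argv with no terminal on stdin and in a new session.

    Returns (exit_code, stdout_text).
    """
    result = subprocess.run(
        argv,
        stdin=subprocess.DEVNULL,   # no terminal input, as under a daemon
        stdout=subprocess.PIPE,     # capture what the plugin prints
        stderr=subprocess.PIPE,
        start_new_session=True,     # detach from the controlling terminal (POSIX)
        timeout=timeout,
        text=True,
    )
    return result.returncode, result.stdout.strip()
```

Calling `run_detached(["/usr/lib/nagios/plugins/check_dummy", "0", "success"])` as the monitor user and comparing the result with the interactive run would show whether the missing terminal alone changes the outcome.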
Re: [Nagios-users] check_mbean_collector cannot create socket
NRPE usually listens on port 5666, not 5566.

Sent from Samsung Mobile

Marek Grinberg mag...@telia.com wrote:

Sorry for bothering the list. As of today a colleague of mine is assigned to work with Nagios.

Marek
Re: [Nagios-users] NRPE: Unable to read output; but works when run under strace ...
Lol... this had me going for a while too. For me the answer was in the sudo config (visudo): it required the user to be on a tty. I believe the directive was requiretty. Disable that line and you'll be fine, I think.

Peter

From: Florian Ernst [florian_er...@gmx.net]
Sent: Monday, 8 October 2012 20:31
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] NRPE: Unable to read output; but works when run under strace ...
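The requiretty suggestion can be checked by looking for active `Defaults` lines in the sudoers configuration that switch the option on. This is a small sketch that works on text already read from the file; the function name `requiretty_lines` is an assumption for illustration. It does not follow include directives, and reading /etc/sudoers normally requires root.

```python
import re

# Matches an active "Defaults" line, optionally scoped (e.g. Defaults:monitor)
DEFAULTS_RE = re.compile(r"^\s*Defaults\S*\s+(.*)$")


def requiretty_lines(sudoers_text):
    """Return the Defaults lines that switch requiretty on."""
    hits = []
    for line in sudoers_text.splitlines():
        match = DEFAULTS_RE.match(line)
        if not match:
            continue  # comments and non-Defaults lines
        # Options are comma-separated; "!requiretty" turns the flag off
        options = [opt.strip() for opt in match.group(1).split(",")]
        if "requiretty" in options:
            hits.append(line.strip())
    return hits
```

An empty result means no plain `requiretty` default was found in the text that was scanned, so the cause would have to be looked for elsewhere.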
[Nagios-users] AUTO: Chan, Brian is away from the office (returning 10/10/2012)
I am out of the office until 10/10/2012. I will respond to your message when I return.

If this is a request for support, click here to open an ITRequest: mailto:itrequ...@shawcor.com
Alternatively, all questions can be directed to the Help Desk at 416-744-5557.

Brian Chan

Note: This is an automated response to your message "Nagios-users Digest, Vol 77, Issue 3" sent on 10/08/2012 17:54:47. This is the only notification you will receive while this person is away.
Re: [Nagios-users] Repeating event handler in hard service state....
On 9/10/2012 1:17 AM, Peter Kaagman wrote:

Hi there list,

As I understand it, service-specific event handlers are triggered for every state change while a service is in a SOFT state, and once when a service enters a HARD state.

The problem is that I have a service (an IPSEC tunnel) which depends on an outside source. If the outside party fails (whenever they do updates, once a week), I actually kill the tunnel when attempting a restart. To solve this I would like to keep retrying the restart.

I could do this in a SOFT state by increasing the max check attempts to a higher number, but then I would never get a notification. Letting the service go to a HARD state (to get the notification) would limit the restart attempt to just the one event when the service enters the HARD state.

I think there are 2 possibilities:

- Keep the service in a SOFT state and send out a notification on attempt X.
- Let the service go to a HARD state but keep on trying the restart.

Is there any way I could achieve this? Or am I completely missing something?

Peter

If you set the service to volatile, it will run the event handler every time the service is not OK, even after multiple HARD states.

The event handler at edcint.co.nz/checkwmiplus will also give you fine-grained control over exactly which states the event handler should act on, including specific text strings in the service output. This may add some flexibility so that you only restart the tunnel if you really need to.

--
Smartmon System Monitoring
http://www.smartmon.com.au
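With a volatile service (the `is_volatile` service directive) the handler runs on every non-OK result, so it can be worth filtering on the plugin output before restarting anything. The following sketch illustrates that idea only and is not the handler from edcint.co.nz. It assumes the handler receives `$SERVICESTATE$` and `$SERVICEOUTPUT$`, and the function name and all pattern strings are invented for this example.

```python
def should_restart(state, output, restart_patterns, skip_patterns=()):
    """Return True when the handler should restart the tunnel.

    state   -- value of $SERVICESTATE$
    output  -- value of $SERVICEOUTPUT$ (first line of plugin output)
    """
    if state == "OK":
        return False
    text = output.lower()
    # Situations where a restart would not help, e.g. the remote side is down
    if any(pattern.lower() in text for pattern in skip_patterns):
        return False
    # Restart only when the output matches a known restartable condition
    return any(pattern.lower() in text for pattern in restart_patterns)
```

The skip list is checked first so that an outage on the remote side never triggers a restart, which is the case that kills the tunnel in the original question.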