[Nagios-users] Fwd: Problem with recovery notification
Hello,

Anybody?

I don't understand the hard and soft state logic in the service alerts of server 1:

CRITICAL;SOFT;1
CRITICAL;SOFT;2
CRITICAL;HARD;3
CRITICAL;SOFT;1

Why is there no OK;HARD;3 after CRITICAL;HARD;3 and before CRITICAL;SOFT;1?

Questions:
1) Can flapping explain this behaviour?
2) If the node is down, is the service state type (hard or soft) set to soft?

Regards,
Thanks.

----- Forwarded message -----
From: Samuel Mutel samuel.mu...@free.fr
To: nagios-users@lists.sourceforge.net
Sent: Tuesday 16 February 2010 21:05:54 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: Problem with recovery notification

Hello,

I have two Nagios servers that monitor the same equipment. Both servers send their check results as notifications to another monitoring system (OpenNMS). I use Nagios 3.2.

I received the recovery notification from server 2 but not from server 1. Why? I think the SOFT and HARD states are the problem, but I am not sure. On server 2 the service status is HARD - OK, so the notification is sent, but on server 1 the service is SOFT - OK!
Here is the Nagios log:

Service alerts on server 1:

[1266299592] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at 'https://ip_address/sdk/vimService.wsdl'
[1266299885] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;2;CHECK_ESX3.PL CRITICAL - Error connecting to server at 'https://ip_address/sdk/webService': Perhaps host is not a Virtual Center or ESX server
[1266299925] SERVICE ALERT: test-server;CPU;CRITICAL;HARD;3;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at 'https://ip_address/sdk/vimService.wsdl'
[1266303080] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;CHECK_ESX3.PL CRITICAL - Error connecting to server at 'https://ip_address/sdk/webService': Perhaps host is not a Virtual Center or ESX server
[1266303380] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;2;CHECK_ESX3.PL CRITICAL - Error connecting to server at 'https://ip_address/sdk/webService': Perhaps host is not a Virtual Center or ESX server
[1266308485] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;(Service Check Timed Out)
[1266308500] SERVICE ALERT: test-server;CPU;OK;SOFT;2;CHECK_ESX3.PL OK - test-server cpu usage=2.29 %

Service notifications on server 1:

[1266299925] SERVICE NOTIFICATION: onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at https://ip_address/sdk/vimService.wsdl
[1266300645] SERVICE NOTIFICATION: onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at https://ip_address/sdk/vimService.wsdl
[1266301385] SERVICE NOTIFICATION: onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at https://ip_address/sdk/vimService.wsdl
[1266301720] SERVICE NOTIFICATION: onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at https://ip_address/sdk/vimService.wsdl
[1266303575] SERVICE NOTIFICATION: onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL CRITICAL - Error connecting to server at https://ip_address/sdk/webService: Perhaps host is not a Virtual Center or ESX server
[1266304175] SERVICE NOTIFICATION: onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL CRITICAL - Error connecting to server at https://ip_address/sdk/webService: Perhaps host is not a Virtual Center or ESX server
[1266304810] SERVICE NOTIFICATION: onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at https://ip_address/sdk/vimService.wsdl
[1266305270] SERVICE NOTIFICATION: onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at https://ip_address/sdk/vimService.wsdl
[1266305975] SERVICE NOTIFICATION: onms_prod;test-server;CPU;CRITICAL;send_service_trap_to_onms_prod;CHECK_ESX3.PL CRITICAL - Error connecting to server at https://ip_address/sdk/webService: Perhaps host is not a Virtual Center or ESX server

Service alerts on server 2:

[1266299856] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;(Service Check Timed Out)
[1266300161] SERVICE ALERT: test-server;CPU;CRITICAL;HARD;1;(Service Check Timed Out)
[1266300516] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at 'https://ip_address/sdk/vimService.wsdl'
[1266301481] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;1;(Service Check Timed Out)
[1266301512] SERVICE ALERT: test-server;CPU;CRITICAL;SOFT;2;CHECK_ESX3.PL CRITICAL - Error: Server version unavailable at 'https://ip_address/sdk/vimService.wsdl'
[1266304201]
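SERVICE ALERT lines like the ones above are easier to reason about once they are split into fields and the epoch timestamps are converted. The following is a minimal sketch, not part of Nagios: the function names `parse_service_alert` and `gaps` are made up for illustration, and the field layout is assumed from the lines quoted in this thread. It can help spot long silences between consecutive alerts, which are worth checking against the rest of the log.

```python
import re
from datetime import datetime, timezone

# Assumed layout, taken from the lines quoted in this thread:
# [epoch] SERVICE ALERT: host;service;state;state_type;attempt;plugin output
ALERT_RE = re.compile(
    r"^\[(?P<ts>\d+)\] SERVICE ALERT: "
    r"(?P<host>[^;]*);(?P<service>[^;]*);(?P<state>[^;]*);"
    r"(?P<state_type>[^;]*);(?P<attempt>\d+);(?P<output>.*)$"
)


def parse_service_alert(line):
    """Return the alert fields as a dict, or None if the line does not match."""
    match = ALERT_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["ts"] = int(fields["ts"])
    fields["attempt"] = int(fields["attempt"])
    # Human-readable UTC time next to the raw epoch value.
    fields["time"] = datetime.fromtimestamp(fields["ts"], tz=timezone.utc)
    return fields


def gaps(alerts, threshold_seconds):
    """Yield (previous, current) alert pairs that are further apart than the threshold."""
    for previous, current in zip(alerts, alerts[1:]):
        if current["ts"] - previous["ts"] > threshold_seconds:
            yield previous, current
```

Feeding it the CRITICAL;HARD;3 and the following CRITICAL;SOFT;1 entries from server 1 shows that the two alerts are 3155 seconds apart, so whatever reset the state happened somewhere in that window.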
Re: [Nagios-users] Fwd: Problem with recovery notification
samuel.mu...@free.fr wrote:

> Hello, Anybody? I don't understand the hard and soft logic in the service alerts of server 1:
>
> CRITICAL;SOFT;1
> CRITICAL;SOFT;2
> CRITICAL;HARD;3
> CRITICAL;SOFT;1
>
> Why don't I have OK;HARD;3 after CRITICAL;HARD;3 and before CRITICAL;SOFT;1?
>
> Questions:
> 1) Can flapping explain this behaviour?
> 2) If the node is down, is the service state (hard or soft) set to soft?

Flap detection only inhibits notifications. It would not affect hard/soft states.

Several things could cause this, but it appears you've stripped all context out of the logs. Was Nagios restarted between the CRITICAL;HARD;3 and the CRITICAL;SOFT;1, maybe?

I'm not 100% sure, but the service state count may also be reset (I'd be a bit surprised if it isn't) if the host is determined to be down.

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] service notification when host is down
Thanks for your answer,

In fact, that is normal behavior to me as well. What does not look normal to me is that, between two checks, Nagios jumps from SOFT 1 to HARD 1 without going through SOFT 1, SOFT 2, SOFT 3 and finally HARD 4.

Regards,
Samuel Bancal

2010/2/17 Morris, Patrick patrick.mor...@hp.com

> Samuel Bancal wrote:
>> Nagios Core 3.2.0
>> nagios-plugins-1.4.14
>> Ubuntu server 8.04.3 LTS
>>
>> Hi,
>>
>> I'm having trouble configuring notifications for the case where a server no longer responds to PING (ICMP). I don't understand why Nagios skips steps when it runs the ICMP service check.
>>
>> Here is the config:
>>
>> define host{
>>         use                     generic-server
>>         host_name               server1
>>         alias                   server1
>>         address                 the.ip.the.ip
>>         hostgroups              prod-servers
>>         contact_groups          group1
>>         check_command           check-host-alive
>>         check_period            24x7
>>         check_interval          5
>>         retry_interval          1
>>         max_check_attempts      4
>>         notification_period     24x7
>>         notification_interval   60
>>         notification_options    d,u,r
>>         }
>>
>> define service{
>>         use                     generic-service
>>         host_name               server1
>>         service_description     ICMP
>>         check_command           check_icmp!100.0,20%!500.0,60%
>>         max_check_attempts      4
>>         normal_check_interval   5
>>         retry_check_interval    1
>>         notification_options    w,u,c,r
>>         notification_interval   60
>>         notification_period     24x7
>>         }
>>
>> [...]
>>
>> define command{
>>         command_name    check-host-alive
>>         command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
>>         }
>>
>> define command{
>>         command_name    check_icmp
>>         command_line    $USER1$/check_icmp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
>>         }
>>
>> [...]
>>
>> Here is an example of the history that I get:
>>
>> Service Critical [2010-02-16 11:33:13] SERVICE ALERT: server1;ICMP;CRITICAL;SOFT;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
>> Host Down [2010-02-16 11:33:43] HOST ALERT: server1;DOWN;SOFT;1;(Host Check Timed Out)
>> Service Critical [2010-02-16 11:34:13] SERVICE ALERT: server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
>> Host Down [2010-02-16 11:34:43] HOST ALERT: server1;DOWN;SOFT;2;(Host Check Timed Out)
>> Host Down [2010-02-16 11:35:23] HOST ALERT: server1;DOWN;SOFT;3;(Host Check Timed Out)
>> Host Down [2010-02-16 11:36:33] HOST ALERT: server1;DOWN;HARD;4;(Host Check Timed Out)
>> Host Up [2010-02-16 11:37:43] HOST ALERT: server1;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.67 ms
>> Service Ok [2010-02-16 11:39:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK - the.ip.the.ip: rta 0.943ms, lost 0%
>>
>> Or later:
>>
>> Host Down [2010-02-16 11:42:03] HOST ALERT: server1;DOWN;SOFT;1;(Host Check Timed Out)
>> Host Down [2010-02-16 11:43:13] HOST ALERT: server1;DOWN;SOFT;2;(Host Check Timed Out)
>> Service Critical [2010-02-16 11:44:13] SERVICE ALERT: server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
>> Host Down [2010-02-16 11:44:43] HOST ALERT: server1;DOWN;SOFT;3;(Host Check Timed Out)
>> Host Up [2010-02-16 11:45:53] HOST ALERT: server1;UP;SOFT;4;PING OK - Packet loss = 0%, RTA = 0.64 ms
>> Service Ok [2010-02-16 11:49:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK - the.ip.the.ip: rta 0.948ms, lost 0%
>
> If you're asking why Nagios runs a host check when it sees the service fail a check, that's normal behavior. When a service check fails, the first thing Nagios will do is look to see if the service failed because the host is down.

--
Samuel Bancal - CH
Re: [Nagios-users] Fwd: Problem with recovery notification
I found this in the source code:

    /* ADDED IF STATEMENT 01-17-05 EG */
    /* 01-17-05: Services in hard problem states before hosts went down would sometimes come back as soft problem states after */
    /* the hosts recovered. This caused problems, so hopefully this will fix it */
    if(temp_service->state_type==SOFT_STATE)
            temp_service->current_attempt=1;
    }

"so hopefully this will fix it" => Perhaps this patch does not work...

Samuel Mutel.
[Nagios-users] CHECK_HTTP odd behaviour
Well, I tried writing a wrapper script to see what check_http was actually receiving. The answer would appear to be absolutely nothing; in fact, check_http is never even getting called. Something in the parameters appears to make Nagios throw an exception when it tries to make the call, and that exception is caught and treated as a critical error with a null reply.

When I went through the -A parameter and escaped every non-standard character, everything burst into life: the wrapper reported the correct string and check_http reported the site as up.

Clearly, whereas bash only needs $ and ` escaping within inverted commas, Nagios must have a larger list, including, I would guess, either the ; or the :

Thanks for the help

Paul Willis

--
This email and any accompanying document(s) contain information from Kent Police, which is confidential or privileged. The information is intended to be for the exclusive use of the individual(s) or bodies to whom it is addressed. If you are not the intended recipient, be aware that any disclosure, copying, distribution or use of the contents of this information is prohibited. If you have received this email in error, please notify us immediately by contacting the sender or telephoning 01622 690690. The copyright in the contents of this email and any enclosure is the property of Kent Police and any unauthorised reproduction or disclosure is contrary to the provisions of the Copyright Designs and Patents Act 1998.
[Nagios-users] Processing External Commands
Hi,

I'm running a distributed setup with two servers carrying out monitoring and sending their results back to a central server via NSCA. Most of the time this works well, but from time to time I get a substantial delay both between NSCA receiving the check result and the EXTERNAL COMMAND being logged, and between the EXTERNAL COMMAND being logged and the PASSIVE SERVICE CHECK result being logged. This delay can be several minutes and has sometimes been over 10 minutes.

I am running ndoutils as well, and some of the tables are quite big. Could this affect things?

Any help appreciated.

Thanks,
Glynne
Re: [Nagios-users] CHECK_HTTP odd behaviour
On Feb 18, 2010, at 5:04 AM, Paul WILLIS PSE 55499 wrote:

> Clearly that whereas bash only needs $ and ` escaping within inverted commas nagios must have a larger list, including I would guess either the ; or the :

Nope, not really. \, ! and $ are the only characters that may need escaping, depending on where they are used. With the exception of $MACRO$ substitutions, nagios just takes your raw command_line and passes it to the shell for execution. You never posted your command definition, but I'd guess that you didn't have proper quoting or something like that.

--
Marc
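To make the "$MACRO$ substitution, then hand the line to the shell" step concrete, here is a rough sketch of how a check_command with ! separated arguments could be expanded. It is a simplified illustration, not the Nagios implementation: `expand_command`, the `commands` dictionary and the macro values are all hypothetical, and escaping of ! inside arguments is deliberately left out.

```python
import re


def expand_command(check_command, commands, macros):
    """Expand e.g. 'check_icmp!100.0,20%!500.0,60%' into a shell command line.

    commands maps command_name to command_line; macros maps macro names
    (without the dollar signs) to their values. Both are plain dicts here.
    """
    name, *args = check_command.split("!")
    values = dict(macros)
    # Arguments after the command name become $ARG1$, $ARG2$, ...
    for index, arg in enumerate(args, start=1):
        values[f"ARG{index}"] = arg

    def substitute(match):
        key = match.group(1)
        if key == "":
            return "$"  # '$$' stands for one literal dollar sign
        return values.get(key, "")  # unknown macros expand to nothing in this sketch

    return re.sub(r"\$([A-Z0-9_]*)\$", substitute, commands[name])
```

With a command_line of `$USER1$/check_icmp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5` and a check_command of `check_icmp!100.0,20%!500.0,60%`, the two thresholds end up in the -w and -c positions. Everything else in the line reaches the shell untouched, which is why shell quoting still matters.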
Re: [Nagios-users] Can I run both Nagios V2 and V3 in parallel while I migrate?
On Feb 17, 2010, at 8:29 PM, Lylex Ryan wrote:

> In upgrading from nagios (v2) to nagios3, I'd like to do a fresh install of nagios3 and start with a clean sheet of (config) paper. But can I do this while V2 is running production?

Yes, I've run instances of all three versions on a single box at once.

> Since the packages have different names, I thought it might work. But they probably would both have /etc/nagios and other default directories in common.

Clearly if the packages install components to common directories then that isn't going to work. Compiling and installing from tarball is not difficult at all and you have control over where things get put (by default everything is under /usr/local/nagios). You'll need to set up a second http vhost with a different name for the second instance and either modify the nagios init script to start the second instance or add the startup to rc.local.

Once you're confident in the success of your transition, you could uninstall the v2 package, install the v3 package and copy over your etc and var directories from your transition install...

--
Marc
Re: [Nagios-users] Processing External Commands
On Feb 18, 2010, at 5:17 AM, Glynne Jones wrote:

> This delay can be several minutes and has sometimes been over 10 minutes. I am running ndoutils as well, and some of the tables are quite big. Could this affect things?

Yes, certainly. If the database is busy, either through action of your own or through one of the regular table maintenance tasks, then processing of check results may be delayed waiting on the database. Should be pretty easy to see through top if mysql is busy during those times.

--
Marc
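To see when the delays occur (and then compare those times with database load), the gap between each EXTERNAL COMMAND entry and the matching PASSIVE SERVICE CHECK entry can be measured from nagios.log. The sketch below is only an illustration: the function name `passive_check_delays` is made up, and it assumes both kinds of entries are being logged in the usual `[epoch] TYPE: fields;separated;by;semicolons` layout.

```python
import re
from collections import defaultdict, deque

# Assumed log layouts:
# [epoch] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
# [epoch] PASSIVE SERVICE CHECK: host;service;code;output
EXTERNAL_RE = re.compile(
    r"^\[(\d+)\] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;([^;]*);([^;]*);"
)
PASSIVE_RE = re.compile(r"^\[(\d+)\] PASSIVE SERVICE CHECK: ([^;]*);([^;]*);")


def passive_check_delays(lines):
    """Return (host, service, seconds) for each submitted result that was later processed."""
    pending = defaultdict(deque)  # (host, service) -> submission timestamps, oldest first
    delays = []
    for line in lines:
        match = EXTERNAL_RE.match(line)
        if match:
            pending[(match.group(2), match.group(3))].append(int(match.group(1)))
            continue
        match = PASSIVE_RE.match(line)
        if match:
            key = (match.group(2), match.group(3))
            if pending[key]:
                submitted = pending[key].popleft()
                delays.append((key[0], key[1], int(match.group(1)) - submitted))
    return delays
```

Sorting the result by the third field shows the worst delays first; the timestamps of those entries are the moments worth checking in top or in the MySQL process list.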
Re: [Nagios-users] Can I run both Nagios V2 and V3 in parallel while I migrate?
On 18 February 2010 02:29, Lylex Ryan lylexr...@yahoo.com wrote:

> In upgrading from nagios (v2) to nagios3, I'd like to do a fresh install of nagios3 and start with a clean sheet of (config) paper. But can I do this while V2 is running production? Since the packages have different names, I thought it might work. But they probably would both have /etc/nagios and other default directories in common. Maybe if I installed from the tar-ball, I could specify new directories for V3, but I'm also trying to avoid that learning-process and use a pre-packaged rpm. Maybe installing V3 on a different server altogether, then moving it to the production machine would be a way.

I think the standard advice is no, you can't run more than one instance on a single operating system (of course you probably can if you put enough effort into it).

I would recommend against installing your new Nagios 3 with non-standard install paths: it could make installing add-ons in the future (for example PNP graphing, NagVis dashboards, etc.) difficult if everything is in the wrong place.

Personally, when I upgraded from 2 to 3, I put the 3 install on a new server and 'migrated' hosts and services across from old to new gradually over a period of a couple of months.
Re: [Nagios-users] Editing the nagios Side bar
On Feb 17, 2010, at 5:14 PM, Ron Wilson wrote:

> I have 6 groups set up holding servers being patched on each day. I would like an entry in the Nagios sidebar that says "patching", which would then give a web page view of the six patching groups on one page. This makes it easier for admins to disable notifications for a large number of servers with one click. Because we have so many groups it would be easier to have the patching days on one page. However, while I can create a URL for one day's patching in the new page, I cannot get all six. This is my PHP code:
>
> <li><a href="<?php echo $cfg["cgi_base_url"];?>/status.cgi?hostgroup=Patch_Day1&amp;style=overview" target="<?php echo $link_target;?>">Patch Day1</a></li>
>
> This works fine, but how can I get the other 5 patch groups in that line? I need something like Patch_Day*, but such a command does not work with PHP. Anyone got some ideas?

It's not a PHP thing... Nagios does not have functionality to limit (or expand, depending on how you look at it) the display to multiple hostgroups that are a subset of all hostgroups. The only exception to this is limitation through authentication, which wouldn't appear to fit your goals.

--
Marc
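Since status.cgi takes one hostgroup per request here, one possible workaround is to generate a sidebar entry per patch group instead of a single combined page. The sketch below only builds the HTML list items. It assumes the groups are named Patch_Day1 to Patch_Day6 (only Patch_Day1 appears in the message), and the base URL and link target passed in are placeholders for whatever the sidebar template uses.

```python
from html import escape
from urllib.parse import urlencode


def patch_day_links(cgi_base_url, link_target, days=6):
    """Build one <li> sidebar entry per hypothetical Patch_DayN hostgroup."""
    items = []
    for day in range(1, days + 1):
        query = urlencode({"hostgroup": f"Patch_Day{day}", "style": "overview"})
        # escape() turns the '&' between query parameters into '&amp;' for HTML.
        href = escape(f"{cgi_base_url}/status.cgi?{query}", quote=True)
        target = escape(link_target, quote=True)
        items.append(
            f'<li><a href="{href}" target="{target}">Patch Day{day}</a></li>'
        )
    return "\n".join(items)
```

This gives six links rather than the single page that was asked for, so it is a compromise, not a replacement for the missing multi-hostgroup view.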
Re: [Nagios-users] service notification when host is down
On Feb 18, 2010, at 3:47 AM, Samuel Bancal wrote:

> Thanks for your answer, In fact it is normal behavior to me also. Thing that is not normal behavior to me is that between two checks, Nagios jumps from SOFT 1 to HARD 1 without doing the steps SOFT 1 SOFT 2 SOFT 3 and finally HARD 4.

If the host is down, why should nagios go through all that? There's no possibility for the service to be up when the host is not.

--
Marc
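The behaviour described here can be written down as a small state machine. This is a simplified model built to reproduce the alert sequences quoted in this thread, not a copy of the Nagios source: the class name `ServiceState` and its method are invented for illustration, and the attempt numbering is an assumption chosen to match those logs.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Toy model of SOFT/HARD service state handling (not the Nagios source)."""

    max_check_attempts: int
    state: str = "OK"
    state_type: str = "HARD"
    attempt: int = 1

    def process(self, result, host_up=True):
        """Apply one check result and return the logged (state, state_type, attempt)."""
        if result == "OK":
            if self.state == "OK":
                event = ("OK", "HARD", 1)
            elif self.state_type == "SOFT":
                # Soft recovery: the problem never reached a hard state.
                event = ("OK", "SOFT", self.attempt + 1)
            else:
                # Hard recovery: this is the case that triggers a recovery notification.
                event = ("OK", "HARD", self.attempt)
            self.state, self.state_type, self.attempt = "OK", "HARD", 1
            return event

        if self.state == "OK":
            self.attempt = 1  # first failure after OK starts a new count
        elif self.state_type == "SOFT" and host_up:
            self.attempt += 1  # retries only count while the host is reachable
        self.state = result
        if not host_up or self.attempt >= self.max_check_attempts:
            # A failing service on a down host is treated as HARD right away.
            self.state_type = "HARD"
        else:
            self.state_type = "SOFT"
        return (self.state, self.state_type, self.attempt)
```

With max_check_attempts set to 4, a failure while the host is up followed by a failure while the host is down yields CRITICAL;SOFT;1 and then CRITICAL;HARD;1, which is the jump seen in the history above. With the host up throughout, the model walks through the soft attempts before going hard.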
[Nagios-users] unsubscribe
unsubscribe
Re: [Nagios-users] Processing External Commands
On Thu, Feb 18, 2010 at 07:19:05AM -0600, Marc Powell wrote:

> On Feb 18, 2010, at 5:17 AM, Glynne Jones wrote:
>> This delay can be several minutes and has sometimes been over 10 minutes. I am running ndoutils as well, and some of the tables are quite big. Could this affect things?
>
> Yes, certainly. If the database is busy, either through action of your own or through one of the regular table maintenance tasks then processing of check results may be delayed waiting on the database. Should be pretty easy to see through top if mysql is busy during those times.

Thought that might be the case. mysql is always busy (I've got 3370 checks over 362 hosts).

You mention regular table maintenance tasks - is this something that comes out of the box or something separate?

Thanks,
Glynne
Re: [Nagios-users] Help required
From: reachta...@hotmail.com
To: j...@jimavery.me.uk
Date: Thu, 18 Feb 2010 12:36:17 +0530
CC: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] Help required

> Date: Wed, 17 Feb 2010 20:27:34 +
> Subject: Re: [Nagios-users] Help required
> From: j...@jimavery.me.uk
> To: reachta...@hotmail.com
> CC: nagios-users@lists.sourceforge.net
>
> On 17 February 2010 17:36, Digital Edge reachta...@hotmail.com wrote:
>> Hi List,
>>
>> It would be really helpful if I could get a response to the query below.
>>
>> I have a URL, say http://www.example.com/sigin.jsf. After logging in, it redirects to https://www.example1.com/ddo/get_sec_pwd.php, where a second authentication happens, and then it goes to https://www.example1.com/home/home.jsf. Inside that page I have several other tabs (Home | Home1 | Home2). All the tabs can be viewed only after the second successful login, and only within that session.
>>
>> Can we monitor the response time of those URLs one by one in Nagios, without losing the session?
>
> Not in Nagios itself, no, but I expect you could use WebInject http://www.webinject.org/ to do the web querying and timing and feed the results back to Nagios.
>
> hth,
> Jim

Hi,

Yes, I have tried that as well. The issue I'm facing is that, after the authentication checks succeed, I'm unable to navigate through those links.

<testcases repeat="1">

<case
    id="1"
    description1="Connecting to portfolio_signup"
    method="get"
    url="http://www.example.com/portfolio_signup.php"
    verifypositive="Sign in"
    errormessage="Unable to connect to the login page of portfolio_signup"
/>

<case
    id="2"
    description1="Authentication on portfolio_signup"
    method="post"
    parseresponse='mykey=|'
    url="http://www.example.com/portfolio_signup.php"
    postbody="user=abcd&password=1234&mykey={PARSEDRESULT}"
    verifypositive="Sign in"
    errormessage="Unable to authenticate user abcd in portfolio_signup"
/>

<case
    id="3"
    description1="Authentication on MM"
    method="post"
    parseresponse='mykey=|'
    url="https://www.example1.com/sso/get_sec_pwd.php"
    postbody="user=abcd&password=12345r&mykey={PARSEDRESULT}"
    verifypositive="Secure Password"
    errormessage="Unable to authenticate user abcd in MM"
/>

<case
    id="4"
    description1="Navigate through www.example1.com while authenticated"
    method="get"
    url="https://www.example1.com/quickenweb/main/home.jsf"
    verifypositive="How can I ?"
    errormessage="Unable to navigate through www.example1.com even though correctly authenticated"
/>

</testcases>

All the tests pass except case 4. I am not able to understand why this is happening. Can anyone help me with this?

Ricky

Dear List,

Can anyone help me with this? Sorry for the double post.
[Nagios-users] CHECK_HTTP odd behaviour
Marc

It wasn't the command definition file. That was the same as the command I was running directly, i.e.

# 'check_eRhttp' command definition
define command{
command_name check_eRhttp
command_line $USER1$/check_http -p 8000 -H some.host.co.uk -u /sap/bc/webdynpro/sap/hrrcf_a_unreg_job_search?sap-wd-configId=ZUNREG_JOB_SEARCHsap-ep-themeroot=/sap/public/bc/ur/customerthemes/sap_kp -A Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9 -R fs_QE2_00
}

I did away with passing parameters from the service, as I originally thought that was the problem. I have since had a quick play and can confirm it is indeed the semicolons. Leave them in and it goes critical / null status without calling the plugin. Escape them and it behaves.

Paul Willis

-- Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
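Editor's sketch of the fix described above: in Nagios object configuration files a semicolon starts a comment, so everything after an unescaped `;` on the command_line is discarded before the plugin is ever called. Escaping each semicolon as `\;` keeps it in the command. The shortened URL path and the double quotes around the -A value are assumptions made for illustration; they are not taken from the original post.

```cfg
# 'check_eRhttp' command definition with the semicolons in the
# User-Agent string escaped. "/some/path" is a placeholder for the
# real URL, and the quoting around -A is an assumption.
define command{
    command_name    check_eRhttp
    command_line    $USER1$/check_http -p 8000 -H some.host.co.uk -u /some/path -A "Mozilla/5.0 (Windows\; U\; Windows NT 5.2\; en-US\; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9"
}
```

The same command run directly from a shell needs no backslashes inside the quotes, which would explain why it worked on the command line but not from Nagios.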
Re: [Nagios-users] MACRO PROBLEM
Hi Patrick,

Thanks for your reply. I know that, according to the matrix, CONTACTEMAIL and CONTACTPAGER are disabled for host/service event handlers, but I am talking about CONTACTGROUPMEMBERS. Or can you suggest any other way of getting CONTACTEMAIL and CONTACTPAGER? I hope you understand the problem: I need all this information to send to Netcool along with the other host- or service-related information.

Regards
Khurram Malik

Date: Wed, 17 Feb 2010 23:38:12 -0800 From: patrick.mor...@hp.com To: malik_khur...@hotmail.com CC: nagios-users@lists.sourceforge.net Subject: Re: [Nagios-users] MACRO PROBLEM

Khurram Malik wrote:

Hi, I am using Nagios 3.0.6 and, in an integration project, I want Nagios to send alerts to Netcool. I am using host/service global event handlers. I am able to get most of the information via the following macros: $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $SERVICEDESC$. But I also want some other info via macros, and I am using the following link to see whether a macro is enabled or disabled: http://nagios.sourceforge.net/docs/3_0/macrolist.html#hostoutput I want to get the CONTACTEMAIL and CONTACTPAGER contents, but these macros are disabled with global host/service handlers. What is the easiest way to get info for the contact macros with global event handlers? I can see $CONTACTGROUPMEMBERS$ is enabled with global event handlers, but I am unable to get any value; it seems like a bug.

This is not a bug. These macros are not available with event handlers, since event handlers do not have contacts associated with them. If you look at the matrix on the page you linked, you'll see that CONTACTEMAIL and CONTACTPAGER work only with host and service notifications.
Re: [Nagios-users] MACRO PROBLEM
Thanks Patrick,

But how can I provide the name of the contact group when it depends on the service or host that is triggering the event? Is there a way to provide the contact group dynamically to CONTACTGROUPMEMBERS, e.g. $CONTACTEMAIL:CONTACTGROUPMEMBERS:, which would give me a comma-separated list of the emails associated with that particular service or host? Or is there any ready-made script for Nagios and Netcool integration?

Regards
Khurram Malik

Date: Wed, 17 Feb 2010 23:48:08 -0800 From: patrick.mor...@hp.com To: malik_khur...@hotmail.com CC: nagios-users@lists.sourceforge.net Subject: Re: [Nagios-users] MACRO PROBLEM

Morris, Patrick wrote:

Khurram Malik wrote:

Hi, I am using Nagios 3.0.6 and, in an integration project, I want Nagios to send alerts to Netcool. I am using host/service global event handlers. I am able to get most of the information via the following macros: $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $SERVICEDESC$. But I also want some other info via macros, and I am using the following link to see whether a macro is enabled or disabled: http://nagios.sourceforge.net/docs/3_0/macrolist.html#hostoutput I want to get the CONTACTEMAIL and CONTACTPAGER contents, but these macros are disabled with global host/service handlers. What is the easiest way to get info for the contact macros with global event handlers? I can see $CONTACTGROUPMEMBERS$ is enabled with global event handlers, but I am unable to get any value; it seems like a bug.

This is not a bug. These macros are not available with event handlers, since event handlers do not have contacts associated with them. If you look at the matrix on the page you linked, you'll see that CONTACTEMAIL and CONTACTPAGER work only with host and service notifications.

After re-reading your original question, I may have misunderstood, and you're wondering why $CONTACTGROUPMEMBERS$ doesn't work. See notes 5 and 7 on the page you linked. These macros work as on-demand macros in event handlers, since event handlers have no contacts associated with them. To obtain the list of contact group members in that context, you would also need to provide the name of the group.
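Editor's sketch of the on-demand form Patrick refers to: the contact group name is passed as an argument to the macro, as in $CONTACTGROUPMEMBERS:groupname$. The handler script path, the command name and the contact group name netcool-admins are hypothetical. Note that the group name is fixed in the command line here, so this does not by itself solve the dynamic case raised in the question above.

```cfg
# Object configuration: a global service event handler command.
# notify_netcool.sh and netcool-admins are placeholder names.
define command{
    command_name    notify_netcool
    command_line    /usr/local/nagios/libexec/eventhandlers/notify_netcool.sh "$HOSTNAME$" "$SERVICEDESC$" "$SERVICESTATE$" "$SERVICESTATETYPE$" "$SERVICEATTEMPT$" "$CONTACTGROUPMEMBERS:netcool-admins$"
}

# Main configuration file (nagios.cfg): register the handler globally.
global_service_event_handler=notify_netcool
```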
Re: [Nagios-users] Processing External Commands
On Feb 18, 2010, at 7:44 AM, Glynne Jones wrote:

Thought that might be the case. mysql is always busy (I've got 3370 checks over 362 hosts). You mention regular table maintenance tasks - is this something that comes out of the box or something separate?

ndo2db.cfg --

## TABLE TRIMMING OPTIONS
# Several database tables containing Nagios event data can become quite large
# over time. Most admins will want to trim these tables and keep only a
# certain amount of data in them. The options below are used to specify the
# age (in MINUTES) that data should be allowed to remain in various tables
# before it is deleted. Using a value of zero (0) for any value means that
# that particular table should NOT be automatically trimmed.

# Keep timed events for 24 hours
max_timedevents_age=1440

# Keep system commands for 1 week
max_systemcommands_age=10080

# Keep service checks for 1 week
max_servicechecks_age=10080

# Keep host checks for 1 week
max_hostchecks_age=10080

# Keep event handlers for 31 days
max_eventhandlers_age=44640

I've set all of these to 1 hour for my install based on my needs. If you have database backup scripts, those could be causing delays as well.

-- Marc
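Editor's sketch: since the ages are given in minutes, "1 hour" for all of the trimming options listed above would look roughly like the fragment below. The option names are those quoted from ndo2db.cfg; the exact values Marc uses are not shown in the post, so 60 is inferred from his description.

```cfg
# ndo2db.cfg: trim every table listed above after 60 minutes
max_timedevents_age=60
max_systemcommands_age=60
max_servicechecks_age=60
max_hostchecks_age=60
max_eventhandlers_age=60
```

A retention this short suits a setup that only needs current state in the database; it would discard history that reporting tools might rely on.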
[Nagios-users] Delayed Notification for Primary Secondary Nagios Servers
We have 2 Nagios servers with the same /usr/local/nagios/etc (kept current with rsync). I'd like to have the second server skip the initial notifications, since under normal circumstances we'll receive the primary server's notification email and react to that. Does Nagios provide a way to do this? I don't see anything.

Currently, I'm considering going with email notifications from primary every 30 min, and using a $USER$ macro to effectively set notifications_enabled=0 on secondary. Then I should be able to escalate to pages with first_notification=3 from secondary and first_notification=5 from primary. This way, if primary goes down, we should get pages at 90 min; if secondary goes down, we should get email about it at 30 min and pages at 150 min. I can also explicitly enable notifications regarding primary on the secondary server.

Is there a better way to do this?

Thanks,
Chris Pepper
-- Chris Pepper: http://cbio.mskcc.org/ http://www.extrapepperoni.com/
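Editor's sketch of the escalation directive mentioned in the plan above, as it might appear on the secondary server. The host, service and contact group names are hypothetical, and whether this combines with the $USER$-macro approach as intended has not been verified here; it only illustrates how first_notification is expressed in a serviceescalation object.

```cfg
# Secondary server: escalate to the paging group from the third
# notification onwards. web01, HTTP and oncall-pagers are placeholders.
define serviceescalation{
    host_name               web01
    service_description     HTTP
    first_notification      3       ; escalation starts at the 3rd notification
    last_notification       0       ; 0 = keep using this escalation with no upper limit
    notification_interval   30      ; in time units (minutes with interval_length=60)
    contact_groups          oncall-pagers
}
```

The primary server would carry the same object with first_notification set to 5.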
Re: [Nagios-users] Processing External Commands
On Thu, Feb 18, 2010 at 10:02:51AM -0600, Marc Powell wrote:

On Feb 18, 2010, at 7:44 AM, Glynne Jones wrote: Thought that might be the case. mysql is always busy (I've got 3370 checks over 362 hosts). You mention regular table maintenance tasks - is this something that comes out of the box or something separate?

ndo2db.cfg -- ## TABLE TRIMMING OPTIONS [snip] I've set all of these to 1 hour for my install based on my needs. If you have database backup scripts, those could be causing delays as well.

Doh! Cheers, Marc. Forgot those were there. I don't think those are getting in the way; it's more likely that some of the other tables are getting large. Have you changed the table engine from MyISAM to InnoDB?

I'll have a play with the debug settings to see if I can find where the delay is coming in.

Thanks,
Glynne
Re: [Nagios-users] Processing External Commands
On Feb 18, 2010, at 10:29 AM, Glynne Jones wrote:

Doh! Cheers, Marc. Forgot those were there. I don't think those are getting in the way; it's more likely that some of the other tables are getting large. Have you changed the table engine from MyISAM to InnoDB?

Still all MyISAM.

-- Marc
[Nagios-users] NRPE/NSCA replacement thoughts?
Hello

Since I am pondering a replacement for the NSCA and NRPE protocols, I thought I would get some thoughts from the community. So this is pretty much an open-floor kind of thing, to get some sense of what people actually need and would want (if anything at all). But to get some general idea, I'll give you a few questions to start it off:

Is a new protocol a good idea?
Should a new protocol be flat-text based or structured?
Would web services be the best way?
Should the protocol be extensible?
What features would a new protocol need to support? - message, performance data, configuration, multiple queries, control logic transfer, inventory, etc.
What platforms would it need to support?
What polling scheme(s): active, passive, active/passive, proxy, etc.?
Master/slave scenarios? In both NRPE and NSCA, Nagios is the master; should the client be allowed to act as master?
What kind of security mechanisms do you need (host, password, encryption, certificates, etc.)?
Client-side checks, or client-side data gathering with server-side checks? (I.e. check_nrpe gets OK back; another option would be to get the value and let the server decide if it is good or bad.)
Multiple streams? I.e. send to both Nagios and potentially other collectors (like rrd).

Feel free to add more thoughts and ideas here.

// Michael Medin
[Nagios-users] unsubscribe
unsubscribe
Re: [Nagios-users] unsubscribe
You mean to send this to nagios-users-requ...@lists.sourceforge.net

From this e-mail's headers --

List-Id: Nagios Users List nagios-users.lists.sourceforge.net
List-Unsubscribe: https://lists.sourceforge.net/lists/listinfo/nagios-users, mailto:nagios-users-requ...@lists.sourceforge.net?subject=unsubscribe
List-Archive: http://sourceforge.net/mailarchive/forum.php?forum_name=nagios-users
List-Post: mailto:nagios-users@lists.sourceforge.net
List-Help: mailto:nagios-users-requ...@lists.sourceforge.net?subject=help
List-Subscribe: https://lists.sourceforge.net/lists/listinfo/nagios-users, mailto:nagios-users-requ...@lists.sourceforge.net?subject=subscribe

-- Marc

On Feb 18, 2010, at 12:42 PM, Rick Garland wrote: unsubscribe
[Nagios-users] Set Host Status from Distributed Monitoring Server
Hello!

Pardon me if this is already covered somewhere in the documentation. I can't seem to find what I'm looking for.

I have a working Nagios 3.2.0 environment in a distributed configuration. One of my environments lives behind a firewall that is blocking ICMP traffic from the central server. In that same environment I have a remote Nagios node that is successfully sending service checks back to the central server. All of the nodes behind this firewall are reporting DOWN on the central server's Nagios page, but their service checks are reporting OK.

Is there a configuration setting available so that the central server will report these nodes as UP if it receives a successful check from the remote monitoring node? The translate_passive_host_checks option sounds like it should work, but it doesn't. I understand I can remove the check_command from the host.cfg, but then the host will stay in a Pending status. Any suggestions?

Thank you in advance for your time and assistance.

CS Configs

log_file=/usr/local/nagios/var/nagios.log
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg
cfg_dir=/usr/local/nagios/etc/hosts
object_cache_file=/usr/local/nagios/var/objects.cache
precached_object_file=/usr/local/nagios/var/objects.precache
resource_file=/usr/local/nagios/etc/resource.cfg
status_file=/usr/local/nagios/var/status.dat
status_update_interval=10
nagios_user=nagios
nagios_group=nagios
check_external_commands=1
command_check_interval=-1
command_file=/usr/local/nagios/var/rw/nagios.cmd
external_command_buffer_slots=4096
lock_file=/usr/local/nagios/var/nagios.lock
temp_file=/usr/local/nagios/var/nagios.tmp
temp_path=/tmp
event_broker_options=-1
log_rotation_method=d
log_archive_path=/usr/local/nagios/var/archives
use_syslog=1
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
service_inter_check_delay_method=s
max_service_check_spread=30
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=30
max_concurrent_checks=0
check_result_reaper_frequency=10
max_check_result_reaper_time=30
check_result_path=/usr/local/nagios/var/spool/checkresults
max_check_result_file_age=3600
cached_host_check_horizon=15
cached_service_check_horizon=15
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.25
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=/usr/local/nagios/var/retention.dat
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
check_for_updates=1
bare_update_check=0
use_aggressive_host_checking=0
execute_service_checks=0
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=0
enable_event_handlers=1
process_performance_data=0
obsess_over_services=0
obsess_over_hosts=0
translate_passive_host_checks=1
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
check_service_freshness=1
service_freshness_check_interval=60
check_host_freshness=1
host_freshness_check_interval=60
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=us
p1_file=/usr/local/nagios/bin/p1.pl
enable_embedded_perl=1
use_embedded_perl_implicitly=1
illegal_object_name_chars=`~!$%^*|'?,()=
illegal_macro_output_chars=`~$|'
use_regexp_matching=0
use_true_regexp_matching=0
admin_email=nag...@localhost
admin_pager=pagenag...@localhost
daemon_dumps_core=0
use_large_installation_tweaks=0
enable_environment_macros=1
debug_level=16
debug_verbosity=2
debug_file=/usr/local/nagios/var/nagios.debug
max_debug_file_size=100

Remote Monitoring Node

log_file=/usr/local/nagios/var/nagios.log
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
cfg_file=/usr/local/nagios/etc/objects/contacts.cfg
cfg_file=/usr/local/nagios/etc/objects/timeperiods.cfg
cfg_file=/usr/local/nagios/etc/objects/templates.cfg
cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg
cfg_dir=/usr/local/nagios/etc/hosts
object_cache_file=/usr/local/nagios/var/objects.cache
[Nagios-users] E-mailing separate group for subset of hosts(and their services)
Hi, I've had a smallish deployment of Nagios for a while now, but now I need to add some more functionality to it. I need to have Nagios notify certain people when there is an issue with a host or any service on it. I see that adding their contactgroup to the host definition only notifies them when the host itself is down or up, however adding their contactgroup to the service definition would notify them whenever said service has an issue on any host - not just the ones I want them to be notified about. Where is the happy medium here? Do I need to create a duplicate copy of all services on these hosts just so that I can list them as the contacts? Thanks, Ryan
Re: [Nagios-users] Set Host Status from Distributed Monitoring Server
tktuc...@gmail.com wrote:

Hello! Pardon me if this is already covered somewhere in the documentation. I can't seem to find what I'm looking for. I have a working Nagios 3.2.0 environment in a distributed configuration. One of my environments lives behind a firewall that is blocking ICMP traffic from the central server. In that same environment I have a remote Nagios node that is successfully sending service checks back to the central server. All of the nodes behind this firewall are reporting DOWN on the central server's Nagios page, but their service checks are reporting OK. Is there a configuration setting available so that the central server will report these nodes as UP if it receives a successful check from the remote monitoring node? The translate_passive_host_checks option sounds like it should work, but it doesn't. I understand I can remove the check_command from the host.cfg, but then the host will stay in a Pending status. Any suggestions? Thank you in advance for your time and assistance.

translate_passive_host_checks only has an effect if passive host check results are actually being sent. Are you sending them? I suspect whatever method you're using to send service check results upstream is only being used for service checks, and you may need to modify it to also send host check results.
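Editor's sketch of one way the remote node could forward host check results as well, using the obsessive-compulsive host processor (OCHP) settings. The script name submit_host_check_result is hypothetical; it stands for whatever wrapper passes the result to the central server (for example via send_nsca), mirroring the mechanism already used for service results. The remote node's full configuration is not shown in the thread, so its current values are unknown.

```cfg
# nagios.cfg on the remote (distributed) node
obsess_over_hosts=1
ochp_command=submit_host_check_result

# Object configuration on the remote node. The script is a placeholder
# for a wrapper that sends host name, state and output upstream.
define command{
    command_name    submit_host_check_result
    command_line    $USER1$/submit_host_check_result $HOSTNAME$ $HOSTSTATE$ '$HOSTOUTPUT$'
}
```

On the central server, the configuration posted at the start of this thread already has accept_passive_host_checks=1, so the remaining step would be making sure the hosts there accept passive checks and no longer depend on the blocked active ping.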
Re: [Nagios-users] NRPE/NSCA replacement thoughts?
Michael Medin wrote:

Hello

Since I am pondering a replacement for the NSCA and NRPE protocols, I thought I would get some thoughts from the community. So this is pretty much an open-floor kind of thing, to get some sense of what people actually need and would want (if anything at all). But to get some general idea, I'll give you a few questions to start it off:

Is a new protocol a good idea?
Should a new protocol be flat-text based or structured?
Would web services be the best way?
Should the protocol be extensible?
What features would a new protocol need to support? - message, performance data, configuration, multiple queries, control logic transfer, inventory, etc.
What platforms would it need to support?
What polling scheme(s): active, passive, active/passive, proxy, etc.?
Master/slave scenarios? In both NRPE and NSCA, Nagios is the master; should the client be allowed to act as master?
What kind of security mechanisms do you need (host, password, encryption, certificates, etc.)?
Client-side checks, or client-side data gathering with server-side checks? (I.e. check_nrpe gets OK back; another option would be to get the value and let the server decide if it is good or bad.)
Multiple streams? I.e. send to both Nagios and potentially other collectors (like rrd).

For what it's worth, I'm pretty happy with NSCA and NRPE as-is, though I'd be interested to hear your motivation for replacing them (especially the reasons for replacing them outright instead of extending the existing apps).
Re: [Nagios-users] NRPE/NSCA replacement thoughts?
On 2010-02-19 05:22, Morris, Patrick wrote:

Michael Medin wrote: Hello Multiple streams? I.e. send to both Nagios and potentially other collectors (like rrd).

For what it's worth, I'm pretty happy with NSCA and NRPE as-is, though I'd be interested to hear your motivation for replacing them (especially the reasons for replacing them outright instead of extending the existing apps).

Well, the main reason is that they have a number of limitations which I need to resolve, and after speaking with Ethan about it I got the impression he would not be updating NRPE/NSCA any more (for instance, Ton Voon has some patches to handle payload size which have not been applied). He would (or so I gathered) rather have new replacement client(s). Also, I tend to write programs in C++ and not C, which sort of means it is simpler for me to rewrite them.

// Michael Medin
Re: [Nagios-users] NRPE/NSCA replacement thoughts?
-Original Message- From: Morris, Patrick [mailto:patrick.mor...@hp.com] Sent: Thursday, February 18, 2010 8:23 PM To: Michael Medin Cc: nagios-users Subject: Re: [Nagios-users] NRPE/NSCA replacement thoughts? Michael Medin wrote: Hello Since I am pondering a replacement for the NSCA and NRPE protocol I thought I would get some thoughts from the community? So this is pretty much an open floor kind of thing to get some sense of what people actually need and would want (if anything at all). I have actually written a minimal replacement to resolve a shortcoming in my own situation. Actually, it is the standard NSCA protocol wrapped in SSL or (optionally) SSH But to get some general idea I'll give you a few questions to start it off: Is a new protocol a good idea? Maybe the answer to that question should come at the end instead of the beginning of the process? Generally, I believe that extending an existing protocol is usually a better idea than wholesale replacement, but sometimes one does have to clear-cut some junk. Should a new protocol be flat text based or structured? What is the design goal? I would advocate structured because of the flexibility, but if it means more bandwidth or using more processing power to parse the protocol, it may not be a great idea? Would webservices be the best way? Separate the structure of the protocol from the underlying transport mechanism. In my mind, a lack of that separation is actually the greatest weakness of the current protocol. Web services are an excellent choice of transport for many situations, and quite likely will be the the dominant one for the foreseeable future. Another potential transport is SSH, yet another could be RFC 1149/RFC 2549 avian carriers or whatever else somebody comes up with. 
Some advantages of Web services: Advantages: - built-in firewall compatibility - built-in encryption and authentication (via SSL) Drawbacks: - needs to be integrated with Web servers, potentially adding to complexity and performance issues. Should the protocol be extensible? Yes, within reason. One of the beauties of Nagios is in its simplicity, so if you add too much extensibility you might actually lose more than you gain. But on the other hand, some extensibility is important - otherwise, people will come up with their own extensions that don't really fit with the model. For instance, today's performance data is such an enhancement. What plattforms would it need to support? ASCII and Unicode. Other than that, is must be platform neutral, or you lose too much. Whats polling scheme(s): active, passive, active/passive, proxy, etc? Both have its place. Most people seem to love active polling, but firewalls may sometimes require passive polling. Master/slave scenarios? In both NRPE and NSCA nagios is the master should the client be allowed to act as master? Define master and slave in this context! If you are talking about the current model of multiple Nagios servers, it seems to me that this is more of a redesign of Nagios, rather than a protocol issue. One thing I would definitely like to see in this context is automatically adding services to Nagios when passive check results arrive. Keeping the list of services in sync between master and slave servers is one of the things that makes such a setup complicated. What kind of security mechanisms do you need (host, password, encryption, certificates, etc)? That should be left to the underlying transport. Why reinvent the wheel and have to keep chasing security holes if there are already plenty of good solutions available? Client side checks or client side data gathering with server side checks? (ie. check_nrpe get ok back, another option would be to get the value and let the server decide if it is good or bad.) 
Definitely client-side checks. Otherwise, you'd be looking at effectively re-inventing RPC. What if the value being checked is some huge binary blob, or the result of multiple interdependent system calls?

Multiple streams? ie send to both Nagios and potentially other collectors (like rrd)

No. Keep it simple, not a protocol to solve all the problems in the world. Nagios itself can forward to other collectors.

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs proactively, and fine-tune applications for parallel performance. See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
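As an illustration of the client-side checks favoured in the message above, the following Python sketch applies thresholds where the value is gathered and returns only a status code and a line of plugin output. The return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and the `text | label=value;warn;crit` performance-data layout follow the usual Nagios plugin conventions; the function name and the default thresholds are assumptions invented for this example.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_cpu(usage_percent, warn=80.0, crit=95.0):
    """Client-side check: thresholds are applied where the data is gathered.

    warn and crit are hypothetical defaults chosen for this sketch.
    Returns (status_code, plugin_output).
    """
    if usage_percent >= crit:
        status = CRITICAL
    elif usage_percent >= warn:
        status = WARNING
    else:
        status = OK
    # Text before the pipe is for humans; after it comes performance data
    # in the usual label=value;warn;crit form.
    output = "CPU %s - usage=%.2f%% | usage=%.2f%%;%.0f;%.0f" % (
        STATUS_NAMES[status], usage_percent, usage_percent, warn, crit)
    return status, output


status, output = evaluate_cpu(2.29)
```

With this split the server only ever receives a small status and text, however large or complicated the raw data on the client was. The server-side alternative would ship `usage_percent` itself and leave the comparison to the receiver.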
Re: [Nagios-users] NRPE/NSCA replacement thoughts?
-Original Message-
From: Michael Medin [mailto:mich...@medin.name]
Sent: Thursday, February 18, 2010 10:05 PM
To: Morris, Patrick; nagios-users
Subject: Re: [Nagios-users] NRPE/NSCA replacement thoughts?

On 2010-02-19 05:22, Morris, Patrick wrote:

For what it's worth, I'm pretty happy with NSCA and NRPE as-is, though I'd be interested to hear your motivation for replacing them (especially the reasons for replacing them outright instead of extending the existing apps).

Don't get me wrong - I like the idea of improvements to NRPE/NSCA, but I see a few issues with the motivation.

Well, the main reason is that they have a number of limitations which I need to resolve and after speaking with Ethan about it I got the impression he would not be updating NRPE/NSCA any more (for instance Ton Voon has some patches to handle payload size which have not been applied). He would (or so I gathered) rather have a new replacement client(s).

Client? Or protocol?

Also I tend to write programs in C++ and not C which sort of means it is simpler for me to re-write them.

That really isn't a good reason to throw out the investment thousands of people have made in a working NRPE/NSCA infrastructure! When the next developer comes into the project and likes Java, are we going to get yet another protocol? What if somebody wants to write a client for a new platform - does it have to be in C++?

Now don't get me wrong: I actually agree that there are good reasons to update or even replace the protocol. But I'm quite concerned about the motivation, and the end result that would come from it.
Re: [Nagios-users] NRPE/NSCA replacement thoughts?
On 2010-02-19 07:35, Kevin Keane wrote:

Well, the main reason is that they have a number of limitations which I need to resolve and after speaking with Ethan about it I got the impression he would not be updating NRPE/NSCA any more (for instance Ton Voon has some patches to handle payload size which have not been applied). He would (or so I gathered) rather have a new replacement client(s).

Client? Or protocol?

Protocol (NRPE and NSCA have fixed limits on data length; Ton extended the protocol with an additional packet type that carries more data).

Also I tend to write programs in C++ and not C which sort of means it is simpler for me to re-write them.

That really isn't a good reason to throw out the investment thousands of people have made in a working NRPE/NSCA infrastructure! When the next developer comes into the project and likes Java, are we going to get yet another protocol? What if somebody wants to write a client for a new platform - does it have to be in C++?

Uhmm... I am talking about a protocol here, so feel free to implement a client in brainf*ck if you like...

Now don't get me wrong: I actually agree that there are good reasons to update or even replace the protocol. But I'm quite concerned about the motivation, and the end result that would come from it.

Well, in this case my motivation is pure and simple self interest... I have no noble ideas about helping the Nagios community. I need a new protocol, I will write one... pure and simple... I just figured I would get some insight into what to think about whilst doing it...

// Michael Medin
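The fixed limit on data length mentioned in the message above can be illustrated with a small Python sketch. The layout here is hypothetical: the 64-byte buffer and the field order are made up for the example and are not the real NRPE or NSCA wire format. The length-prefixed variant is just one possible alternative, and it is not a description of Ton Voon's patch, which according to the message adds an extra packet type instead.

```python
import struct

FIXED_BUFFER = 64  # hypothetical limit for this sketch, not the real NRPE/NSCA size


def pack_fixed(result_code, text):
    """Fixed layout: 2-byte code plus a fixed buffer; longer output is cut off."""
    data = text.encode("utf-8")[:FIXED_BUFFER]
    # The "Ns" format pads short payloads with NUL bytes up to N.
    return struct.pack("!h%ds" % FIXED_BUFFER, result_code, data)


def unpack_fixed(packet):
    result_code, data = struct.unpack("!h%ds" % FIXED_BUFFER, packet)
    return result_code, data.rstrip(b"\x00").decode("utf-8", errors="replace")


def pack_prefixed(result_code, text):
    """Length-prefixed layout: 2-byte code, 4-byte length, then the full payload."""
    data = text.encode("utf-8")
    return struct.pack("!hI", result_code, len(data)) + data


def unpack_prefixed(packet):
    result_code, length = struct.unpack("!hI", packet[:6])
    return result_code, packet[6:6 + length].decode("utf-8")


long_output = "x" * 200
fixed_code, fixed_text = unpack_fixed(pack_fixed(2, long_output))
prefixed_code, prefixed_text = unpack_prefixed(pack_prefixed(2, long_output))
```

In the fixed layout every packet has the same size and anything past the buffer is silently lost. In the prefixed layout the receiver has to read the header first to learn how many bytes follow, but nothing is truncated.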
Re: [Nagios-users] NRPE/NSCA replacement thoughts?
On 2010-02-19 07:25, Kevin Keane wrote:

Is a new protocol a good idea?

Maybe the answer to that question should come at the end instead of the beginning of the process?

Well, if everyone thinks this is a doomed sinking ship there is no point to venture forth so for me this is the most important question actually :)

Generally, I believe that extending an existing protocol is usually a better idea than wholesale replacement, but sometimes one does have to clear-cut some junk.

The protocol part of NRPE and NSCA has far, far too many flaws to merit extending them.

Should a new protocol be flat text based or structured?

What is the design goal? I would advocate structured because of the flexibility, but if it means more bandwidth or using more processing power to parse the protocol, it may not be a great idea.

That's exactly the question: what's more interesting, speed, simplicity or flexibility? Nagios has survived on its simplicity but lately has tried to grow into something more advanced.

Should the protocol be extensible?

Yes, within reason. One of the beauties of Nagios is in its simplicity, so if you add too much extensibility you might actually lose more than you gain. But on the other hand, some extensibility is important - otherwise, people will come up with their own extensions that don't really fit with the model. For instance, today's performance data is such an enhancement.

Master/slave scenarios? In both NRPE and NSCA Nagios is the master; should the client be allowed to act as master?

Define master and slave in this context! If you are talking about the current model of multiple Nagios servers, it seems to me that this is more of a redesign of Nagios, rather than a protocol issue.

One pretty interesting idea I saw at the Nordic Nagios Meet last spring was a client (I don't recall the name now) that allowed you to define the checks and such on the client. This was then uploaded and incorporated into Nagios.
This means Nagios is no longer the master for configuration data; instead the clients have become masters.

What kind of security mechanisms do you need (host, password, encryption, certificates, etc)?

That should be left to the underlying transport. Why reinvent the wheel and have to keep chasing security holes if there are already plenty of good solutions available?

Client side checks or client side data gathering with server side checks? (ie. check_nrpe get ok back, another option would be to get the value and let the server decide if it is good or bad.)

Definitely client-side checks. Otherwise, you'd be looking at effectively re-inventing RPC. What if the value being checked is some huge binary blob, or the result of multiple interdependent system calls?

Multiple streams? ie send to both Nagios and potentially other collectors (like rrd)

No. Keep it simple, not a protocol to solve all the problems in the world. Nagios itself can forward to other collectors.

From what I have gathered this is pretty time and CPU consuming so another option would be to split off the data outside Nagios.

// Michael Medin
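One way to picture "splitting off the data outside Nagios" is a small dispatcher that sits in front of Nagios and hands each incoming result to every registered collector. The Python sketch below is purely illustrative: the `Dispatcher` class, the result fields and both sink functions are hypothetical stand-ins, not part of Nagios, NSCA or any RRD tool.

```python
class Dispatcher:
    """Receives each check result once and hands it to every registered sink."""

    def __init__(self):
        self.sinks = []

    def register(self, sink):
        self.sinks.append(sink)

    def dispatch(self, result):
        delivered = 0
        for sink in self.sinks:
            try:
                sink(result)
                delivered += 1
            except Exception:
                # One failing collector should not block the others.
                continue
        return delivered


nagios_queue = []
metrics_store = []


def to_nagios(result):
    # Stand-in for submitting a passive check result.
    nagios_queue.append((result["host"], result["service"], result["status"]))


def to_metrics(result):
    # Stand-in for a time-series collector such as an RRD writer.
    metrics_store.append((result["service"], result["value"]))


dispatcher = Dispatcher()
dispatcher.register(to_nagios)
dispatcher.register(to_metrics)
count = dispatcher.dispatch(
    {"host": "test-server", "service": "CPU", "status": 0, "value": 2.29})
```

This keeps the wire protocol single-stream, as argued above, while moving the forwarding work off the Nagios process itself.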