[Nagios-users] HOWTO I did on installing NRPE on AIX
For what it's worth, I've written a HOWTO on installing NRPE on AIX. http://www.hackmyidea.com/wordpress/2008/01/22/installing-nagios-nrpe-on-aix/ I am not an AIX administrator, so please feel free to correct any errata. (I'm assuming the best place for this stuff to go is in /opt? Any advice here would be greatly appreciated.) - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/___ Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null
[Nagios-users] Go to map button
Hi Guys, When a host is down or one of its services is down, where do I click to go directly to the host on the status map page? -- Alex Dehaini Developer Site - www.alexdehaini.com Email - [EMAIL PROTECTED]
Re: [Nagios-users] Go to map button
Hi, Under host commands there is a "Locate host on map" option, at least there is with the Nuvola style. The URI is: /nagios/cgi-bin/statusmap.cgi?host=hostname&layout=1&max_width=0&max_height=0&proximity_width=1000&proximity_height=800&layermode=exclude Replace hostname with the name of the host.
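A minimal sketch of building that link for an arbitrary host, assuming the parameters in the URI above are ordinary '&'-separated query parameters. The function name and the example host "web01" are made up for illustration; the base path and parameter values are the ones quoted in the message.

```python
from urllib.parse import urlencode


def statusmap_url(host, base="/nagios/cgi-bin/statusmap.cgi"):
    """Return the status map URI for one host (illustrative helper)."""
    # Parameter names and values copied from the URI quoted in the thread.
    params = {
        "host": host,
        "layout": 1,
        "max_width": 0,
        "max_height": 0,
        "proximity_width": 1000,
        "proximity_height": 800,
        "layermode": "exclude",
    }
    # urlencode joins the pairs with '&' and escapes unsafe characters
    # in the host name.
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    # "web01" is a hypothetical host name.
    print(statusmap_url("web01"))
```

Pasting the printed path after the web server's host name should open the map centred on that host, provided the CGI accepts the same parameters in your Nagios version.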
Re: [Nagios-users] Go to map button
That is the problem: I can't find the 'View Host on Map' button or the 'Go to Map' button. Alex On Jan 22, 2008 11:16 AM, Valdinger, Stephen (DOV, MSX) [EMAIL PROTECTED] wrote: Well you could use the view host on map button... I do believe there is one of those as I remember, albeit I can't remember if it was 2.x or 3.x. Stephen Valdinger MIS Helpdesk Coordinator P: 330.365.3622 C: 740.491.0958
Re: [Nagios-users] Go to map button
Giles, The Locate host on map option is not present under the Host command. Is there a way I can enable it? Alex
Re: [Nagios-users] Go to map button
Alex, I am unsure why it is not there as an option for you. What version of Nagios are you running? (It's there in my 2.9 installation.) Did you compile from source or install via RPM? When you originally compiled, did you have issues with the required libraries for the statusmap? If so, you may need to rebuild extinfo.cgi. Thanks Giles
Re: [Nagios-users] Go to map button
Giles, I migrated to a new installation. I was running 2.9 and it was there. I upgraded to 2.10 and I can't see it anymore. I compiled from source. Initially I had an issue with my gd library, so I installed gd and recompiled. The status map is working all right, but I can't get the Locate host on map button. How do I rebuild extinfo.cgi? Alex -- Alex Dehaini Developer Site - www.alexdehaini.com Email - [EMAIL PROTECTED]
Re: [Nagios-users] random character appended to NRPE output
Hi, I also have this, and think this is a bug. I already reported it to the dev list, but don’t know if anyone saw my mail. Still, since I suspected this had to do with string termination, I added the following at the end of all my test scripts: /usr/bin/printf \0 I did this because I think strings end with a zero char in C, so I thought it would be worth trying to print one manually. This apparently solves the problem of appended chars, but it is just a workaround, I admit… Cheers From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nathan Blackham Sent: Monday, January 21, 2008 11:54 PM To: nagios-users@lists.sourceforge.net Subject: [Nagios-users] random character appended to NRPE output I was wondering if anyone has run into this before and knows the solution. When I run nrpe I get a bunch of extra characters on the end of the output, as seen here: /usr/lib64/nagios/plugins/check_nrpe -H host -c check_users USERS OK - 2 users currently logged in |users=2;5;10;0 �����r��r� I am running the latest version of nrpe (2.11). It seems like an array isn't ending properly. Is there anything else that might be helpful, like a debug log? Thanks a lot, Nathan
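The general failure mode both posters suspect can be sketched as follows. This is only an illustration of an unterminated string in a reused fixed-size buffer, not NRPE's actual code; the buffer size, the '#' filler and the function names are assumptions made for the demo.

```python
def write_output(buffer: bytearray, output: bytes, terminate: bool) -> None:
    """Copy plugin output into a reused fixed-size buffer."""
    buffer[:len(output)] = output
    if terminate:
        # A C string ends at the first zero byte.
        buffer[len(output)] = 0


def read_c_string(buffer: bytes) -> bytes:
    """Read the way C string functions do: everything up to the first zero byte."""
    end = buffer.find(b"\0")
    return bytes(buffer) if end == -1 else bytes(buffer[:end])


def demo(terminate: bool) -> bytes:
    # 31 bytes of stale data plus a final zero byte stand in for
    # whatever was left in memory from earlier use.
    buffer = bytearray(b"#" * 31 + b"\0")
    write_output(buffer, b"USERS OK", terminate)
    return read_c_string(buffer)


if __name__ == "__main__":
    print(demo(terminate=False))  # output followed by stale bytes
    print(demo(terminate=True))   # output only
```

Printing a zero byte at the end of a plugin script, as in the workaround above, plays the role of terminate=True here.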
Re: [Nagios-users] Go to map button
Alex, I am not sure, but why not rename the extinfo.cgi file in your SOURCE tree and re-run 'make' to recompile it from the source (extinfo.c)? I'm assuming you still have the source and have not done a 'make clean', 'make distclean', etc., all of which is probably beyond the scope of the Nagios list. Thanks Giles
[Nagios-users] Nuvola for Nagios 3
Hey guys! I was looking for images to use on my Web interface (CGI), and I found the Nuvola style on NagiosExchange.org. I didn't see a reference to Nagios 3. Can I use it with Nagios 3? If not, do you know of another style like Nuvola that is available for Nagios 3? Regards, -- Ronaldo A. Bueno Filho
Re: [Nagios-users] Problem with some NSCA packets getting corrupted on 64-bit SLES 10
Brian, You beat me to the punch. After a few days of trying to figure out the pattern, I found this was only happening when the distributed nodes were trying to do host checks. Further investigation revealed that we were using 'fping' to check host reachability, which included a ',' in its output. The send shell script I was using at the time passed -d ',' to send_nsca to use as a delimiter. So while the actual host check was sending only 3 fields to the send_service_check script, the arguments to send_nsca were causing it to be broken into 4 fields, so the NSCA daemon assumed it was a service check. Not that it matters which I use, but I switched over to Ethan's script. I guess when no arguments are passed to send_nsca, it breaks on a tab as a delimiter. Anyway, that part of my migration has been working fine. Glad to see the whole 64-bit business was a red herring in my setup (whew!). Thanks for your help. Mark -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Brian A. Seklecki Sent: Saturday, January 19, 2008 12:32 PM To: Frost, Mark {PBG} Cc: nagios-users@lists.sourceforge.net Subject: Re: [Nagios-users] Problem with some NSCA packets getting corrupted on 64-bit SLES 10 MF: Show us your ocsp_command and ochp_command mappings. Are you calling a piped command from checkcommands.cfg or calling an external shell script? I guarantee you the comma (,) in results is being mapped into a field delimiter, which confuses nscad(8). ~~BAS On Thu, 2008-01-17 at 10:37 -0500, Frost, Mark {PBG} wrote: I've recently begun an effort to move our Nagios installation to a distributed architecture from a centralized one. I had previously used NSCA only for a very few passive checks and it works fine on a 32-bit Red Hat AS 3 platform (the centralized server). In testing on a distributed architecture (which is 64-bit Suse Linux Enterprise Server (SLES) 10), I seem to have a problem with NSCA.
(Note that all Nagios and NSCA binaries and libraries were recompiled on the 64-bit platform.) After I broke out all the checks to have 2 separate distributed nodes send to a central server, I saw a few messages like this one in the nagios.log file: [1200583727] Warning: Passive check result was received for service '0' on host 'HOSTXXX', but the service could not be found! Only about 1 out of every 10 or 20 results was doing this. That is, the rest of the results were being correctly shown as EXTERNAL COMMAND and all expected NSCA fields came up correctly (hostname, service desc, check result, text output). I started having the send_nsca script on the distributed nodes log what it was sending to a file. When I correlate what the nodes are sending with what the NSCA daemon thinks it's receiving, the client is still sending the correct 4 fields, but it's as if the NSCA daemon is dropping the 2nd field (service desc) and replacing it with the check result field. So ultimately, it thinks the service name is '0'. I can't see that this matches a pattern (i.e. always on the same hosts or same service checks). All I've seen so far is that it happens whether I run NSCA as --single or --daemon. It also happens even if I turn off one of the distributed nodes (that is, I can't see it being volume related). I have turned on debugging in the NSCA daemon to see what it thinks it's getting and it echoes what the nagios.log shows: SERVICE CHECK - Host Name: 'HOSTXXX', Service Description: '0', Return Code: '0', Output: ' rta=0.14 ms)' Again, maybe only 1 out of 10. Ultimately, this causes the server to run an active check, as it thinks it never got a result from the distributed node. I'm still trying to dig deeper, but it seems to me that this is increasingly pointing to some issue with 64-bit SLES. Or perhaps some variable type in the NSCA daemon that's not quite right for 64-bit.
It's hard to tell with its intermittent nature and the fact that I have yet to discover a pattern. Has anyone seen anything like this before? Thanks Mark
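The field-splitting problem described in this thread can be sketched as follows. This is a simplified illustration, not send_nsca's parser: the fping-style output text is made up, and the rule "3 fields means host check, 4 fields means service check" is taken from Mark's description.

```python
def classify(line: str, delimiter: str) -> str:
    """Split a submitted result and classify it by its field count."""
    fields = line.split(delimiter)
    if len(fields) == 3:
        return "host check"
    if len(fields) == 4:
        return "service check"
    return "invalid"


# Hypothetical host check result: host name, return code, plugin output.
# The plugin output itself contains a comma.
FIELDS = ["HOSTXXX", "0", "HOSTXXX is alive (loss=0%, rta=0.14 ms)"]

if __name__ == "__main__":
    # Joined and split on ',' the output is cut in two, giving 4 fields,
    # with ' rta=0.14 ms)' left over as the apparent plugin output.
    print(classify(",".join(FIELDS), ","))
    # Joined and split on a tab the three fields survive intact.
    print(classify("\t".join(FIELDS), "\t"))
```

Any delimiter that can also appear in plugin output carries the same risk, which may be why a tab is the safer choice.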
[Nagios-users] Problem with high latencies after going distributed
As I'd mentioned in a previous message, I'm in the process of converting from a centralized Nagios 2.10 setup running on a single host to a distributed setup running on at least 3 hosts (3 to start, anyway). The centralized setup has 572 hosts and 2900 services, 99.9% of which are active checks. My approach when going passive was to group the checks so that some ran on the first distributed node and some ran on the second node. The central server will do freshness checking and run an active check if it fails to get a check back from a distributed node after 20 minutes (virtually all checks run at 15 minute intervals or less).
Our old centralized server reports the following via nagiostats (min/max/avg):
Active Service Latency: 0.000 / 9.129 / 0.833 sec
Active Service Execution Time: 0.037 / 10.045 / 0.227 sec
I started noticing a fair number of checks going stale on the new reporting server, and that server would then run those service checks actively. I could see no reason for this. When I had a look at the distributed nodes, I saw:
Distributed Node 1 (min/max/avg)
Active Service Latency: 0.000 / 7267.198 / 4241.019 sec
Active Service Execution Time: 0.000 / 60.014 / 0.651 sec
Distributed Node 2 (min/max/avg)
Active Service Latency: 0.000 / 11475.901 / 6393.641 sec
Active Service Execution Time: 0.000 / 60.018 / 0.593 sec
Wow. I reviewed the performance doc for Nagios 2.x yet again and I'm not finding anything there that I'm not doing that would affect latencies this much. These boxes are dedicated to Nagios, so there's no other application competing for resources. They're on the same subnet. I run a few perl checks, but that would be a very small percentage of my checks. The distributed nodes are newer and have more resources (faster CPU, at least as much memory) than the old standalone box. The only thing I can think of that could be unusual is that both distributed nodes know about all hosts and services.
I have created a configuration whereby hosts/services that are not to be checked by node 1 are given a template that looks like:
define service {
    name                     nagios-dist-check-service
    freshness_threshold      1200
    active_checks_enabled    1
    check_freshness          0
    check_period             24x7
    event_handler_enabled    0
    flap_detection_enabled   0
    notifications_enabled    0
    obsess_over_service      1
    passive_checks_enabled   0
    process_perf_data        0
    register                 0
}
define service {
    name                     nagios-dist-nocheck-service
    freshness_threshold      1200
    active_checks_enabled    0
    check_freshness          0
    check_period             none
    event_handler_enabled    0
    flap_detection_enabled   0
    notifications_enabled    0
    obsess_over_service      0
    passive_checks_enabled   0
    process_perf_data        0
    register                 0
}
So services on node1 that are supposed to be run get the nagios-dist-check-service template and those that should not get the nagios-dist-nocheck-service template. Is there something about Nagios that I don't understand that would cause a lot of disabled service checks to shoot latencies way up? Is something else going on here? Here's my output of nagios -s on one of the nodes (both yield similar output and are configured similarly):
Nagios 2.10 Copyright (c) 1999-2007 Ethan Galstad (http://www.nagios.org) Last Modified: 10-21-2007 License: GPL
Projected scheduling information for host and service checks is listed below. This information assumes that you are going to start running Nagios with your current config files.
HOST SCHEDULING INFORMATION
---
Total hosts: 569
Total scheduled hosts: 0
Host inter-check delay method: SMART
Average host check interval: 0.00 sec
Host inter-check delay: 0.00 sec
Max host check spread: 30 min
First scheduled check: N/A
Last scheduled check: N/A
SERVICE SCHEDULING INFORMATION
---
Total services: 2917
Total scheduled services: 1122
Service inter-check delay method: SMART
Average service check interval: 385.13 sec
Inter-check delay: 0.34 sec
Interleave factor method: SMART
Average services per host: 5.13
Service interleave factor: 2
Max service check spread: 30 min
First scheduled check: Tue Jan 22 11:35:47 2008
Last scheduled check:
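Two of the figures in the scheduling output above can be reproduced by simple division. A minimal sketch, assuming the SMART method spreads the scheduled checks evenly over the average check interval; the function names are illustrative, not Nagios internals, and the inputs are the numbers from the output.

```python
def inter_check_delay(avg_check_interval_sec: float, scheduled_services: int) -> float:
    """Seconds between consecutive scheduled service checks."""
    return avg_check_interval_sec / scheduled_services


def services_per_host(total_services: int, total_hosts: int) -> float:
    """Average number of services defined per host."""
    return total_services / total_hosts


if __name__ == "__main__":
    # 385.13 sec average interval, 1122 scheduled services
    print(round(inter_check_delay(385.13, 1122), 2))
    # 2917 services over 569 hosts
    print(round(services_per_host(2917, 569), 2))
```

Under that assumption the scheduler plans to start a check roughly every third of a second, so a latency measured in thousands of seconds suggests checks are not being started (or reaped) at anything like the planned rate.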
Re: [Nagios-users] HOWTO I did on installing NRPE on AIX
A good place to link the actual docs would be in the Nagios Community wiki. Just link to your page, that way everyone can see the documentation if they are searching for it on the wiki. http://www.nagioscommunity.org/wiki/index.php/Howtos:nrpe_nsca Regards, Max
[Nagios-users] nagios.cmd over nfs
Hi, I've migrated the nagios web interface onto a different physical server, nfs mounting the nagios directory from the actual nagios server. Only snag I'm running into right now is trying to access the nagios.cmd pipe over nfs. When trying to schedule downtime, disable notifications etc... the web interface just spins. I see cmd.cgi is being executed, but nothing happens. Both servers have the same uid/gids for nagios, webserver is even running as nagios user. I know a 'work-around' is to ssh+keysnagios.cmd pipe, but this isn't an option. Any ideas? Kyle
[Nagios-users] webservices check using check_http
Hi all! I'm having a problem checking some web services here with the check_http command. From the command line (shell) I can run the command with this syntax perfectly, and when I test the return value everything is OK. The command line I run in the shell is: ./usr/local/nagios/libexec/check_http -H www4.somehost.com.br -u /monitor/billing.asmx/isFunctional?idCarrier=6 -s '<boolean xmlns="http://www.somehost.com.br/">true</boolean>' My nightmare starts when I put the line in services.cfg. It seems that Nagios doesn't understand what I put in services.cfg (or I'm giving it the wrong syntax). The check_command line in services.cfg is: check_http!www4.somehost.com.br!/monitor/billingvpn.asmx/isFunctional?idCarrier=6!'<boolean xmlns="http://www.somehost.com.br/">true</boolean>' The check_http line in commands.cfg is: check_http -H $HOSTADDRESS$ $ARG1$ $ARG2$ I've already tried putting -u and -s before $ARG1$ and $ARG2$, but nothing changed. I've also tried putting just -s before $ARG2$, and nothing changed. www4.somehost.com.br is the virtual host (-H option), /monitor/billingvpn.asmx/isFunctional?idCarrier=6 is the URL (-u option), and '<boolean xmlns="http://www.somehost.com.br/">true</boolean>' is the expected string (-s option). The shell command above works perfectly. If I change anything after -s, the expected string changes and the service status changes to CRITICAL, indicating that the command works normally. But the Nagios daemon doesn't show the same result, and I don't know what's happening. If someone has a hint to give me, I'd appreciate it. Regards, Fabiano Martins
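How a check_command line maps onto the command definition can be sketched as below. This is a simplified model of the macro substitution, assuming '!' separates the arguments and each one replaces the matching $ARGn$; it ignores escaping and comment handling in the real config parser, and the address 192.0.2.10 is a made-up example.

```python
def expand(command_line: str, check_command: str, host_address: str) -> str:
    """Substitute $HOSTADDRESS$ and $ARGn$ macros into a command line."""
    # The first '!'-separated token is the command name; the rest are arguments.
    _name, *args = check_command.split("!")
    expanded = command_line.replace("$HOSTADDRESS$", host_address)
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    return expanded


CHECK = ("check_http!www4.somehost.com.br"
         "!/monitor/billingvpn.asmx/isFunctional?idCarrier=6"
         "!'expected string'")  # placeholder for the -s argument

if __name__ == "__main__":
    # Definition from the message: the third argument is never referenced,
    # and the first two arrive without -u or -s in front of them.
    print(expand("check_http -H $HOSTADDRESS$ $ARG1$ $ARG2$", CHECK, "192.0.2.10"))
    # A definition that names one option per argument.
    print(expand("check_http -H $ARG1$ -u $ARG2$ -s $ARG3$", CHECK, "192.0.2.10"))
```

Under this model the first definition never passes the expected string to the plugin at all, which would be consistent with the service staying OK whatever string is configured.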
Re: [Nagios-users] nagios.cmd over nfs
Is NFS rpc.lockd running? Try mount_nfs(8) with '-L': "-L Do not forward fcntl(2) locks over the wire. All locks will be local and not seen by the server and likewise not seen by other NFS clients. This removes the need to run the rpcbind(8) service and the rpc.statd(8) and rpc.lockd(8) servers on the client. Note that this option will only be honored when performing the initial mount, it will be silently ignored if used while updating the mount options." ~BAS -- Brian A. Seklecki [EMAIL PROTECTED] Collaborative Fusion, Inc.
Re: [Nagios-users] nagios.cmd over nfs
AFAIK, you cannot use a FIFO (pipe) over NFS. The pipe refers not to a file but to a kernel IPC method, so a FIFO created on the NFS server is known to the NFS server and not to the client system. Sean McAvoy NOC Team Lead Afilias Canada P. 416.673.4194
Re: [Nagios-users] Strange think
Hi all!

I've tried what Hugo posted. I got it working using the following line in commands.cfg:

    check_http -H $ARG1$ -u $ARG2$ -s $ARG3$

Only with that syntax could I get the web services check to work correctly!

Thanks for all the messages!

Regards,
Fabiano

On Jan 21, 2008 6:33 PM, Hugo van der Kooij [EMAIL PROTECTED] wrote:
> Fabiano Martins wrote:
> | Hi all!!
> |
> | I have a strange issue here with my Nagios. Let me try to explain...
> |
> | I have some web services to check, and when I run the command from
> | the console to test two different states of these services (UP or
> | DOWN) I receive the correct answer from the remote application. Here
> | is the command:
> |
> | ./usr/local/nagios/libexec/check_http -H www.somehost.com.br -u /heartbeat.asmx/Web -s '<boolean xmlns="http://www.somehost.com.br/">true</boolean>'
> |
> | As the check_http help says, I have to use check_http -H vhost, -u
> | with the URL path that I want to check on the specified host, and -s
> | with the string that I expect to be returned to me.
> |
> | Ok!
> |
> | I run this command from the console and everything goes fine. To
> | test, I changed the expected string from true to false, and the
> | answer received after running the command from the console changes
> | to DOWN, string not found.
> |
> | The problem is that when I put it into services.cfg for Nagios to
> | run the command, Nagios does not act on the return value. If the
> | service is down, Nagios keeps showing me that the service is OK.
> |
> | In the commands.cfg file, I changed the check_http command line to
> | add one more argument that I need.
> |
> | The default is check_http -H $HOSTADDRESS$ $ARG1$.
> |
> | I've changed it to check_http -H $HOSTADDRESS$ $ARG1$ $ARG2$
> |
> | In the services.cfg file I put:
> |
> | check_http!www.somehost.com.br -u /heartbeat.asmx/Web!'<boolean xmlns="http://www.somehost.com.br/">true</boolean>'
> |
> | I think that Nagios doesn't recognize the syntax.
>
> It seems you are mixing things in an odd manner. I would recommend
> defining the command with:
>
> define command{
>         command_name    check_http_reply
>         command_line    $USER1$/check_http -I $HOSTADDRESS$ -H $ARG1$ -u $ARG2$ -s $ARG3$
>         }
>
> Then define the service with:
>
> check_http_reply!www.somehost.com.br!/heartbeat.asmx/Web!'<boolean xmlns="http://www.somehost.com.br/">true</boolean>'
>
> I would never try to mix the arguments the way you did.
>
> Hugo.
>
> --
> [EMAIL PROTECTED] http://hugo.vanderkooij.org/
> PGP/GPG? Use: http://hugo.vanderkooij.org/0x58F19981.asc
>
> A: Yes.
> Q: Are you sure?
> A: Because it reverses the logical flow of conversation.
> Q: Why is top posting frowned upon?
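The difference between the two service definitions in this thread is easier to see when the `!`-separated `check_command` is expanded by hand. The following is a minimal sketch of that substitution, assuming a simplified model in which the text before the first `!` is the command name and each later field fills `$ARG1$`, `$ARG2$`, and so on. The function name `expand_check_command`, the address `192.0.2.10`, and the shortened expected string `'true'` are illustrative assumptions; this is not Nagios's own code.

```python
def expand_check_command(check_command, command_line, host_address, user1):
    """Split check_command on '!' and substitute macros (simplified model)."""
    name, *args = check_command.split("!")
    expanded = command_line.replace("$HOSTADDRESS$", host_address)
    expanded = expanded.replace("$USER1$", user1)
    # $ARG1$ is the first field after the command name, $ARG2$ the next, etc.
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    return name, expanded


# Hugo's layout: one macro per option value.
_, separate = expand_check_command(
    "check_http_reply!www.somehost.com.br!/heartbeat.asmx/Web!'true'",
    "$USER1$/check_http -I $HOSTADDRESS$ -H $ARG1$ -u $ARG2$ -s $ARG3$",
    host_address="192.0.2.10",            # hypothetical address
    user1="/usr/local/nagios/libexec",
)

# The original layout from the thread: options mixed into the arguments.
_, mixed = expand_check_command(
    "check_http!www.somehost.com.br -u /heartbeat.asmx/Web!'true'",
    "check_http -H $HOSTADDRESS$ $ARG1$ $ARG2$",
    host_address="192.0.2.10",
    user1="/usr/local/nagios/libexec",
)

print(separate)
print(mixed)
```

Under this model the mixed form expands without any `-s` option, so the expected string would never be compared. That would be consistent with the service always showing OK, assuming the archive reproduced the original definitions faithfully.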
Re: [Nagios-users] nagios.cmd over nfs
I found a neat but ugly work-around. After doing some reading, it appears that the FIFO/pipe needs to be open for reading and writing at the same time. If I leave 'tail -f nagios.cmd' running on the remote site that NFS-mounts the pipe, data is processed!

On 1/22/08, Sean McAvoy [EMAIL PROTECTED] wrote:
> AFAIK, You cannot use a FIFO (pipe) over NFS. The pipe refers to not a
> file but a kernel IPC method. So a FIFO created on the NFS server is
> known to the nfs server and not the client system.
Re: [Nagios-users] nagios.cmd over nfs
... not so much.

It solves the web page spinning, but since Nagios never picks up the data, nothing happens.

Back to the drawing board.

On 1/22/08, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> I found a neat but ugly work-around. After doing some reading, it
> appears as though the fifo/pipe needs to be opened for reading and
> writing at the same time. If I leave 'tail -f nagios.cmd' running on
> the remote site nfs mounting the pipe, data is processed!
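The behaviour reported above matches ordinary FIFO semantics on a single machine. Below is a minimal local sketch, assuming a POSIX system; it does not involve NFS at all, and the file name `demo.cmd` and helper `try_open_for_write` are illustrative. It shows that a writer cannot open a FIFO until some process on the same kernel has it open for reading, and that once a local reader exists, that reader is the one that receives the data.

```python
import errno
import os
import tempfile


def try_open_for_write(fifo_path):
    """Return a write fd, or None if no process has the FIFO open for reading."""
    try:
        return os.open(fifo_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        if exc.errno == errno.ENXIO:   # no reader on this kernel
            return None
        raise


workdir = tempfile.mkdtemp()
fifo_path = os.path.join(workdir, "demo.cmd")   # hypothetical stand-in for nagios.cmd
os.mkfifo(fifo_path)

# 1. Nobody is reading yet: a non-blocking open fails (a blocking one would wait).
no_reader = try_open_for_write(fifo_path)

# 2. A local reader appears (this plays the role of 'tail -f' on the web server).
reader_fd = os.open(fifo_path, os.O_RDONLY | os.O_NONBLOCK)
writer_fd = try_open_for_write(fifo_path)
os.write(writer_fd, b"[0] DISABLE_NOTIFICATIONS\n")

# 3. The data ends up with the local reader, not with any process on another host.
received = os.read(reader_fd, 4096)

os.close(writer_fd)
os.close(reader_fd)
os.remove(fifo_path)
os.rmdir(workdir)
```

A blocking open that waits for a reader would be consistent with the web interface spinning. A local `tail -f` consuming the bytes would be consistent with Nagios on the other host never seeing them.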
[Nagios-users] check_commands.cfg
check_commands.cfg was not installed. Where do I get it from? I could copy it from another machine, but I'd like to find out where it came from.
[Nagios-users] Marco Wiese/BEIT GmbH is out of the office.
I will be out of the office from 21.01.2008 through 25.01.2008. Your e-mail will not be forwarded, but I will answer your message as soon as possible after my return.
Re: [Nagios-users] NDO Troubles
Hi all,

I'm using Nagios 3.10 and NDO in the same version, and I have the same problem: the ndo2db log says it is connected to MySQL, but the NDO daemon reports an error writing to the data sink.

On Fri, 2008-01-18 at 14:34 -0500, [EMAIL PROTECTED] wrote:
> List,
>
> Today our NDOUtils broker...broke. I'm not entirely sure what
> happened, but I can tell you what is going on and what I've done...
>
> Nagios 2.10, 1.4.10 plugins on Ubuntu 7.10
>
> This has been fine for about 3 months now. Today however, we started
> seeing some issues with the daemon. In the event log of the nagios web
> interface it shows successful connections, as do the nagios/mysql
> logs. Only problem is that in the Nagvis interface it is showing that
> NDO is reporting Nagios as closed, or no updates in more than 100
> seconds. I'm leaning more towards a broken Nagvis at the moment.
>
> I am the only person touching the Nagios box and nothing has changed.
> I did try to add Bacula to this box, but have since removed all
> related packages that I installed.
>
> Has anyone seen/experienced a similar issue with NDO? It's really
> frustrating, as we have an IT auditor here today, and this is a big
> part of our audit this year!!!
>
> Stephen
Re: [Nagios-users] NDO Troubles
So you just delete the database and the NDO daemon goes back to work? What if you use PostgreSQL, is it the same problem?

(If I don't solve my problem with NDO tomorrow, my boss will drop Nagios at the company. :-( )

On Mon, 2008-01-21 at 09:33 +0100, Giles Coochey wrote:
> I get this problem about once a week. In order to resolve it, I stop
> Nagios, stop NDO, restart mysql, drop the Nagios database, recreate
> the Nagios database, start Nagios and start NDO.
>
> Once done everything works again. If I don't restart mysql I cannot
> drop the Nagios database within the mysql client and sometimes I need
> to expressly kill (-9) both Nagios and Mysql.
Re: [Nagios-users] Problem with high latencies after going distributed
> As I'd mentioned in a previous message, I'm in the process of
> converting from a centralized Nagios 2.10 setup all running on a
> single host to a distributed setup running on at least 3 hosts (3 to
> start anyway). The centralized setup has 572 hosts and 2900 services,
> 99.9% of which are active checks.
> ...
> Active Service Latency: 0.000 / 7267.198 / 4241.019 sec

This isn't much help, but...

We've just done exactly the same (Nagios 2.9), and we have a system of comparable size (actually a bit larger: 713 hosts, 5834 services). After going distributed, we too have this insanely high latency on the satellites.

The only possible cause is the OCSP command slowing things down somehow. This uses the supplied send_nsca call to send the status off to the central server:

define command {
        command_name    relay
        command_line    $USER1$/submit_check_result $HOSTNAME$ $SERVICEDESC$ $SERVICESTATEID$ $SERVICEOUTPUT$
}

So it should work. I guess things would be better if it packaged the updates up into batches, although it can't do that normally. I think it might be better to make the OCSP command just dump the status to a file, and then have a cron job every 60 seconds that reads the file and sends the statuses off as a batch. I will try this here when I get the chance.

Steve
Re: [Nagios-users] Problem with high latencies after going distributed
-----Original Message-----
From: Steve Shipway [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 22, 2008 8:45 PM
To: Frost, Mark {PBG}; Nagios Users
Subject: RE: [Nagios-users] Problem with high latencies after going distributed

> The only possible cause is the OCSP command slowing things down
> somehow. This is using the supplied send_nsca call to send the status
> off to the central server...
> ...
> I think it might be better to make the OCSP command just dump the
> status to a file, and then have a cronjob every 60 seconds that reads
> the file and sends the statuses off as a batch. I will try this here,
> when I get the chance.
>
> Steve

But if submit_check_result is running slowly, that would only affect the service execution time, wouldn't it? My understanding of check latency is that it's the difference between the time when Nagios schedules a check to run and the time when the check actually starts to execute.

But maybe I'm misunderstanding something here. When it comes to working with Nagios, I tend to learn the most when I have the biggest problems :-).

Do you do the same thing I mentioned, where you define all the checks on both distributed nodes but disable checks on complementary halves of those checks on each node?

Thanks

Mark
Re: [Nagios-users] Problem with high latencies after going distributed
> Active Service Latency: 0.000 / 7267.198 / ...
> The only possible cause is the OCSP command slowing things down
> somehow.
> ...
> But if the submit_check_result is running slowly, that would only
> affect the service execution time wouldn't it? My understanding of
> check latency is that it's the difference in time between when Nagios
> schedules a check to run versus the time that the check actually
> starts to execute.

If the scheduler gets behind, then the latency increases, because it runs the service checks in scheduler order. It is possible that the OCSP handler is run SERIALLY with service checks (as the host checks are done in 1.x) and is therefore holding up service checks, just like you'd see if you had a lot of down hosts and a long-running host check command.

> But maybe I'm misunderstanding something here. When it comes to
> working with Nagios, I tend to learn the most when I have the biggest
> problems

Don't we all :-/. The latency effect of non-parallel host checks was a nasty surprise to me.

> Do you do the same thing I mentioned where you define all the checks
> on both distributed nodes, but disable checks on complimentary halves
> of those checks on each node?

Yes. However, I can't always set the freshness checking because some of our checks run every 4 hours, although most are at a sub-15-minute interval. We have a complex configuration tool that builds our whole distributed Nagios/MRTG configuration set from templates, so I can't hand-hack the config files either.

I have now set up one of our distributed nodes to batch the NSCA messages, and will see if the latency increases overnight (so far, it looks good). To do this, I just changed submit_check_result to only append to a file, then added an every-minute cron job for the nagios user to cat the contents of this file into send_nsca (actually, there are a few more steps to ensure data integrity and checks, but that's basically it).

The upshot is that some checks may be delayed by up to a minute, and we're dependent on cron, but the OCSP command exits very fast.

Let me know if you want a copy of the two scripts I used to achieve this.

Steve
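The append-then-batch approach described above can be sketched roughly as follows. This is not Steve's pair of scripts, which are not shown in the thread. It is a minimal illustration that assumes one tab-separated line per result (host, service, state, output) and takes the sender command as a parameter. The names `queue_result`, `flush_spool`, and the `.sending` suffix are hypothetical, and the extra integrity steps he mentions are deliberately left out.

```python
import os
import subprocess


def queue_result(spool_path, host, service, state_id, output):
    """Fast path for the OCSP command: append one result line and return."""
    line = "\t".join([host, service, str(state_id), output]) + "\n"
    with open(spool_path, "a") as spool:
        spool.write(line)


def flush_spool(spool_path, sender_cmd):
    """Cron path: take the current spool file and pipe it to the sender.

    Returns the number of result lines handed over. No locking is done, so a
    result appended between the read and the remove could be lost; a real
    deployment would need to guard against that.
    """
    batch_path = spool_path + ".sending"
    try:
        os.rename(spool_path, batch_path)   # new results go to a fresh file
    except FileNotFoundError:
        return 0                            # nothing queued since the last run
    with open(batch_path, "rb") as batch:
        data = batch.read()
    subprocess.run(sender_cmd, input=data, check=True)
    os.remove(batch_path)
    return data.count(b"\n")
```

On a satellite, the cron job might call `flush_spool(spool, ["send_nsca", "-H", "central.example.org", "-c", "/etc/send_nsca.cfg"])`, where the host name and both paths are placeholders. The point of the design is that the command Nagios runs for every result only performs a file append, while the slower network send happens once per minute outside Nagios.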
Re: [Nagios-users] Problem with high latencies after going distributed
On 22/01/08 09:13 PM, Frost, Mark {PBG} wrote:
> But if the submit_check_result is running slowly, that would only
> affect the service execution time wouldn't it? My understanding of
> check latency is that it's the difference in time between when Nagios
> schedules a check to run versus the time that the check actually
> starts to execute.

You're right, but you're just missing one detail. Nagios runs checks in parallel and then reaps all the service results at once. While it's reaping it can't schedule other checks, and it is in the reaping state that Nagios runs host checks, event handlers, performance data commands and oc[hs]p commands. All this is done serially and can significantly slow down each service reaping run and thus delay the execution of further checks.

Although I never built a distributed system, I designed mine to be easily distributed. Moreover, I used a technique I developed for latency-free performance-data processing (that I still heavily use, BTW) to create a way to distribute check results to a central server in the same latency-free way (it was more of a fun project, as I don't use it myself yet).

Basically you use the host/service performance data files to get the data, but instead of writing to a file you write to a named pipe (FIFO). That pipe is then read by a high-performance, non-blocking, event-based Perl daemon (yeah, I know that looks like marketing terms, but I can explain each of them further if you like) that forks send_nsca processes to send results in bulk (normally every few seconds, though). So Nagios doesn't even lose time rotating a file, and all your checks are transmitted almost instantly.

See this wiki page for details and code:
http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon

Thomas
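The named-pipe idea can be illustrated with a much simpler reader than the Perl daemon on the wiki. The sketch below is an assumption-laden simplification: it blocks on the FIFO, groups lines by count rather than by time, and returns once every writer has closed the pipe, whereas the daemon described above is non-blocking and event-based. The names `drain_fifo` and `handle_batch` are illustrative.

```python
def drain_fifo(fifo_path, handle_batch, batch_size=50):
    """Read result lines from a FIFO and hand them over in groups.

    Blocks until a writer opens the FIFO, then returns when all writers
    have closed it. handle_batch receives a list of lines without the
    trailing newline; in a real setup it might pipe them to send_nsca.
    """
    batch = []
    with open(fifo_path, "r") as fifo:
        for line in fifo:
            batch.append(line.rstrip("\n"))
            if len(batch) >= batch_size:
                handle_batch(batch)
                batch = []
    if batch:
        handle_batch(batch)   # flush the final, partially filled group
```

A long-running daemon would wrap this in a loop that reopens the FIFO after each end-of-file and would flush on a timer as well, so that a slow trickle of results is not held back waiting for a full batch.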
Re: [Nagios-users] nagios.cmd over nfs
On 22/01/08 03:38 PM, [EMAIL PROTECTED] wrote:
> ... not so much
>
> It solves the web page spinning, but since nagios never picks up the
> data nothing happens.

Yep. As was already said in the thread, FIFOs only work locally. They are implemented in the kernel, not in the FS code. The FS is only involved in giving the FIFO a name and permissions (and this trivial task is properly implemented by NFS: each server can use the pipe ... locally).

A trivial (hacky) workaround could be using Netcat (nc):

On the webserver:
    cat /path/to/nagios.cmd | nc nagios_host some_port

On the Nagios server:
    nc -lp some_port > /path/to/nagios.cmd

Since nc can die, you should ideally run it with Daniel J. Bernstein's Daemontools: http://cr.yp.to/daemontools.html
Also try UDP mode (nc -u host port / nc -lup port) if you have problems.

The solution I'd go for, though, would be using Perl daemons for relaying the commands, using code similar to these daemons:
http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon#OCP_daemon_code
http://nagiosexchange.altinity.org/nagiosexchange/NPDaemon/

It requires good Perl knowledge, especially since these daemons don't include any non-blocking sending function. Since the command pipe isn't under much load, you can just go blocking too (while (<FIFO>)/blocking send on the web server, blocking listen/blocking write on the Nagios server), but make sure you implement timeouts with alarm() in the network code to avoid jamming there.

Thomas
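The blocking relay with timeouts suggested above could look roughly like this. The sketch is written in Python rather than Perl and swaps in socket timeouts where the message recommends alarm(). It handles a single connection and has no authentication, which a real relay for the command pipe would certainly need. The names `send_command` and `receive_into` are hypothetical.

```python
import socket


def send_command(line, host, port, timeout=5.0):
    """Web-server side: send one external-command line, or fail after timeout."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(line.encode("utf-8") + b"\n")


def receive_into(server_sock, sink_path, timeout=5.0):
    """Nagios side: accept one connection and append its data to sink_path.

    server_sock must already be bound and listening. Every blocking call is
    bounded by the timeout, so a stalled peer raises socket.timeout instead
    of jamming the relay.
    """
    server_sock.settimeout(timeout)
    conn, _peer = server_sock.accept()
    with conn:
        conn.settimeout(timeout)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # peer closed the connection
                break
            chunks.append(data)
    with open(sink_path, "ab") as sink:
        sink.write(b"".join(chunks))
```

On the Nagios host, `sink_path` would be the local command pipe, so the final write happens on the same kernel that Nagios reads from. A line such as `[1201046400] DISABLE_NOTIFICATIONS` follows the `[timestamp] COMMAND` external-command format.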