[Nagios-users] could not fetch information from server
Hi,

I'm quite new to Nagios. I have a test VM environment in which Nagios 3.0.1 on RHEL 4, with Nagios plug-ins 1.4.11, monitors the Nagios server itself along with a Windows XP VM. The PING status is OK, so I know the two VMs can communicate. For health checks such as CPU usage and disk space, however, the Nagios server reports: "could not fetch information from server".

I have defined a password for communication between the two; it is set in the NSC.INI file on the Windows box as well as in the check_nt command definition in commands.cfg. I am using NSClient++ v0.3.1 on the Windows machine.

Any assistance is greatly appreciated.

Thank you,
Izz

- This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
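For readers comparing their own setup, the Nagios side of a password-protected check_nt configuration might look roughly like the sketch below. It is not taken from the poster's files: the password `MySecret`, the host name `winxp-vm`, the `generic-service` template, and the port 12489 are all assumptions for illustration. check_nt takes the password with `-s`, and the string has to match the password set in NSC.INI exactly.

```
# Sketch only: MySecret, winxp-vm and generic-service are placeholders.
define command{
        command_name    check_nt
        # -p is the port NSClient++ listens on, -s the shared password
        command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s MySecret -v $ARG1$ $ARG2$
        }

define service{
        use                     generic-service
        host_name               winxp-vm
        service_description     CPU Load
        check_command           check_nt!CPULOAD!-l 5,80,90
        }
```

If the password matches on both ends and the error persists, it may also be worth confirming on the Windows side that NSC.INI allows connections from the Nagios server's address and that the listener module is enabled; I am going from memory of the NSC.INI layout here, so treat that as a hint rather than a diagnosis.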
[Nagios-users] Ring topology parent/child relation Nagios
Hello,

I have some problems defining the parent/child relationships so that changes and monitoring state are reflected on the map.

My topology is something like this:

Nagios machine --- Router A --- Router B

Router B --- Router C --- Router D --- Router E --- Router F --- Router B (ring closing itself)

On the Router B ring I can't define parent relationships in a circular way, because Nagios refuses to start when it detects this. (On Router C I defined parent = Router B and Router D, and so on for each of them.)

How should I configure them? Any ideas or help?

Regards,
Mihai
Re: [Nagios-users] Ring topology parent/child relation Nagios
Mihai Tanasescu wrote:
| I have some problems defining the parent/child relationships to reflect
| changes and monitoring on the map.
|
| My topology is something like this:
|
| Nagios machine --- Router A --- Router B
|
| Router B --- Router C --- Router D --- Router E --- Router F --- Router B
| (ring closing itself)
|
| but on the Router B ring I can't define parent relationships in a
| circular way because nagios refuses to start when it detects this.

The whole concept of a ring setup is that a single disaster cannot cause a network failure. For this setup I would only follow the ring halfway.

So you get 2 chains:

Nagios -- A -- B -- C -- D
Nagios -- A -- B -- F -- E

Make sure you monitor each neighbor on each ring router to make sure the ring is working as expected.

If you use dynamic routing you might want to monitor route changes relevant for the proper operation of your ring setup.

Hugo.
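As a rough illustration of the two half-ring chains Hugo describes, the `parents` directive could be set as in the sketch below. The host names, the documentation-range addresses, and the `generic-host` template (assumed to supply the remaining required directives) are placeholders and not part of the original thread.

```
# Sketch only: names, addresses and the generic-host template are assumptions.
# Router A sits next to the Nagios machine, so it gets no parents directive.
define host{
        use         generic-host
        host_name   router-a
        address     192.0.2.1
        }

define host{
        use         generic-host
        host_name   router-b
        address     192.0.2.2
        parents     router-a
        }

# First half of the ring: B -- C -- D
define host{
        use         generic-host
        host_name   router-c
        address     192.0.2.3
        parents     router-b
        }

define host{
        use         generic-host
        host_name   router-d
        address     192.0.2.4
        parents     router-c
        }

# Second half of the ring: B -- F -- E
define host{
        use         generic-host
        host_name   router-f
        address     192.0.2.6
        parents     router-b
        }

define host{
        use         generic-host
        host_name   router-e
        address     192.0.2.5
        parents     router-f
        }
```

Because no host appears anywhere among its own ancestors, this layout avoids the circular path that makes Nagios refuse to start. The D-to-E link is deliberately left out of the parent tree, which is why monitoring the neighbor links separately matters.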
[Nagios-users] Way to replicate external commands to failover server?
I'm working on a central server failover strategy for a distributed setup using Nagios 3.0.1. In our case, I don't want to lose any history (or graph data) while the main server is down, and I don't want to think about merging data. So I have the second central server as a sort of hot standby, and it accepts check data from the distributed nodes just like the main (non-failover) central server. That seems to work OK, and in the event of an emergency I tell the failover server to enable notifications and away we go.

What has me a little concerned is that if someone went into the web interface on the main server and, say, scheduled downtime or disabled notifications, the backup server would never know about it. In the event of a failure, people could find themselves getting alerts for a host that should have been in scheduled downtime (as it was on the main server).

While I realize I would not want to capture and retransmit *all* external commands to the backup host, if I could somehow get at them I could filter them over to the backup host (i.e. ignore most commands, but pass a few like downtime or host notifications, etc.).

Is there any mechanism that allows me to do this? As I understand it, the global host and service event handlers really only capture check results -- they're not going to fire if someone schedules downtime.

Thanks
Mark
[Nagios-users] Hostgroup definition
I know that lots of the documentation about how to configure Nagios's cfg files is in transition from 2.x (where you mostly did it one way) to 3.0 (where you mostly do it another way), but there's a point I'm not clear on.

It seems to me that it would be much easier to maintain if member machines were placed in hostgroups *in each member machine's cfg file* ('cause yes, I'm using a separate file for each machine).

This doesn't seem to be the way Nagios expects me to do it, and I don't see that there's a way to do it this way; you appear to have to define the hostgroup in some amorphous 'somewhere', and then add all the hosts to it *there* (which means that there are two places you have to change when you add a new host, which I'm not fond of).

Have any of the DBMS config builder front-ends been updated to 3.0 yet?

Cheers,
-- jra
--
Jay R. Ashworth Baylink [EMAIL PROTECTED] Designer The Things I Think RFC 2100 Ashworth Associates http://baylink.pitas.com '87 e24 St Petersburg FL USA http://photo.imageinc.us +1 727 647 1274
Those who cast the vote decide nothing. Those who count the vote decide everything. -- (Joseph Stalin)
Re: [Nagios-users] SMS and ATT with Nagios
Hi,

Luis Fernando Lacayo [EMAIL PROTECTED] wrote on 09.05.08 16:38:
> Good Morning all,
>
> I have to change my NAGIOS platform to a Dell Blade on RHEL 5. I currently use a modem and qpage to send out notifications. Since there is no way to attach a modem to a blade, I am thinking of sending the alerts via SMS. Our carrier is ATT; is there anyone out there currently doing this? Can you share how you are doing this?
>
> Thanks,
> Luis

Multitech has modems with Ethernet interfaces. You just connect them to the network and send a text or SMS via Telnet commands: http://www.multitech.com/PRODUCTS/Categories/Device_Networking/

We use SMS for our alerts; it is our primary mechanism. We send them out using the same Nagios command as the email alerts. To send alerts to ATT SMS addresses you send them as email from Nagios to [EMAIL PROTECTED]

HTH
-greg
Re: [Nagios-users] Hostgroup definition
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:nagios-users- [EMAIL PROTECTED] On Behalf Of Jay R. Ashworth
Sent: Monday, May 12, 2008 9:32 AM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] Hostgroup definition

> I know that lots of the documentation about how to configure Nagios's cfg files is in transition from 2.x (where you mostly did it one way) to 3.0 (where you mostly do it another way), but there's a point I'm not clear on.

I've not done the transition but there don't appear to be significant changes from 2.x to 3.x...

> It seems to me that it would be much easier to maintain if member machines were placed in hostgroups *in each member machine's cfg file* (cause yes, I'm using a separate file for each machine).

You have been able to do that since 2.x.

> This doesn't seem to be the way Nagios expects me to do it, and I don't see that there's a way to do it this way; you appear to have to define the hostgroup in some amorphous 'somewhere', and then add all the hosts to it *there* (which means that there are two places you have to change when you add a new host, which I'm not fond of).

You do have to define the hostgroup but you don't have to specify members there.

http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html#hostgroup

hostgroup_members: This _optional_ directive can be used to include hosts from other sub host groups in this host group. Specify a comma-delimited list of short names of other host groups whose members should be included in this group. (emphasis mine)

http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html#host

hostgroups: This directive is used to identify the short name(s) of the hostgroup(s) that the host belongs to. Multiple hostgroups should be separated by commas. This directive may be used as an alternative to (or in addition to) using the members directive in hostgroup definitions.

--
marc
Re: [Nagios-users] Hostgroup definition
Jay R. Ashworth wrote:
> I know that lots of the documentation about how to configure Nagios's cfg files is in transition from 2.x (where you mostly did it one way) to 3.0 (where you mostly do it another way), but there's a point I'm not clear on.
>
> It seems to me that it would be much easier to maintain if member machines were placed in hostgroups *in each member machine's cfg file* (cause yes, I'm using a separate file for each machine).

You can do that.

> This doesn't seem to be the way Nagios expects me to do it, and I don't see that there's a way to do it this way; you appear to have to define the hostgroup in some amorphous 'somewhere', and then add all the hosts to it *there* (which means that there are two places you have to change when you add a new host, which I'm not fond of).

Not really, no, but the hostgroup needs to be defined somewhere. When it is, you can do something like the following:

    define host {
        use         template_with_all_required_variables
        hostgroups  hostgroup1,hostgroup2,hostgroup4,hostgroupn
    }

--
Andreas Ericsson [EMAIL PROTECTED]
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
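To make the two halves of that answer concrete, here is a minimal sketch of how the pieces could be split across files. The file names in the comments, the group names, the host name, the address, and the `generic-host` template are all assumptions for illustration; only the `hostgroups` directive and the idea of a member-less hostgroup definition come from the thread.

```
# hostgroups.cfg (hypothetical file): each group is defined once, with no members listed
define hostgroup{
        hostgroup_name  web-servers
        alias           Web Servers
        }

define hostgroup{
        hostgroup_name  site-a
        alias           Site A
        }

# web01.cfg (hypothetical per-host file): membership is declared on the host itself
define host{
        use         generic-host
        host_name   web01
        address     192.0.2.10
        hostgroups  web-servers,site-a
        }
```

With this layout, adding a new machine to existing groups touches only the new host's own file; the hostgroup file changes only when a new group is introduced.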
Re: [Nagios-users] SMS and ATT with Nagios
> We use SMS for our alerts, it is our primary mechanism. We send them out using the same Nagios command as the email alerts. To send alerts to ATT SMS addresses you send them as email from Nagios to [EMAIL PROTECTED]

providing your network is still able to connect to the outside world
Re: [Nagios-users] SMS and ATT with Nagios
Correct, it is good to have at least two ways out, modem and network.

-----Original Message-----
From: Tom Brown [mailto:[EMAIL PROTECTED]
Sent: Monday, May 12, 2008 8:04 AM
To: Frater, Greg J
Cc: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] SMS and ATT with Nagios

> We use SMS for our alerts, it is our primary mechanism. We send them out using the same Nagios command as the email alerts. To send alerts to ATT SMS addresses you send them as email from Nagios to [EMAIL PROTECTED]

providing your network is still able to connect to the outside world
Re: [Nagios-users] Hostgroup definition
On Mon, May 12, 2008 at 10:01:56AM -0500, Marc Powell wrote:
> > I know that lots of the documentation about how to configure Nagios's cfg files is in transition from 2.x (where you mostly did it one way) to 3.0 (where you mostly do it another way), but there's a point I'm not clear on.
>
> I've not done the transition but there don't appear to be significant changes from 2.x to 3.x...

The entries themselves, no. But it did seem to me that the approach to which things go in what files -- as exemplified by the default sample configs -- changed a bit, no? Or do I just think that because the "segregate in cfg_dirs by type of object; everything in its own file" approach made more sense to me?

> > It seems to me that it would be much easier to maintain if member machines were placed in hostgroups *in each member machine's cfg file* (cause yes, I'm using a separate file for each machine).
>
> You have been able to do that since 2.x.

Yeah, so I found out; see my other reply.

> You do have to define the hostgroup but you don't have to specify members there.
>
> http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html#hostgroup
>
> hostgroup_members: This _optional_ directive can be used to include hosts from other sub host groups in this host group. Specify a comma-delimited list of short names of other host groups whose members should be included in this group.

I did see that, when I actually looked far enough. :-)

For what it's worth, it doesn't *actually* say that if you are going to declare an object a member of a hostgroup, you *do* still actually have to *define* that hostgroup somewhere, which it probably should. Certainly it's implied, but I'm not sure that's good enough, as complicated as Nagios is.

Cheers,
-- jra
Re: [Nagios-users] Hostgroup definition
On May 12, 2008, at 7:46 AM, Jay R. Ashworth wrote:
> On Mon, May 12, 2008 at 05:03:17PM +0200, Andreas Ericsson wrote:
> > Not really, no, but the hostgroup needs to be defined somewhere. When it is, you can do something like the following
> >
> >     define host {
> >         use         template_with_all_required_variables
> >         hostgroups  hostgroup1,hostgroup2,hostgroup4,hostgroupn
> >     }
>
> It doesn't cause any confusion, statistical or otherwise, to put a host in more than one group, does it?

Nope. I do this for pretty much all of my hosts, actually. For example, all printers are in a Printer host group, to associate services as well as group all printers together. Additionally, I have a host group for each location we have machines in, so a printer in Barrow would be in both the printer hostgroup (for the services) and the Barrow hostgroup (for the location). That makes it easy to find all the machines in Barrow, as well as all the printers.

---
Israel Brewster
Computer Support Technician
Frontier Flying Service Inc.
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7250 x293
---
Re: [Nagios-users] Ring topology parent/child relation Nagios
> Mihai Tanasescu wrote:
> | I have some problems defining the parent/child relationships to reflect
> | changes and monitoring on the map.
> |
> | My topology is something like this:
> |
> | Nagios machine --- Router A --- Router B
> |
> | Router B --- Router C --- Router D --- Router E --- Router F --- Router B
> | (ring closing itself)
> |
> | but on the Router B ring I can't define parent relationships in a
> | circular way because nagios refuses to start when it detects this.
>
> The whole concept of a ring setup is that a single disaster can not cause a network failure. For this setup I would only follow the ring halfway.
>
> So you get 2 chains:
>
> Nagios -- A -- B -- C -- D
> Nagios -- A -- B -- F -- E
>
> Make sure you monitor each neighbor on each ring router to make sure the ring is working as expected.
>
> If you use dynamic routing you might want to monitor route changes relevant for the proper operation of your ring setup.
>
> Hugo.

Hello Hugo,

Thanks for the tip, but I have one more question, which in fact concerns my current problem. (I have configured SMS sending for down events.)

If, for example, router B loses both its links to C and F (two fiber cuts on the network), then I will be getting SMSes stating that C, D, F and E are down. B itself will not be down as a system, but it will be unable to reach the others.

How could I solve this and avoid sending misleading SMS messages about down events?

Thanks,
Mihai
[Nagios-users] is there a maximum length for performance data?
Hello,

I'm using Nagios 2.10 and the ndo2db add-on in a Gentoo environment, and I'm running into problems with the check_disk command, which I'm executing via NRPE. I suspect I may actually be hitting a limitation of the ndo2db add-on, though...

In the Nagios config file for server1, I have:

    define service{
        host_name               server2,server1,server3
        service_description     disk usage
        check_command           check_nrpe!check_disk
        use                     serviceTemplate
    }

In the NRPE config file on server2, I have:

    command[check_disk]=/usr/nagios/libexec/check_disk -w 40% -c 20%

In the database, the nagios_servicechecks table shows truncated perfdata for checks on server2. The output field is intact:

    DISK CRITICAL - free space: / 3806 MB (16% inode=93%): /dev 1005 MB (100% inode=98%): /home 85167 MB (77% inode=99%): /usr/portage 85167 MB (77% inode=99%): /dev/shm 1005 MB (100% inode=100%):

However, the perfdata field is truncated:

    /=19666MB;14082;18776;93;23471 /dev=0MB;602;803;97;1004 /home=25220MB;66232;88309;98;110387 /usr/portage=25220MB;66232;88309;98;110387 /

The database is set up to allow 255 characters for the perfdata field -- the above, truncated value is only 136 characters long.

When I run the following command from server1's CLI, I get what I expect, without truncation:

CLI:

    /usr/nagios/libexec/check_nrpe -H dev -c check_disk

Result:

    DISK CRITICAL - free space: / 3804 MB (16% inode=93%); /dev 1005 MB (100% inode=98%); /home 85167 MB (77% inode=99%); /usr/portage 85167 MB (77% inode=99%); /dev/shm 1005 MB (100% inode=100%);| /=19667MB;14082;18776;93;23471 /dev=0MB;602;803;97;1004 /home=25220MB;66232;88309;98;110387 /usr/portage=25220MB;66232;88309;98;110387 /dev/shm=0MB;602;803;99;1004

The CLI result is 357 characters long. If you truncate it where the truncation occurs in the database, you end up with a string exactly 330 characters long. I wonder if that is some sort of magic number for the ndo2db add-on. I suspect that there isn't any such limitation in the check_disk or check_nrpe commands themselves, since the full output is displayed when called from the command line.

I hope I didn't supply so much detail that nobody bothers to read this! Has anyone experienced something similar?

-Frank
Re: [Nagios-users] is there a maximum length for performance data?
I confirmed this 330-character limit on another server. However, I'm trying to determine whether the limit is being imposed by the ndo2db add-on or someplace else -- not that it matters, I guess... I'll probably just work around the limitation.

I noticed that the perfdata is also truncated in this file: /var/nagios/status.log. I'm not sure where Nagios logs data though... Anyone?

Thanks!
-Frank

On Mon, May 12, 2008 at 3:42 PM, Frank J. Gómez [EMAIL PROTECTED] wrote:
> [...]
Re: [Nagios-users] Ring topology parent/child relation Nagios
Mihai Tanasescu wrote:
| Mihai Tanasescu wrote:
| | I have some problems defining the parent/child relationships to reflect
| | changes and monitoring on the map.
| |
| | My topology is something like this:
| |
| | Nagios machine --- Router A --- Router B
| |
| | Router B --- Router C --- Router D --- Router E --- Router F --- Router B
| | (ring closing itself)
| |
| | but on the Router B ring I can't define parent relationships in a
| | circular way because nagios refuses to start when it detects this.
|
| The whole concept of a ring setup is that a single disaster can not
| cause a network failure. For this setup I would only follow the ring
| halfway.
|
| So you get 2 chains:
|
| Nagios -- A -- B -- C -- D
| Nagios -- A -- B -- F -- E
|
| Make sure you monitor each neighbor on each ring router to make sure the
| ring is working as expected.
|
| If you use dynamic routing you might want to monitor route changes
| relevant for the proper operation of your ring setup.

| Thanks for the tip but I have one more question which refers to my
| current problem in fact. (I configured sms sending for down events).
|
| In case for example router B loses both its links to C and F (2
| fibercuts on the network), then I will be getting SMSes stating that
| C,D,F,E are down.
| B in fact will not be down as a system but will be unable to reach the
| others.
|
| How could I solve this and avoid sending misleading sms messages
| regarding down events?

This problem should not exist. If you cut the ring in one place, all nodes can still be reached, so no router will go down. If you cut it in two places, you lose part of the ring and only get alerts for the nodes directly on the other side of the cuts from your perspective.

If you alert on unreachable as well, then you get all the alerts you tried to get rid of by introducing the parent relation in the first place. So don't use unreachable notifications.

You need an additional means of detecting the first cut in the ring: all routers can still be reached at that point, so you will never know you had a problem unless you alert on the actual link conditions.

Getting the link condition into Nagios is something you need to work out. Given the lack of details, it is hard to help you there at the moment. But consider the links to be the vital services for the host.

Hugo.
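Hugo's point about not alerting on unreachable corresponds to the host `notification_options` directive. The fragment below is a sketch under assumed names (the `router-c` host, its address, the `generic-host` template, and the `parents` value are placeholders): listing `d` and `r` but leaving out `u` means notifications go out for DOWN and recovery states, not for UNREACHABLE.

```
# Sketch only: host name, address and template are placeholders.
define host{
        use                     generic-host
        host_name               router-c
        address                 192.0.2.3
        parents                 router-b
        # d = down, r = recovery; "u" (unreachable) is deliberately omitted
        notification_options    d,r
        }
```

With the parent chains in place, a double cut should then produce DOWN notifications only for the routers immediately beyond the cuts, while the ones behind them are marked UNREACHABLE without sending an SMS. Detecting a single cut still needs separate service checks on the links themselves, as described above.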
[Nagios-users] Build problem with NRPE on Slack 10.2
I'm rolling Nagios out onto the rest of my servers, most of which are running Slack 10 or 12. Since I don't have packages for Slack, and am not inclined to learn how to build them (I have three other package managers to learn, thanks :-), I'm building from source. While NRPE built fine on this 10.2 box, the plugins themselves are failing the make with this error:

  if gcc -DLOCALEDIR=\/usr/local/nagios/share/locale\ -DHAVE_CONFIG_H -I. -I. -\
  then mv -f .deps/netutils.Tpo .deps/netutils.Po; else rm -f .deps/i
  /bin/sh ../libtool --tag=CC --mode=link gcc -g -O2 -L. -L/usr/local/ssl/lib -
  gcc -g -O2 -o check_http check_http.o sslutils.o netutils.o utils.o -L/appl/doo
  /usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x35): In function `dlfcn_loa:
  : undefined reference to `dlopen'
  /usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x95): In function `dlfcn_loa:
  : undefined reference to `dlclose'
  /usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0xbc): In function `dlfcn_loa:
  : undefined reference to `dlerror'
  /usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x147): In function `dlfcn_bi:
  : undefined reference to `dlsym'
  /usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x172): In function `dlfcn_bi:
  : undefined reference to `dlerror'
  /usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x237): In function `dlfcn_bi:
  : undefined reference to `dlsym'
  /usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x262): In function `dlfcn_bi:
  : undefined reference to `dlerror'
  /usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x50b): In function `dlfcn_un:
  : undefined reference to `dlclose'
  collect2: ld returned 1 exit status
  make[2]: *** [check_http] Error 1
  make[2]: Leaving directory `/appl/downloads/nagios-plugins-1.4.11/plugins'
  make[1]: *** [all-recursive] Error 1
  make[1]: Leaving directory `/appl/downloads/nagios-plugins-1.4.11'
  make: *** [all] Error 2

This appears to be a common build error, but as far as Google can tell me, no one ever answers when people ask.

Is this a missing library? Wrong version?

Cheers,
-- jra
--
Jay R. Ashworth
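Not an answer from the list, only a reading of the log above. Every unresolved symbol belongs to the dlopen family, which on glibc systems of that era is provided by libdl, so the static libcrypto.a is probably being linked without -ldl. The Python sketch below simply pulls the symbol names out of a pasted log to make that visible. LOG is an abbreviated copy of the errors as posted, and DL_SYMBOLS is a hand-written set, so both are assumptions made for illustration.

```python
import re

# Abbreviated copy of the linker errors quoted above (truncated as posted).
LOG = """\
/usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x35): In function `dlfcn_loa:
: undefined reference to `dlopen'
/usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x95): In function `dlfcn_loa:
: undefined reference to `dlclose'
/usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0xbc): In function `dlfcn_loa:
: undefined reference to `dlerror'
/usr/local/ssl/lib/libcrypto.a(dso_dlfcn.o)(.text+0x147): In function `dlfcn_bi:
: undefined reference to `dlsym'
"""

# Hand-written list of the dynamic-loading functions declared in <dlfcn.h>.
DL_SYMBOLS = {"dlopen", "dlclose", "dlerror", "dlsym"}


def undefined_symbols(log):
    """Return the set of symbol names the linker could not resolve."""
    return set(re.findall(r"undefined reference to `([^']+)'", log))


missing = undefined_symbols(LOG)
print(sorted(missing))
print("all from the dl* family:", missing <= DL_SYMBOLS)
```

If that reading is right, a workaround often suggested for this kind of failure is to re-run configure with LIBS=-ldl in the environment so the library is appended to the link line. That was not confirmed anywhere in this thread.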
Re: [Nagios-users] Ring topology parent/child relation Nagios
| This problem should not exist.

Nagios -- Router A -- Router B uplink1+2 ring
(Router B is in a ring topology which closes on it)

http://tinypic.com/view.php?pic=11uhx7as=3 (this is the logical layout)

Yes. But if you cut the 2 uplinks from Router B, then the Nagios machine will see Router B as up but will not be able to reach any other router in the ring, and will thus alert that all other routers are down (which is not true).

I mean, having split the ring into the 2 halves you suggested:

C has parent B, D has parent C, E has parent D
G has parent B, F has parent G

=> B is up but B's uplinks to C and G are down: alerts that C and G are down although they aren't.

Can this be eliminated? (I'm sure the solution should be simple and obvious but I'm not being as careful as I should to see it.)

Am I right?

P.S. Currently I am monitoring each link state (up/down) using SNMP interface queries (on Cisco routers), and the hosts themselves with ping/ICMP on loopback interfaces that are propagated throughout the network for reachability (OSPF).

| Because if you cut the ring in 1 place all nodes can still be reached.
| So no router will go down. If you cut it in 2 places you lose part of
| the ring and only get alerts for the nodes directly on the other side
| of the cuts from your perspective.
|
| If you alert on unreachable as well then you get all the alerts you
| tried to get rid of by introducing the parent relation in the first
| place. So don't use them.
|
| You need an additional means of detecting your first cut in the ring
| as all routers can still be reached at that time and you will never
| know you had a problem unless you alert on the actual link conditions.
[Nagios-users] Monitoring your office's Coffee Machine?
Hello,

I know, totally off topic, but what if you really wanted to?

I want to monitor our Coffee Machine so that it warns me when it is running low (so that I can go there and put a new coffee in for some fresssh coffee).

Now I know it has nothing that Nagios can talk to, so my question is: does anyone know of a product you can attach to it that has network capabilities that Nagios can talk to?

Lol

Thanks!
Re: [Nagios-users] FW: Monitoring your office's Coffee Machine?
Mirza Dedic wrote:
<snip>
| anyone know of a product you can attach to it that has network
| capabilities that Nagios can talk to? Lol Thanks!

http://tldp.org/HOWTO/Coffee.html

--
Flambeau Inc. Technology Center - Baraboo, WI
[Nagios-users] FW: Monitoring your office's Coffee Machine?
Hello,

I know, totally off topic, but what if you really wanted to?

I want to monitor our Coffee Machine so that it warns me when it is running low (so that I can go there and put a new coffee in for some fresssh coffee).

Now I know it has nothing that Nagios can talk to, so my question is: does anyone know of a product you can attach to it that has network capabilities that Nagios can talk to?

Lol

Thanks!
Re: [Nagios-users] Ring topology parent/child relation Nagios
Mihai Tanasescu wrote:
| This problem should not exist.
| Nagios -- Router A -- Router B uplink1+2 ring (and Router B is in a
| ring topology which closes in it)
|
| http://tinypic.com/view.php?pic=11uhx7as=3 (this is the logical layout)
|
| Yes. But if you cut the 2 uplinks from Router B, then the Nagios machine
| will see Router B as up but will not be able to reach any other router
| from the ring and will thus alert that all other routers are down (which
| is not true).
| I mean having split the ring into the 2 halves you suggested that:
| C has parent B, D has parent C, E has parent D
| G has parent B, F has parent G
| => B up but B uplinks to C and G down - alerts that C and G are down
| although they aren't
|
| Can this be eliminated ? (I'm sure the solution should be simple and
| obvious but I'm not being as careful as I should to see it)

A ring config is a nightmare from the perspective of Nagios. The maths simply do not work. The whole parent concept does not work for a ring. The best you can do is some halfway concept that will never show the proper state in all cases.

Building a config that keeps the number of down reports to a minimum is not a simple thing. The key is to cut things in half and make sure you get the timing right: each node further away must wait longer to go from the soft fail to the hard fail state. The manual handles that subject, and it is mandatory to read it before you even try to use the parent feature.

So either spend many hours perfecting a model to get a halfway-there solution, or accept the extra down reports and learn to interpret them as an exact way of telling where your ring broke.

There is no simple solution.

Hugo.

--
[EMAIL PROTECTED] http://hugo.vanderkooij.org/
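One possible reading of the timing advice above, sketched in Python rather than taken from the thread: give each host one extra check attempt per hop away from Nagios, so that nearer hosts reach the HARD state first. The PARENTS map, BASE_ATTEMPTS and RETRY_MINUTES are made-up values for illustration. max_check_attempts and retry_interval are the Nagios 3 host directives they stand in for, and the time estimate is only an approximation.

```python
# Hypothetical parent map for the two half-ring chains discussed in this thread.
PARENTS = {"A": None, "B": "A", "C": "B", "D": "C", "F": "B", "E": "F"}

BASE_ATTEMPTS = 3   # assumed max_check_attempts for the host nearest Nagios
RETRY_MINUTES = 1   # assumed retry_interval, in minutes


def depth(host):
    """Number of parent hops between this host and the top of its chain."""
    hops = 0
    while PARENTS[host] is not None:
        host = PARENTS[host]
        hops += 1
    return hops


def staggered_attempts(host):
    """Give each extra hop one more check attempt before the HARD state."""
    return BASE_ATTEMPTS + depth(host)


for host in PARENTS:
    attempts = staggered_attempts(host)
    # Roughly (attempts - 1) retries separate the first failure from HARD.
    minutes_to_hard = (attempts - 1) * RETRY_MINUTES
    print(f"{host}: max_check_attempts={attempts}, ~{minutes_to_hard} min to HARD")
```

The values printed would then be copied by hand into the matching host definitions. Whether one attempt per hop is enough depends on check latency in the real network, which this sketch does not model.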
Re: [Nagios-users] is there a maximum length for performance data?
I didn't see much info out there about a limit, but I did find this link, which (along with my findings in the status.log file) leads me to believe that the 330 character limit is imposed by Nagios per se, rather than the ndo2db add-on: http://tinyurl.com/4lpeog. I guess I'll be writing a less verbose check_disk plugin now...

On Mon, May 12, 2008 at 4:28 PM, Frank J. Gómez [EMAIL PROTECTED] wrote:

I confirmed this 330 character limit on another server. However, I'm trying to determine whether the limit is being imposed by the ndo2db add-on or someplace else -- not that it matters, I guess... I'll probably just work around the limitation.

I noticed that the perfdata is also truncated in this file: /var/nagios/status.log. I'm not sure where Nagios logs data though... Anyone?

Thanks!
-Frank

On Mon, May 12, 2008 at 3:42 PM, Frank J. Gómez [EMAIL PROTECTED] wrote:

Hello,

I'm using Nagios 2.10 and the ndo2db add-on in a Gentoo environment, and I'm running into problems with the check_disk command, which I'm executing via NRPE. I suspect I may actually be hitting a limitation of the ndo2db add-on, though...

In the Nagios config file for server1, I have:

  define service{
          host_name               server2,server1,server3
          service_description     disk usage
          check_command           check_nrpe!check_disk
          use                     serviceTemplate
          }

In the NRPE config file on server2, I have:

  command[check_disk]=/usr/nagios/libexec/check_disk -w 40% -c 20%

In the database, the nagios_servicechecks table shows truncated perfdata for checks on server2. The output field is intact:

  DISK CRITICAL - free space: / 3806 MB (16% inode=93%): /dev 1005 MB (100% inode=98%): /home 85167 MB (77% inode=99%): /usr/portage 85167 MB (77% inode=99%): /dev/shm 1005 MB (100% inode=100%):

However, the perfdata field is truncated:

  /=19666MB;14082;18776;93;23471 /dev=0MB;602;803;97;1004 /home=25220MB;66232;88309;98;110387 /usr/portage=25220MB;66232;88309;98;110387 /

The database is set up to allow 255 characters for the perfdata field -- the above, truncated value is only 136 characters long.

When I run the following command from server1's CLI, I get what I expect, without truncation:

  CLI: /usr/nagios/libexec/check_nrpe -H dev -c check_disk

  Result: DISK CRITICAL - free space: / 3804 MB (16% inode=93%); /dev 1005 MB (100% inode=98%); /home 85167 MB (77% inode=99%); /usr/portage 85167 MB (77% inode=99%); /dev/shm 1005 MB (100% inode=100%);| /=19667MB;14082;18776;93;23471 /dev=0MB;602;803;97;1004 /home=25220MB;66232;88309;98;110387 /usr/portage=25220MB;66232;88309;98;110387 /dev/shm=0MB;602;803;99;1004

The CLI result is 357 characters long. If you truncate it where the truncation occurs in the database, you end up with a string exactly 330 characters long. I wonder if that is some sort of magic number for the ndo2db add-on. I suspect that there isn't any such limitation in the check_disk or check_nrpe commands themselves, since the full output is displayed when called from the command line.

I hope I didn't supply so much detail that nobody bothers to read this! Has anyone experienced something similar?

-Frank
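A small Python sketch of the measurement described above, for anyone who wants to repeat it on their own plugin output. It assumes, as Frank's numbers suggest but the thread does not prove, that the cut applies to the whole plugin line (text plus perfdata) and only then is the line split at the "|". ASSUMED_LIMIT and the helper names are hypothetical. The sample string is a shortened stand-in for the real check_disk output, cut 10 characters early purely to show the effect.

```python
ASSUMED_LIMIT = 330  # length observed in this thread, not a documented constant


def split_plugin_output(line):
    """Split 'text | perfdata' into (text, perfdata); perfdata may be empty."""
    text, _, perfdata = line.partition("|")
    return text.strip(), perfdata.strip()


def truncated_perfdata(line, limit=ASSUMED_LIMIT):
    """Perfdata that remains if the whole plugin line is cut at `limit` chars."""
    _, perfdata = split_plugin_output(line[:limit])
    return perfdata


def complete_entries(perfdata):
    """Keep only 'label=value;warn;crit;min;max' entries with all five fields."""
    return [e for e in perfdata.split() if "=" in e and e.count(";") == 4]


# Shortened stand-in for the check_disk output shown above.
sample = (
    "DISK CRITICAL - free space: / 3804 MB (16% inode=93%);"
    "| /=19667MB;14082;18776;93;23471 /dev=0MB;602;803;97;1004"
)

cut = truncated_perfdata(sample, limit=len(sample) - 10)
print("full length:", len(sample))
print("perfdata after cut:", repr(cut))
print("entries still complete:", complete_entries(cut))
```

Labels containing spaces are quoted in real perfdata and would need a proper parser. The simple split() here is enough for the disk labels in this thread.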