[Nagios-users] Current-load plugin gives odd output
I have Nagios 3.3.1, and the Current Load check goes critical on some machines all the time, even though the load numbers do not look high. This is the service definition I use for all machines, and it works on about three quarters of them:

define service{
        use                     local-service
        host_name               TEMPLATE-HOSTNAME
        service_description     Current Load
        check_command           check_by_ssh!22!/usr/local/nagios/libexec/check_load!5.0,4.0,3.0!10.0,6.0,4.0
        notifications_enabled   1
        max_check_attempts      3
        check_interval          5
        retry_interval          3
        check_period            24x7
        notification_interval   15
        notification_period     24x7
        notification_options    w,c,r
        contact_groups          admins
        register                1
        }

On the remote machines I install the nagios-plugins tarball, but not the Nagios tarball. As I say, this works on three quarters of the machines, and it does not always fail on the machines where it usually fails.

Example output in the front end:

Current Load    Notifications for this service have been disabled    CRITICAL    09-12-2012 09:57:03    4d 23h 42m 14s    3/3    CRITICAL - load average: 0.00, 0.01, 0.05

The /usr/local/nagios folder on the remote machines has had chown -R nagios:nagios applied. Other check_by_ssh plugins are working on the machines where this one is failing. It is a mystery.

--
Wolf Halton
This Apt Has Super Cow Powers - http://sourcefreedom.com
Open-Source Software in Libraries - http://FOSS4Lib.org
Advancing Libraries Together - http://LYRASIS.org
Apache Open Office Developer wolfhal...@apache.org

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
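For reference, the thresholds in the Current Load definition can be checked by hand. The sketch below is not check_load itself; it only assumes that each of the 1, 5 and 15 minute load averages is compared with the matching warning and critical value, and that a value strictly above its threshold trips the state. The function names are made up for illustration.

```python
# Sketch only: mimics the comparison check_load is expected to make.
# Assumption: a load value strictly above its threshold trips that state.

def parse_triplet(text):
    """Turn a string such as '5.0,4.0,3.0' into a tuple of three floats."""
    return tuple(float(part) for part in text.split(","))

def classify_load(loads, warn, crit):
    """Return 'OK', 'WARNING' or 'CRITICAL' for 1/5/15-minute load averages."""
    if any(load > limit for load, limit in zip(loads, crit)):
        return "CRITICAL"
    if any(load > limit for load, limit in zip(loads, warn)):
        return "WARNING"
    return "OK"

warn = parse_triplet("5.0,4.0,3.0")    # third field of the check_command above
crit = parse_triplet("10.0,6.0,4.0")   # fourth field of the check_command above

# Values taken from the front-end output quoted in the message.
print(classify_load((0.00, 0.01, 0.05), warn, crit))
```

Under these assumptions a load of 0.00, 0.01, 0.05 comes out as OK, so the CRITICAL status is unlikely to come from the load numbers themselves.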
Re: [Nagios-users] Nagios Config Help
On Mon, Jul 2, 2012 at 12:39 AM, Nair vman...@rediffmail.com wrote:

Hi All,

Is there any harm in setting the retry check interval plus max check attempts greater than the normal check interval? For example, configs like these:

Config#1
normal_check_interval=10min
retry_check_interval=5min
max_check_attempt=4

Config#2
normal_check_interval=5min
retry_check_interval=10min
max_check_attempt=3

Thank you in advance.

Regards,
Nair

Nair,

As all good consultants would say, "It depends." There is a small network load for every test, so setting shorter intervals increases the load. If you double the number of tests in an hour, you double the load. So if you have a large number of tests on a large number of machines, doubling the frequency of the tests is probably a bad idea. What value do you think would accrue from increasing the frequency of tests?

In my network, Nagios is set to start sending email notifications if the failure condition persists for more than 3 test periods, e.g., the root partition of a remote disk has been unreachable for longer than 30 minutes. This particular test fails on my ftp server during large file transfers, with the message "plug-in timed out". This has not turned out to be an actual problem; it is an artifact of flooding the network with file-transfer traffic. In this case it is more sensible to let the file transfers go through than to know with utmost certainty that the drive is not too full.

YMMV

Wolf
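To put numbers on the two configs in the question, here is a rough sketch. It assumes the usual Nagios behaviour that the first failed check counts as attempt 1 and that each further attempt is scheduled one retry interval later, so a problem is confirmed (hard state) after (max attempts - 1) retry intervals. The function names are illustrative only, and the per-hour figure ignores retries.

```python
# Sketch only: arithmetic for the two configs in the question.
# Assumption: attempt 1 is the first failed check; each later attempt
# follows one retry interval after the previous one.

def minutes_to_hard_state(retry_interval, max_check_attempts):
    """Minutes from the first failed check until the problem is confirmed."""
    return (max_check_attempts - 1) * retry_interval

def regular_checks_per_hour(normal_interval):
    """Scheduled checks per hour while the service is OK (retries ignored)."""
    return 60 / normal_interval

configs = {
    "Config#1": {"normal": 10, "retry": 5, "attempts": 4},
    "Config#2": {"normal": 5, "retry": 10, "attempts": 3},
}

for name, cfg in configs.items():
    delay = minutes_to_hard_state(cfg["retry"], cfg["attempts"])
    rate = regular_checks_per_hour(cfg["normal"])
    print(f"{name}: {delay} min to confirm a failure, {rate:g} regular checks per hour")
```

By this estimate Config#1 confirms a failure after 15 minutes with 6 regular checks per hour, and Config#2 after 20 minutes with 12 regular checks per hour, which is the doubling of load mentioned in the reply.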
Re: [Nagios-users] Root_partition check not reading correctly
I am not sure how I am launching the service, which I admit is bad, but I inherited the setup. I am not sure it ever worked properly, because it took a runaway application spewing 9GB files to clue me in.

I am running nagios 3.3.1 and nagios-plugins 1.4.15. The check in use is check_local_disk, but I think that must have been a check from the previous nagios. The one in libexec is check_disk, so I tried

[code]
define service{
        use                     local-service
        host_name               LTS-MASTERKEY-000
        service_description     Root Partition
        check_command           check_disk -w 20% -c 10% --path=/
        notifications_enabled   1
        max_check_attempts      3
        check_interval          5
        retry_interval          3
        check_period            24x7
        notification_interval   15
        notification_period     24x7
        notification_options    w,c,r
        contact_groups          admins
        }
[/code]

and nagios will not start - says there is a configuration error.

On Fri, Mar 9, 2012 at 7:49 AM, Claudio Kuenzler c...@claudiokuenzler.com wrote:

Please show the service definition. How do you launch the check? By ssh, by nrpe? It seems you're using the same IP address or DNS name as the hostname value. Can you verify this?

On Fri, Mar 9, 2012 at 1:19 PM, Wolf Halton wolf.hal...@gmail.com wrote:

All my machines show a similar output, regardless of how much is available on their root partitions.

Root Partition    OK    03-09-2012 07:11:08    28d 22h 18m 15s    1/3    DISK OK - free space: / 15903 MB (86% inode=93%):

Up to and including ones that are 100% full. No alarms - ever. Is a client app needed on the monitored clients that has not been mentioned?

-Wolf
Re: [Nagios-users] Root_partition check not reading correctly
I found it:

check_command check_disk! -w 20% -c 10% --path=/

It was missing the "!". Thanks for helping me sort it out.

Wolf
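Why the "!" matters can be shown with a small sketch. It assumes the documented behaviour that the check_command value is split on "!", with the first piece taken as the command name and the remaining pieces becoming $ARG1$, $ARG2$ and so on. The helper names are invented for illustration, and the value given to $USER1$ is just the libexec path mentioned elsewhere in the thread.

```python
# Sketch only: how a check_command string maps onto $ARGn$ macros.
# Assumption: Nagios splits the value on "!"; the first piece is the
# command name, the rest are $ARG1$, $ARG2$, ...

def split_check_command(check_command):
    """Return (command_name, {'$ARG1$': ..., '$ARG2$': ...})."""
    name, *args = check_command.split("!")
    macros = {f"$ARG{i}$": arg for i, arg in enumerate(args, start=1)}
    return name, macros

def expand(command_line, macros):
    """Substitute each known macro into a command_line template."""
    for macro, value in macros.items():
        command_line = command_line.replace(macro, value)
    return command_line

# Without "!", the whole string is treated as the command name, and no
# command of that name is defined, hence the configuration error.
print(split_check_command("check_disk -w 20% -c 10% --path=/"))

# With one value per "!", each piece lands in its own $ARGn$ macro.
name, macros = split_check_command("check_disk!20%!10%!/")
macros["$USER1$"] = "/usr/local/nagios/libexec"  # assumed plugin directory
print(name)
print(expand("$USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$", macros))
```

Note that with "check_disk! -w 20% -c 10% --path=/" everything after the "!" ends up in $ARG1$ as a single string, so the command definition has to be written to match.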
Re: [Nagios-users] Root_partition check not reading correctly
More stuff.

I am in commands.cfg and added check_disk as a command to check disks on the remote server, as well as check_local_disk, which I understand to be for checking the nagios server's own disk.

[code]
define command{
        command_name    check_local_disk
        command_line    $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
        }

define command{
        command_name    check_disk
        command_line    $USER1$/check_disc -w $ARGS$ -c $ARGS$ -p $ARGS$
        #command_line   $USER1$/check_disc
        }
[/code]

The front-end error is now

(Return code of 127 is out of bounds - plugin may be missing)
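As background on that error text: plugins are expected to exit with 0, 1, 2 or 3 (OK, WARNING, CRITICAL, UNKNOWN), and anything else is reported as out of bounds. A POSIX shell returns 127 when it cannot find the command and 126 when the file exists but cannot be executed. The sketch below just encodes that table; the function name is invented for illustration.

```python
# Sketch only: interpreting a plugin's exit status.
# 0-3 are the standard plugin states; 126 and 127 are POSIX shell
# conventions, not plugin states.

PLUGIN_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

SHELL_HINTS = {
    126: "file found but not executable",
    127: "command not found",
}

def describe_return_code(code):
    """Map an exit status to a plugin state or an out-of-bounds hint."""
    if code in PLUGIN_STATES:
        return PLUGIN_STATES[code]
    hint = SHELL_HINTS.get(code, "no specific hint")
    return f"out of bounds ({hint})"

print(describe_return_code(2))
print(describe_return_code(127))
```

A 127 therefore points at the path or file name on the command_line, which is worth comparing against what is actually in the libexec folder.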
Re: [Nagios-users] Root_partition check not reading correctly
I checked that, and I had made that error. However, after fixing it I still get a reading only from the nagios server rather than from the masterkey server.

In my spare time, I found check_by_ssh and added it to commands.cfg, along with its counterpart lines in my service definitions for that server. The problem there is that the -H $HOSTNAME$ suggested for check_by_ssh wants an FQDN or an IP address. It says the hostname is invalid (and it is):

Remote Root Partition    UNKNOWN    03-19-2012 14:58:39    0d 0h 6m 51s    3/3    check_by_ssh: Invalid hostname/address - LTS-MASTERKEY-000

I changed $HOSTNAME$ to $HOSTADDRESS$, which gives me an access denied error, so at least the system accepts the HOSTADDRESS:

Remote Root Partition    UNKNOWN    03-19-2012 15:16:56    0d 0h 27m 11s    3/3    Remote command execution failed: Permission denied, please try again.

Running it directly gets me a real answer:

# /usr/local/nagios/libexec/check_by_ssh -H 192.168.10.160 /nagios/check_diskfree.sh sda2 70 90
OK. Free Space: 24GB, 95%

What user is the thing expecting?

On Mon, Mar 19, 2012 at 1:31 PM, Andrew Thompson and...@fulgent.co.uk wrote:

Check the command lines and the way you are spelling disk/disc, as you have two different spellings:

command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
command_line $USER1$/check_disc -w $ARGS$ -c $ARGS$ -p $ARGS$

Check the actual name of the file in the libexec folder. Also, what is $ARGS$? Shouldn't it be $ARGn$ with a number, as you correctly have in the top example?
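On the question of which user is expected: the manual test above was run from a root prompt, whereas Nagios normally runs its checks as its own daemon account (often called nagios), so SSH offers that account's key instead. The sketch below builds a check_by_ssh command line with the login name made explicit. The -H, -p, -l, -i and -C options are part of check_by_ssh; the login name "nagios" and the helper function are assumptions for illustration, not something confirmed in this thread.

```python
# Sketch only: assembling a check_by_ssh invocation with an explicit login.
# Assumptions: the remote login is "nagios"; the helper is hypothetical.
import shlex

def build_check_by_ssh(host, remote_command, user=None, identity=None, port=None):
    """Return the argv list for a check_by_ssh call."""
    argv = ["/usr/local/nagios/libexec/check_by_ssh", "-H", host]
    if port is not None:
        argv += ["-p", str(port)]      # SSH port on the remote host
    if user is not None:
        argv += ["-l", user]           # login name on the remote host
    if identity is not None:
        argv += ["-i", identity]       # private key file to offer
    argv += ["-C", remote_command]     # command to run remotely
    return argv

argv = build_check_by_ssh(
    "192.168.10.160",
    "/nagios/check_diskfree.sh sda2 70 90",
    user="nagios",
)
print(shlex.join(argv))
```

Running the printed command while logged in as the account that the Nagios daemon uses is a quick way to see whether the "Permission denied" comes from a missing key for that account.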
[Nagios-users] Root_partition check not reading correctly
All my machines show a similar output, regardless of how much is available on their root partitions.

Root Partition    OK    03-09-2012 07:11:08    28d 22h 18m 15s    1/3    DISK OK - free space: / 15903 MB (86% inode=93%):

Up to and including ones that are 100% full. No alarms - ever. Is a client app needed on the monitored clients that has not been mentioned?

-Wolf