Re: [Nagios-users] check_procs through nrpe gives wrong results
Hi,

The problem is fixed. Monitoring now matches on the arguments of the running process, using the '-a' option of check_procs instead of '-C'. I defined the following.

In the Nagios command definitions:

    define command {
        command_name nrpe_check_procs4
        command_line $USER1$/check_nrpe -n -H $HOSTADDRESS$ -c check_procs4 -a $ARG1$ $ARG2$ $ARG3$
    }

In nrpe.cfg:

    command[check_procs4]=/usr/lib/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -a $ARG3$

Checking via the command line:

    /usr/lib64/nagios/plugins/check_nrpe -n -H ORACLESERVER -c check_procs4 -a 1:4 1:8 ora_pmon_orcldb

Regards,
Ajay

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Phil Costelloe
Sent: Thursday, May 24, 2007 6:56 PM
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] check_procs through nrpe gives wrong results

The exact process string appears with the options you are giving to ps. However, Nagios may be using different options, or even a binary it builds itself. You can find out by grepping for PS_COMMAND in config.h in the plugins build directory.

On Fedora Core 4, that gives:

    #define PS_COMMAND /bin/ps axwo 'stat uid pid ppid vsz rss pcpu comm args'

On Solaris, it gives:

    #define PS_COMMAND /usr/local/nagios/libexec/pst3

Phil

From: M V Ajay (vMoksha) [mailto:[EMAIL PROTECTED]
Sent: 24 May 2007 17:04
To: Phil Costelloe; nagios-users@lists.sourceforge.net
Subject: RE: [Nagios-users] check_procs through nrpe gives wrong results

Hi,

The exact process string does appear in the ps output:

    oracle 9882 1 0 May18 ? 00:01:40 ora_pmon_orcldb

Nagios: 2.7 on x86_64 (RHEL4)
Nagios plugins: 1.4.5
check_procs: (nagios-plugins 1.4.5) 1.54
NRPE: 2.6 (RHEL4)

The Nagios server is x86_64 and the monitored server is i386. The RPM packages were built for the respective platforms, and the plugins were built as RPM packages on a similar OS. There is also only one 'ps' binary, which is /bin/ps.

Regards,
Ajay

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Phil Costelloe
Sent: Thursday, May 24, 2007 5:40 PM
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] check_procs through nrpe gives wrong results

That explains why the process you want to monitor is called ora_pmon_orcldb, but not why check_procs can't detect it. That is likely because that exact string does not appear in the ps output that check_procs is using, maybe due to truncation.

Back to basics. What version of Nagios are you using? What version of the plugins? What version of nrpe? How did you compile/install the plugins that are on the remote server? Specifically, were they compiled either on that server or on a server running the same OS?

--
Phil Costelloe
Foundation IT, Hermitage
Berkshire RG18 9SE

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of M V Ajay (vMoksha)
Sent: 24 May 2007 12:28
To: nagios-users@lists.sourceforge.net
Subject: Re: [Nagios-users] check_procs through nrpe gives wrong results

Hi,

I found the reason for the behaviour. Oracle has a process naming mechanism that lets you distinguish between multiple instances of the database. When you open an instance, the $ORACLE_HOME/bin/oracle executable renames itself using the UNIX environment variable ORACLE_SID for the given database. This variable is used in generating Oracle process names: ora_process_name_$ORACLE_SID.

So is there a way to monitor processes (using the process name) as they appear in 'ps -ef' output?

Regards,
Ajay

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of M V Ajay (vMoksha)
Sent: Wednesday, May 23, 2007 6:41 PM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] check_procs through nrpe gives wrong results

Hi,

I am monitoring a Red Hat Enterprise Linux 4 server running Oracle using Nagios and NRPE.

When I monitor the Oracle PMON process (ora_pmon_orcldb) on the server I get the message 'PROCS CRITICAL: 0 processes with command name 'ora_pmon_orcldb''. I have confirmed that the process 'ora_pmon_orcldb' is actually running on the remote server. The same type of monitoring works for other processes such as 'httpd
Re: [Nagios-users] check_procs through nrpe gives wrong results
Hi,

I found the reason for the behaviour. Oracle has a process naming mechanism that lets you distinguish between multiple instances of the database. When you open an instance, the $ORACLE_HOME/bin/oracle executable renames itself using the UNIX environment variable ORACLE_SID for the given database. This variable is used in generating Oracle process names: ora_process_name_$ORACLE_SID.

So is there a way to monitor processes (using the process name) as they appear in 'ps -ef' output?

Regards,
Ajay

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of M V Ajay (vMoksha)
Sent: Wednesday, May 23, 2007 6:41 PM
To: nagios-users@lists.sourceforge.net
Subject: [Nagios-users] check_procs through nrpe gives wrong results

Hi,

I am monitoring a Red Hat Enterprise Linux 4 server running Oracle using Nagios and NRPE.

When I monitor the Oracle PMON process (ora_pmon_orcldb) on the server I get the message 'PROCS CRITICAL: 0 processes with command name 'ora_pmon_orcldb''. I have confirmed that the process 'ora_pmon_orcldb' is actually running on the remote server. The same type of monitoring works for other processes such as 'httpd'.

Nagios command definition:

    define command {
        command_name nrpe_check_procs3
        command_line $USER1$/check_nrpe -n -H $HOSTADDRESS$ -c check_procs3 -a $ARG1$ $ARG2$ $ARG3$
    }

NRPE configuration:

    dont_blame_nrpe=1
    command[check_procs3]=/usr/lib/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -C $ARG3$

From the command line I see the following message:

    [EMAIL PROTECTED] ~]$ /usr/lib64/nagios/plugins/check_nrpe -H oracleserver -n -c check_procs3 -a 1:4 1:8 ora_pmon_orcldb
    PROCS CRITICAL: 0 processes with command name 'ora_pmon_orcldb'

Any idea what is wrong?

Thanks,
Ajay

Legal Notice: This electronic mail and its attachments are intended solely for the person(s) to whom they are addressed and contain information which is confidential or otherwise protected from disclosure, except for the purpose for which they are intended. Dissemination, distribution, or reproduction by anyone other than the intended recipients is prohibited and may be illegal. If you are not an intended recipient, please immediately inform the sender and return the electronic mail and its attachments and destroy any copies which may be in your possession. UCB screens electronic mails for viruses but does not warrant that this electronic mail is free of any viruses. UCB accepts no liability for any damage caused by any virus transmitted by this electronic mail.

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
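The NRPE configuration quoted in this message relies on argument substitution: with dont_blame_nrpe=1, the values passed after '-a' on the check_nrpe command line replace $ARG1$, $ARG2$ and $ARG3$ in the command[...] line. A minimal Python sketch of that substitution follows. It is an illustration only, not NRPE's actual code (NRPE is written in C), and the choice to replace a missing argument with an empty string is an assumption made for the example.

```python
import re


def expand_args(template, args):
    """Replace $ARG1$, $ARG2$, ... in template with positional arguments (1-based).

    Assumption for this sketch: a macro with no matching argument expands
    to an empty string.
    """
    def repl(match):
        index = int(match.group(1))
        return args[index - 1] if 1 <= index <= len(args) else ""

    return re.sub(r"\$ARG(\d+)\$", repl, template)


# Template copied from the check_procs3 definition in the message above.
template = "/usr/lib/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -C $ARG3$"

# Arguments as passed on the command line: -a 1:4 1:8 ora_pmon_orcldb
print(expand_args(template, ["1:4", "1:8", "ora_pmon_orcldb"]))
```

So the remote host ends up running check_procs with '-C ora_pmon_orcldb', which is the option that reports 0 matching processes in this thread.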
[Nagios-users] check_procs through nrpe gives wrong results
Hi,

I am monitoring a Red Hat Enterprise Linux 4 server running Oracle using Nagios and NRPE.

When I monitor the Oracle PMON process (ora_pmon_orcldb) on the server I get the message 'PROCS CRITICAL: 0 processes with command name 'ora_pmon_orcldb''. I have confirmed that the process 'ora_pmon_orcldb' is actually running on the remote server. The same type of monitoring works for other processes such as 'httpd'.

Nagios command definition:

    define command {
        command_name nrpe_check_procs3
        command_line $USER1$/check_nrpe -n -H $HOSTADDRESS$ -c check_procs3 -a $ARG1$ $ARG2$ $ARG3$
    }

NRPE configuration:

    dont_blame_nrpe=1
    command[check_procs3]=/usr/lib/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -C $ARG3$

From the command line I see the following message:

    [EMAIL PROTECTED] ~]$ /usr/lib64/nagios/plugins/check_nrpe -H oracleserver -n -c check_procs3 -a 1:4 1:8 ora_pmon_orcldb
    PROCS CRITICAL: 0 processes with command name 'ora_pmon_orcldb'

Any idea what is wrong?

Thanks,
Ajay
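The difference between the failing check (-C, exact match on the command name) and the working one (-a, substring match on the argument string) can be sketched in Python. This is an illustration of the idea only, not the plugin's implementation. The sample rows are invented for the example and follow the Fedora PS_COMMAND column order quoted in the thread; the sketch assumes the command-name column still reads 'oracle' after the process has rewritten its argument string, which is consistent with the results reported here but was not shown directly.

```python
def parse_ps_line(line):
    """Split one row of: ps axwo 'stat uid pid ppid vsz rss pcpu comm args'."""
    # The first eight columns are single tokens; everything after is args.
    fields = line.split(None, 8)
    stat, uid, pid, ppid, vsz, rss, pcpu, comm = fields[:8]
    args = fields[8] if len(fields) > 8 else ""
    return {"pid": int(pid), "comm": comm, "args": args}


def count_by_command(rows, name):
    """Like -C: count rows whose command name equals name exactly."""
    return sum(1 for row in rows if row["comm"] == name)


def count_by_args(rows, text):
    """Like -a: count rows whose argument string contains text."""
    return sum(1 for row in rows if text in row["args"])


# Hypothetical sample rows (values other than the process names are made up).
SAMPLE = [
    "Ss 501 9882 1 512000 20480 0.0 oracle ora_pmon_orcldb",
    "Ss 0 2301 1 20000 4096 0.0 httpd /usr/sbin/httpd",
]
rows = [parse_ps_line(line) for line in SAMPLE]

print(count_by_command(rows, "ora_pmon_orcldb"))  # match on command name
print(count_by_args(rows, "ora_pmon_orcldb"))     # match on arguments
print(count_by_command(rows, "httpd"))            # ordinary daemon
```

Under these assumptions the command-name match finds nothing for the Oracle process while the argument match finds it, and an ordinary daemon such as httpd is found by name either way.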
Re: [Nagios-users] Problem: Perfmon counter gives wrong results while using check_nt
Dear All:

I am using pnsclient on Windows servers as the Nagios agent. Is there any solution for this problem?

Regards,
Ajay

-Original Message-
From: Anthony Montibello [mailto:[EMAIL PROTECTED]
Sent: Wednesday, December 06, 2006 9:38 PM
To: nagios-users@lists.sourceforge.net
Cc: M V Ajay (vMoksha)
Subject: Re: [Nagios-users] Problem: Perfmon counter gives wrong results while using check_nt

Hi,

If your Windows agent is NC_NEt, I know the reason. It has to do with the delay between taking samples of the performance counter object before reporting the value back to check_nt.

To elaborate: most Performance Monitor objects in Windows take a sample every second and then return the calculated results to Performance Monitor for graphing. NC_NEt does not wait a full second for this to happen. In the current implementation it is actually too quick between samples and thus gives a result of 0.

I have experimented with different delay times (in milliseconds) and found a good speed/accuracy compromise, which I will implement as an adjustable variable in the configuration of the next release of NC_NEt.

Note: NC_NEt reports -1 if it could not locate the counter because of an error or a misspelling, but a 0 means either a real 0 or the bug described above.

Just to defend NC_NEt: this bug slipped through because the Dot NEt performance counter documentation says only to call the get-sample function again and fails to mention any minimum delay between requests. Also, most NC_NEt testing in the past has been done offline from Nagios. I am about to install Nagios for testing, and hopefully will have that running by the time I get the updates into NC_Net.

As a workaround, you may be able to retrieve the value you need through WMI, or by creating a custom performance counter of a different type that gets its value from the one that is misinforming NC_NEt.

NC_NEt is available at SourceForge: http://sourceforge.net/projects/nc-net

If it was a different Windows client then I do not know the cause, but the next generation of NC_Net should work more efficiently.

Thank you
Tony (Author of NC_NEt)

On 12/5/06, M V Ajay (vMoksha) [EMAIL PROTECTED] wrote:

Dear All:

I am having a problem monitoring a particular Windows perfmon counter. All other counters give proper results. I am using check_nt (check_nt (nagios-plugins 1.4.3) 1.39) with the Nagios agent running on Windows NT 4.0, Exchange 5.5.

Below is the Microsoft Exchange counter:

    Object: MSExchangeMTA Connections
    Counter: Queue Length
    Instance: (INTERNET MAIL CONNECTOR (MAILHOST001))

When I run the plugin command manually I get:

    ./check_nt -s nagiosagentpassword -H mailhost001 -v COUNTER -l \MSExchangeMTA Connections(INTERNET MAIL CONNECTOR (MAILHOST001))\Queue Length
    0.00

and the output is always 0. But if I use Windows perfmon, I get the proper results.

Any idea why this is not working?

Regards,
Ajay
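Tony's explanation of the sampling delay can be illustrated with a small Python sketch. It is a simplified model, not NC_NEt's code and not the .NET performance counter API: it assumes a calculated counter derived from two raw samples, and a raw value that only refreshes once per second. The helper names and numbers are hypothetical.

```python
def raw_value_at(t):
    """Hypothetical raw counter that refreshes only on whole seconds.

    It grows by 5 units per second, but a reader sees the value from the
    most recent whole-second refresh.
    """
    return int(t) * 5


def rate_per_second(sample_a, sample_b):
    """Calculated value from two (timestamp, raw_value) samples."""
    (t_a, v_a), (t_b, v_b) = sample_a, sample_b
    elapsed = t_b - t_a
    if elapsed <= 0:
        raise ValueError("second sample must be later than the first")
    return (v_b - v_a) / elapsed


# Samples 10 ms apart: the raw value has not refreshed yet, so the result is 0.
too_quick = rate_per_second((10.00, raw_value_at(10.00)), (10.01, raw_value_at(10.01)))

# Samples one full second apart: the refresh is visible and the rate is non-zero.
full_second = rate_per_second((10.0, raw_value_at(10.0)), (11.0, raw_value_at(11.0)))

print(too_quick, full_second)
```

In this model the short interval always yields 0 regardless of real activity, which matches the symptom described: a 0 that may be a genuine 0 or an artifact of sampling too quickly.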
[Nagios-users] NSCA --daemon problem: too many child process
Dear All:

I have a Nagios distributed setup (Central: Solaris 9, sparc, 1 GB RAM, Nagios 2.4, nsca 2.6 / Distributed: Solaris 10, sparc, 1 GB RAM, Nagios 2.4, nsca 2.6) monitoring more than 600 servers and 2000+ services.

I am running nsca in daemon mode and have noticed that it spawned more than 2800 child processes, and my system swap was full. No service checks were happening from then on.

I could see a lot of WARNING messages in nagios.log like:

    [1154408570] Warning: The results of service 'disk_d' on host 'server001' are stale by 21 seconds (threshold=862 seconds). I'm forcing an immediate check of the service.

I could also see a lot of messages in /var/adm/messages like this:

    Aug 2 00:28:02 nagiosserver genunix: [ID 470503 kern.warning] WARNING: Sorry, no swap space to grow stack for pid 13531 (sshd)

I had to log in through the system console, kill nsca and restart it to rectify the problem.

Any idea what could have caused this problem?

Thanks and Regards,
Ajay
[Nagios-users] automatic discovery of windows services
Hi All,

Is there any way (tool) to discover all services running on a Windows box and prepare Nagios configuration files to enable monitoring of all Windows services?

Thanks in advance!
Ajay
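The configuration-generation half of this question can be sketched in Python; the discovery half (obtaining the list of service names from the Windows host) is not covered here. Everything named in the sketch is an assumption for illustration: the host name, the service names, a 'generic-service' template, and a check_nt command definition that accepts the variable and its options as $ARG1$ and $ARG2$. None of these come from the post.

```python
def service_definition(host, service, template="generic-service"):
    """Return one Nagios service definition checking a Windows service state.

    Assumes a check_nt command defined so that $ARG1$ is the variable
    (SERVICESTATE) and $ARG2$ carries the remaining options.
    """
    lines = [
        "define service {",
        f"    use                  {template}",
        f"    host_name            {host}",
        f"    service_description  Service {service}",
        f"    check_command        check_nt!SERVICESTATE!-d SHOWALL -l {service}",
        "}",
    ]
    return "\n".join(lines) + "\n"


def build_config(host, services):
    """Join one definition per service name into a single config fragment."""
    return "\n".join(service_definition(host, name) for name in services)


# Hypothetical host and service names; a real list would come from the host itself.
config = build_config("winhost01", ["W3SVC", "Spooler"])
print(config)
```

The printed fragment could be written to a file under the Nagios configuration directory and checked with the usual configuration verification before reloading.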