Re: [Nagios-users] check_Openmanage trouble
"Weberskirch, Timo" writes:

> the check_openmanage --no-storage option works (of course without any
> physical disks… :( ).
>
> I was on the phone with Dell Pro Support. They told me that the MD3
> only shows the RAID disk information (not the physical disk
> information) to external devices.
>
> They also told me that there is no way to filter out the SAS card in OMSA.
>
> I have to live with the "--no-storage" option…

Hmm.. Ok, so this particular server doesn't have any storage other than
the SAS card (connected to the MD3xxx), which OMSA can't manage? If so,
that is exactly what the '--no-storage' option is for :)

You should use the '--no-storage' option if

  1. The server has no storage, which is entirely possible; or
  2. The only storage present is something that OMSA doesn't recognize

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] check_Openmanage trouble
"Weberskirch, Timo" writes:

> thank you all for your fast and helpful response. Unfortunately the
> problem persists.
>
> Is there a way to filter out the (in my opinion faulty) SAS card?

Storage components are tightly interconnected, so from the plugin side
your only option is to not check storage at all:

  check_openmanage --no-storage

But I still believe that this is a software problem, i.e. in OMSA.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_Openmanage trouble
Rich writes:

> Usually, when I've seen this, it's been after doing an upgrade of an
> existing OMSA install (<= 6.x to 7.x).
>
> In general, I haven't found a good way to resolve it other than
> automating a complete uninstall of OMSA prior to installing the newer
> version.

Yes, I think the logical next step in this case is to do a complete
uninstall, then reinstall of OMSA on the host. The problem is in OMSA
and must be fixed there. The plugin is simply complaining that OMSA
isn't responding as expected.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_Openmanage trouble
"Weberskirch, Timo" writes:

> maybe one of you has the same problem with the check_openmanage plugin…
>
> Last week we installed two new Dell PowerEdge R720 with OMSA v 7.3.0
> (check_openmanage version: 3.7.10).
>
> Every time I try to check my server I get this error message:
>
> "SNMP ERROR [storage / pdisk]: Requested entries are empty or do not exist."

Hello Timo,

There seems to be some sort of issue with the Openmanage installation on
this server. First thing to do is double-check that everything is
installed properly. On a RHEL6 system, the following storage related RPM
packages should be installed:

  # rpm -qa|grep srvadmin-storage
  srvadmin-storageservices-7.3.0-4.4.1.el6.x86_64
  srvadmin-storage-7.3.0-4.93.2.el6.x86_64
  srvadmin-storage-cli-7.3.0-4.93.2.el6.x86_64
  srvadmin-storageservices-snmp-7.3.0-4.4.1.el6.x86_64
  srvadmin-storage-snmp-7.3.0-4.93.2.el6.x86_64
  srvadmin-storageservices-cli-7.3.0-4.4.1.el6.x86_64

Do you see any physical disks in the Openmanage Web Console? (point your
browser to https://:1311/ and log in as root)

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
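[Editor's sketch] The package check described in this reply can be automated. The following is a minimal Python sketch, assuming the output of "rpm -qa" is already available as a string. The function names are made up for illustration; the package names are the six from the listing in this message and may differ for other OMSA versions or distributions.

```python
# Package names taken from the rpm listing in the message (RHEL6, OMSA 7.3.0).
EXPECTED_PACKAGES = (
    "srvadmin-storageservices",
    "srvadmin-storage",
    "srvadmin-storage-cli",
    "srvadmin-storageservices-snmp",
    "srvadmin-storage-snmp",
    "srvadmin-storageservices-cli",
)


def package_name(nvra):
    """Strip the trailing '-version-release.arch' from one 'rpm -qa' line.

    RPM version and release strings cannot contain '-', so splitting on
    the last two dashes leaves the package name.
    """
    return nvra.rsplit("-", 2)[0]


def missing_packages(rpm_qa_output):
    """Return the expected storage packages absent from 'rpm -qa' output."""
    installed = {
        package_name(line.strip())
        for line in rpm_qa_output.splitlines()
        if line.strip()
    }
    return [name for name in EXPECTED_PACKAGES if name not in installed]
```

An empty list means all six packages are present; it says nothing about whether the OMSA services are actually running.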
Re: [Nagios-users] rpmbuild nagios-3.5.0
alexus writes:

> I'm unable to build RPM w/ nagios 3.5.0, last one that worked for me was
> 3.2.3.
> any ideas/suggestions?

I'd recommend using the prebuilt package for RHEL6, which is available
from EPEL[1]. Add the EPEL repo and you can simply do
"yum install nagios" and be done :)

[1] http://fedoraproject.org/wiki/EPEL

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage improvement request
"John Skarbek" writes:

> I've recently deployed the check_openmanage script and it works very well.
> Except for hosts that run ESXi. Unless I'm doing something wrong.

You're not doing anything wrong. Openmanage, when deployed on ESXi,
doesn't have the necessary capabilities for it to work.

> I've discovered that Open Manage doesn't broadcast its OIDs through ESXi
> like it would if it were a Linux or Windows host. However I did find that
> the iDRAC7 does have similar SNMP responses that I'd like to capture.
> However when pointing check_openmanage to the DRAC interface, I get the
> message indicating that OMSA must not be installed correctly. However,
> looking into the script I found:
>
> my $chassisModelName = '1.3.6.1.4.1.674.10892.1.300.10.1.9.1';
>
> Which does indeed NOT exist. However, a similar OID with the same
> information we are looking for is located here:
>
> $chassisModelName = '1.3.6.1.4.1.674.10892.5.1.3.12.0';

Actually, the OID is 1.3.6.1.4.1.674.10892.5.4.300.10.1.9.1. I've toyed
around with this a bit, and for the most part you can simply replace
"1.3.6.1.4.1.674.10892.1" with "1.3.6.1.4.1.674.10892.5.4". Same goes
for storage OIDs, to a degree.

> After modifying the script a little bit I was able to get past that, but
> now check_openmanage is complaining, "SNMP ERROR [memory]: The requested
> entries are empty or do not exist."
>
> I presume the entire set of OIDs is in a different spot when being
> checked through the DRAC versus the standard Windows SNMP service. I
> would love to assist in enhancing this script, but I'm not sure how I
> should start. Let me know who I should contact, or feel free to reach
> out to me to assist with this awesome plugin.

I have a modified prealpha version for testing, available in the test
branch in git:

  http://git.uio.no/git/?p=check_openmanage.git;a=shortlog;h=refs/heads/test

Note that it's NOT production ready, I have only done some very limited
testing. I had to simplify some stuff:

  * Storage: The storage OIDs from the iDRAC7 are somewhat different,
    compared to Openmanage. Some information that the plugin needs is
    not available, such as numbered identifiers for components (used in
    blacklisting). There are even some OIDs that aren't present in
    Openmanage. In short, it's a mess, and the storage bit is very
    simplistic. Perhaps the missing info will be added in a later
    firmware release, we can only hope.

  * ESM health OIDs are missing completely, so ESM health check is
    omitted. Same for SD card check.

To use the new feature you have to specify '--idrac', like this:

  check_openmanage --idrac -H

Test it, break it and tell me what you think :)

I've noticed that neither the rollup-status nor the component-status for
controllers catches that the controller is actually degraded from
out-of-date firmware. Hopefully it's an anomaly that doesn't apply to
other aspects of controllers, or other components.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
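[Editor's sketch] The prefix substitution mentioned in this reply ("1.3.6.1.4.1.674.10892.1" becomes "1.3.6.1.4.1.674.10892.5.4") can be written as a small Python helper. This is only an illustration of the rule of thumb, not code from the plugin; the function name is made up, and as the message says, the mapping holds "for the most part" and only "to a degree" for storage OIDs.

```python
# Prefixes quoted in the message above.
OMSA_PREFIX = "1.3.6.1.4.1.674.10892.1"
IDRAC_PREFIX = "1.3.6.1.4.1.674.10892.5.4"


def omsa_to_idrac(oid):
    """Rewrite an OMSA OID to its presumed iDRAC7 counterpart.

    OIDs outside the OMSA subtree are returned unchanged. The prefix
    match respects component boundaries, so '...10892.10' is not treated
    as being under '...10892.1'.
    """
    oid = oid.lstrip(".")
    if oid == OMSA_PREFIX or oid.startswith(OMSA_PREFIX + "."):
        return IDRAC_PREFIX + oid[len(OMSA_PREFIX):]
    return oid
```

Applied to the chassis model name OID from the script, this yields the OID given in the reply. Whether a translated OID actually exists on a given iDRAC firmware still has to be checked with an SNMP query.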
Re: [Nagios-users] Nagios openmanage ERROR: XML transformation failed
"Lorenz, Stephan" writes:

> since installing libxml2, libxml2-devel and curl, the Nagios installation
> on our Dell R720xd server reports XML errors.
>
> Problem running 'omreport storage controller': Error! XML Transformation failed
> Problem running 'omreport chassis memory': Error! XML Transformation failed
> Problem running 'omreport chassis fans': Error! XML Transformation failed
> Problem running 'omreport chassis pwrsupplies': Error! XML Transformation failed
> Problem running 'omreport chassis temps': Error! XML Transformation failed
> Problem running 'omreport chassis processors': Error! XML Transformation failed
> Problem running 'omreport chassis volts': Error! XML Transformation failed
> Problem running 'omreport chassis batteries': Error! XML Transformation failed
> Problem running 'omreport chassis pwrmonitoring': Error! XML Transformation failed
> Problem running 'omreport chassis intrusion': Error! XML Transformation failed
> Problem running 'omreport chassis removableflashmedia': Error! XML Transformation failed
> Chassis Service Tag is bogus: 'N/A'
>
> I am using Nagios 3.5.1, check_openmanage 3.7.9, Openmanage 7.2.0 on
> Centos 6.4 2.6.32-358.11.1.el6.centos.plus.x86_64.
>
> When I run check_openmanage or omreport manually everything is fine. I
> tried to reinstall nagios-plugins-openmanage and php-xml for a start, but
> that did not help. I cannot remove libxml2 and the rest since it is
> needed elsewhere.
>
> Does anyone have a suggestion of how to fix this error?

Given that it works when you run the commands manually I'm suspecting
some sort of permission issue. Try running the commands as the NRPE
user, and also try running it from Nagios with SELinux in permissive
mode (needs to be run by the NRPE daemon with the correct SELinux
domain).

Check out this link about using check_openmanage with SELinux in
enforcing mode:

  http://folk.uio.no/trondham/software/check_openmanage.html#selinux-considerations

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage plugin and storage
Nic Bernstein writes:

> Regarding the non-certified disks problem... There is a special
> blacklisting keyword to suppress the message about non-certified disks:
>
> check_openmanage -b pdisk_cert=all
>
> Please try this and see if it resolves your issue. Using blacklisting
> should also disable the global health check.
>
> Ah, that's just what we need. Much appreciated...
>
> No, that doesn't seem to be in my version (3.7.9, downloaded yesterday)
>
> onlight@monitor:~$ perl check_openmanage -H host -C secret -b pdisk_cert=all
> Physical Disk 0:1:0 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online
> Physical Disk 0:1:1 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online
> onlight@monitor:~$ echo $?
> 1
>
> I guess I'll wait for a patch.

Are you sure you didn't test this with the 7.1.0 workaround manually
removed?

> Say Trond, I sent you some notes last week about enhancements we made to
> your check_linux_bonding plugin. Would you prefer I re-post those to the
> list instead?

Sorry for being non-responsive of late. I've been swamped at work lately
and have attained somewhat of an email backlog. No need to resend :)

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage plugin and storage
Nic Bernstein writes:

> We've recently been experimenting with Trond Hasle Amundsen's
> check_openmanage on a large network with about a hundred Dell servers of
> various ages, capabilities, etc. Mostly PE-2950, R210, R410 and R720.
> Much thanks to Trond for all his great work on Nagios plugins and other
> projects, by the way.
>
> We've hit a wall, however, with the storage monitoring aspects of this
> plugin.
>
> For example, here's a quite specific case. This is a new PE R720, in
> debug:
>
> onlight@monitor:~$ check_openmanage -H host -C secret -d
>    System:      PowerEdge R720       OMSA version:    7.1.0
>    ServiceTag:  ###                  Plugin version:  3.7.9
>    BIOS/date:   1.2.6 05/10/2012     Checking mode:   SNMPv2c UDP/IPv4
> -----------------------------------------------------------------------
>    Storage Components
> =======================================================================
>    STATE |       ID | MESSAGE TEXT
> ---------+----------+--------------------------------------------------
>       OK |        0 | Controller 0 [PERC H310 Mini] is Ready
>  WARNING |  0:0:1:0 | Physical Disk 0:1:0 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online, Not Certified
>  WARNING |  0:0:1:1 | Physical Disk 0:1:1 [Ata ST2000DM001-9YN164, 2.0TB] on ctrl 0 is Online, Not Certified
>       OK |      0:0 | Logical Drive '/dev/sda' [RAID-1, 1862.50 GB] is Ready
>       OK |      0:0 | Connector 0 [SAS] on controller 0 is Ready
>       OK |      0:1 | Connector 1 [SAS] on controller 0 is Ready
>       OK |    0:0:1 | Enclosure 0:0:1 [Backplane] on controller 0 is Ready
> -----------------------------------------------------------------------
>    Chassis Components
> =======================================================================
>    STATE |       ID | MESSAGE TEXT
> ---------+----------+--------------------------------------------------
>       OK |        0 | Memory module 0 [DIMM_A1, 4096 MB] is Ok
>       OK |        1 | Memory module 1 [DIMM_A2, 4096 MB] is Ok
>       OK |        2 | Memory module 2 [DIMM_A3, 4096 MB] is Ok
>       OK |        3 | Memory module 3 [DIMM_A4, 4096 MB] is Ok
>       OK |        0 | Chassis fan 0 [System Board Fan1 RPM] reading: 1200 RPM
>       OK |        1 | Chassis fan 1 [System Board Fan2 RPM] reading: 1080 RPM
>       OK |        2 | Chassis fan 2 [System Board Fan3 RPM] reading: 1200 RPM
>       OK |        3 | Chassis fan 3 [System Board Fan4 RPM] reading: 1080 RPM
>       OK |        4 | Chassis fan 4 [System Board Fan5 RPM] reading: 1080 RPM
>       OK |        5 | Chassis fan 5 [System Board Fan6 RPM] reading: 1080 RPM
>       OK |        0 | Power Supply 0 [AC]: Presence detected
>       OK |        0 | Temperature Probe 0 [System Board Inlet Temp] reads 26 C (min=3/-7, max=42/47)
>       OK |        1 | Temperature Probe 1 [System Board Exhaust Temp] reads 33 C (min=8/3, max=70/75)
>       OK |        2 | Temperature Probe 2 [CPU1 Temp] reads 49 C (min=8/3, max=83/88)
>       OK |        0 | Processor 0 [Intel Xeon E5-2603 0 1.80GHz] is Present
>       OK |        0 | Voltage sensor 0 [CPU1 VCORE PG] is Good
>       OK |        1 | Voltage sensor 1 [System Board 3.3V PG] is Good
>       OK |        2 | Voltage sensor 2 [System Board 5V PG] is Good
>       OK |        3 | Voltage sensor 3 [CPU1 PLL PG] is Good
>       OK |        4 | Voltage sensor 4 [System Board 1.1V PG] is Good
>       OK |        5 | Voltage sensor 5 [CPU1 M23 VDDQ PG] is Good
>       OK |        6 | Voltage sensor 6 [CPU1 M23 VTT PG] is Good
>       OK |        7 | Voltage sensor 7 [System Board FETDRV PG] is Good
>       OK |        8 | Voltage sensor 8 [CPU1 VSA PG] is Good
>       OK |        9 | Voltage sensor 9 [CPU1 M01 VDDQ PG] is Good
>       OK |       10 | Voltage sensor 10 [System Board NDC PG] is Good
>       OK |       11 | Voltage sensor 11 [CPU1 VTT PG] is Good
>       OK |       12 | Voltage sensor 12 [System Board 1.5V PG] is Good
>       OK |       13 | Voltage sensor 13 [PS2 PG Fail] is Good
>       OK |       14 | Voltage sensor 14 [System Board PS1 PG Fail] is Good
>       OK |       15 | Voltage sensor 15 [System Board BP1 5V PG] is Good
>       OK |       16 | Voltage sensor 16 [CPU1 M01 VTT PG] is Good
>       OK |       17 | Voltage sensor 17 [PS1 Voltage 1] reads 114 V
>       OK |        0 | Battery probe 0 [System Board CMOS Battery] is Presence Detected
>       OK |        0 | Amperage probe 0 [PS1 Current 1] reads 0.6 A
>       OK |        1 | Amperage probe 1 [System Board Pwr Consumption] reads 56 W
>       OK |        0 | Chassis intrusion 0 detection: Ok (Not Breached)
>       OK |        0 | SD Card 0 [vFlash] is Absent
> -----------------------------------------------------------------------
>    Other messages
> =======================================================================
Re: [Nagios-users] Check_Openmanage not ignoring non-certified drives
"Bob The Junkie" writes:

> I'm using Nagios and Check_Openmanage to keep an eye on some Dell R710
> servers we've recently acquired, and I'm having problems trying to stop
> warnings with non-Dell certified drives appearing in the alert log.
>
> I've separated out the different components on the servers to check into
> their own nagios checks so my config files appear as such:
>
> In nagios:
>
> SERVICES.CFG
>
> Check_command check_dell_components!memory
>
> Check_command check_dell_components!alertlog
>
> COMMANDS.CFG
>
> Command_name Check_dell_components
>
> Command_line check_nrpe -H $HOSTADDRESS$ -p 5666 -t 30 -c Check_OpenManage -a --only $ARG1$
>
> On each Server in nsclient.ini:
>
> Check_OpenManage = scripts\\check_openmanage.exe $ARG1$ --perfdata
>
> The problem I'm having is that in one of my checks that checks the health
> of the alert log, I'm getting a consistent warning message ("Alert log
> content: 0 critical, 6 non-critical, 36 ok"). I've traced this down to
> the 6 non-Dell certified drives in the server, and I can indeed see
> within OMSA that the only 6 warnings all state "Controller event log: PD
> 04(e0x20/s4) is not a certified drive: Controller 0 (PERC 6/i
> Integrated)".
>
> So far, so good. Reading through the documentation I can see that
> Check_Openmanage includes a blacklisting option specifically for this
> event, "pdisk_cert - Suppress warning message about non-certified
> physical disk", but no matter what I try, I can't seem to get
> Check_Openmanage to ignore these problems. An example of the command I'm
> running on the command line is:
>
> check_openmanage.exe -s -a -b pdisk_cert=all
>
> Which returns:
>
> WARNING: Alert log content: 0 critical, 6 non-critical, 36 ok
>
> Now I'm assuming the problem here is being caused by the Alert Log
> generating the errors, and not the physical disk directly causing the
> errors, which is why blacklisting the certificate problem on the physical
> disk isn't doing me any good.
>
> Which leads me onto my question: is there anything I can do to ignore
> these errors (and thus stop Nagios from complaining) apart from excluding
> the alert log when I do my checks?

Hi,

Your analysis is correct. The check_openmanage plugin's check of the log
content is limited to counting the number of critical, warning and ok
messages. It doesn't do any log parsing. The intended usage of the log
checking is as a precaution, if you're concerned about missing some
temporary problem. After all, the plugin does active checking and will
only report the state of the hardware right now.

In your case I think that the easiest solution would be to stop using
the log checking with check_openmanage, and either use a fully fledged
log parsing plugin (such as check_logfiles) or write your own simple
plugin where you just filter out the certificate stuff.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
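[Editor's sketch] The "write your own simple plugin" suggestion in this reply could look roughly like the Python below. It is a sketch under stated assumptions: the input is a hypothetical list of (severity, message) tuples, and how those entries are obtained from OMSA is not covered in the thread and is left out here. The ignored pattern is the alert text quoted in the question, the summary line mimics the plugin output quoted there, and the exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL).

```python
import re

# Alert text quoted in the question; entries matching it are skipped.
IGNORED = re.compile(r"is not a certified drive")

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def summarize_alert_log(entries):
    """Count log entries by severity, skipping non-certified-drive notices.

    `entries` is a hypothetical list of (severity, message) tuples where
    severity is "critical", "non-critical" or "ok". Returns a tuple of
    (exit_code, status_line).
    """
    counts = {"critical": 0, "non-critical": 0, "ok": 0}
    for severity, message in entries:
        if IGNORED.search(message):
            continue
        counts[severity] += 1

    if counts["critical"]:
        code, label = CRITICAL, "CRITICAL"
    elif counts["non-critical"]:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"

    text = "%s: Alert log content: %d critical, %d non-critical, %d ok" % (
        label, counts["critical"], counts["non-critical"], counts["ok"])
    return code, text
```

A real plugin would print the status line and call sys.exit() with the code. For anything beyond this one filter, a dedicated log plugin such as check_logfiles, as suggested in the reply, is probably the better route.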
Re: [Nagios-users] New check_openmanage error after updating to OMSA 7.2.0-4
Steve Jenkins writes:

> And... to answer my own question, yes - 3.7.9 does indeed fix
> this. New version is probably already in the repos, waiting out the
> testing period.

Not sure which repos you're referring to, but I'll assume Fedora and/or
Fedora EPEL. I didn't get around to submitting updates until today. They
should arrive in the testing repos in a couple of days. The updates need
to stay in testing for a week for Fedora and two weeks for EPEL before
they can be pushed to stable.

If you can't wait, you can download the RPMs via the Fedora build
system, you'll find links here:

  https://admin.fedoraproject.org/updates/search/nagios-plugins-openmanage

When it has arrived in testing (and in your local mirror), you can
install it with (example for EPEL):

  yum --enablerepo=epel-testing update nagios-plugins-openmanage

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: timeout vs. SNMP timeout
Andrew Daugherity writes:

>> Please try this version (named 3.7.8-beta2) and let me know if it works
>> around your problem. Usage:
>>
>> check_openmanage --snmp-timeout
>
> I think I fixed my problem (for the time being at least) by restarting
> OMSA on that server. Restarting snmpd didn't solve anything, nor did
> my timeout hack (which just gave me an UNKNOWN status - plugin timeout
> instead of SNMP CRITICAL when it randomly failed). Whenever the check
> failed, it would hang indefinitely, so it was not a case of slow SNMP.
> Thanks for the added option, though; I think someone may find it
> useful.

Yes, I agree. I'll keep it.

> Regarding your fix:
> The timeout option does appear to get passed to SNMP, however the
> actual timeout is twice what is specified. E.g. --snmp-timeout=1, get
> SNMP critical message after 2 seconds; --snmp-timeout=14, SNMP
> critical at 28 seconds; --snmp-timeout=15 or higher, get UNKNOWN:
> PLUGIN TIMEOUT message at 30 seconds. (I used a host without snmpd
> running for the timeout tests.) I can't see anything obviously wrong
> with your code, but it behaves this way on both SLES 11 SP1 (Perl
> 5.10, net-snmp 5.4.2.1, Net::SNMP 6.0.1) and OS X 10.8 (Perl 5.12.4,
> net-snmp 5.6, Net::SNMP 6.1 [from CPAN]).

Hmm.. kind of confusing. It is due to the fact that Net::SNMP does one
retry (with the same timeout) before it bails out. This is adjustable
with the '-retries' parameter to the SNMP object. The default is 1. If I
set it to 0, the plugin times out in the SNMP object at the specified
time as you would expect.

Thanks for pointing this out, I should make a note of it in the manual
page.

> You probably also want to add this option to the help/usage message.

It won't go into the help output, as that only covers the most popular
options, but I'll add it to the manual page.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
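[Editor's sketch] The doubling described in this exchange follows from simple arithmetic: with one retry at the same timeout, Net::SNMP gives up after timeout * (retries + 1) seconds, unless the plugin's own timeout fires first. The Python below just restates that arithmetic. The function names are made up, and the 30-second plugin timeout is the value seen in the messages quoted in the thread, not necessarily what every installation uses.

```python
# Plugin timeout seen in the thread ("timed out after 30 seconds").
PLUGIN_TIMEOUT = 30


def snmp_gives_up_after(snmp_timeout, retries=1):
    """Seconds until the SNMP session reports 'No response'.

    One initial attempt plus `retries` retries, each waiting `snmp_timeout`
    seconds. retries=1 is the Net::SNMP default mentioned in the reply.
    """
    return snmp_timeout * (retries + 1)


def first_timeout(snmp_timeout, retries=1, plugin_timeout=PLUGIN_TIMEOUT):
    """Return which timeout is hit first, and after how many seconds.

    A tie goes to the plugin timeout, matching the reported behaviour
    for --snmp-timeout=15 (UNKNOWN: PLUGIN TIMEOUT at 30 seconds).
    """
    snmp_total = snmp_gives_up_after(snmp_timeout, retries)
    if snmp_total < plugin_timeout:
        return ("snmp", snmp_total)
    return ("plugin", plugin_timeout)
```

With these assumptions, first_timeout(14) gives ("snmp", 28) and first_timeout(15) gives ("plugin", 30), matching the numbers reported in the quoted test. With retries=0 the SNMP timeout is hit at the specified value, as the reply describes.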
Re: [Nagios-users] check_openmanage: timeout vs. SNMP timeout
Trond Hasle Amundsen writes:

> A new option to specify the SNMP object timeout would be easy to add,
> and is in my opinion a cleaner approach than just passing the plugin
> timeout.

Such an option is now implemented in the Git version:

  http://git.uio.no/git/?p=check_openmanage.git;a=commit;h=32564b44c2631eeac03a920f0c180fb12e4b29c8

Please try this version (named 3.7.8-beta2) and let me know if it works
around your problem. Usage:

  check_openmanage --snmp-timeout

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: timeout vs. SNMP timeout
Andrew Daugherity writes:

> I'm troubleshooting an issue where one server is occasionally not
> responding (I think it's a firewall or snmpd issue, not this plugin), and
> I noticed that changing the timeout option to check_openmanage did not
> affect how long it took before receiving the
>
> SNMP CRITICAL: No response from remote host A.B.C.D
>
> message. Looking at the code I see the timeout option is _not_ passed to
> the Net::SNMP session object, so the SNMP connection timeout uses the
> default value (5 seconds according to the Net::SNMP man page, but 10
> seconds in my testing).
>
> If I pass the timeout option to the Net::SNMP->session object like so:
>
> diff --git a/check_openmanage b/check_openmanage
> index b6abec5..3558ed4 100755
> --- a/check_openmanage
> +++ b/check_openmanage
> @@ -860,6 +860,7 @@ sub snmp_initialize {
>          '-port'     => $opt{port},
>          '-hostname' => $opt{hostname},
>          '-version'  => $opt{protocol},
> +        '-timeout'  => $opt{timeout},
>      );
>
>      # Setting the domain (IP version and transport protocol)
>
> Then it does obey the timeout option and I instead get the
>
> PLUGIN TIMEOUT: check_openmanage timed out after 30 seconds
>
> message. This might be by design though, to have a shorter SNMP timeout
> and different error messages, but it was perplexing to me why the timeout
> option was seemingly not working. Perhaps a different option for the
> SNMP timeout, or a documentation clarification, is a better way?

Hello Andrew,

Your analysis of this problem is correct, you're hitting the Net::SNMP
timeout, which defaults to 5 seconds. There are two reasons why the
--timeout parameter isn't passed to the SNMP object:

  1. I never saw any reason to :) This is the first time I've heard of
     problems relating to it.

  2. The SNMP object timeout has limitations, it can only be between 1
     and 60 seconds. I don't know how Net::SNMP reacts if the specified
     value is outside of this range.

The documentation is lacking on this, as you pointed out, and I'll fix
that. A new option to specify the SNMP object timeout would be easy to
add, and is in my opinion a cleaner approach than just passing the
plugin timeout.

PS. I'm going away for the weekend and I'm leaving in a few minutes, so
I'll get back to you on this early next week.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
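[Editor's sketch] Since the reply notes that the SNMP object timeout must lie between 1 and 60 seconds, and that the behaviour outside that range is unknown, one defensive approach is to validate the value before it ever reaches the SNMP session. The Python below illustrates that idea only. It is not how check_openmanage handles it, and the function name is hypothetical.

```python
# Range stated in the reply for the SNMP object timeout, in seconds.
SNMP_TIMEOUT_MIN = 1
SNMP_TIMEOUT_MAX = 60


def validate_snmp_timeout(value):
    """Parse a user-supplied SNMP timeout and reject out-of-range values.

    Raises ValueError for non-numeric input or values outside 1..60
    seconds, so the SNMP library is never handed a value whose handling
    is unspecified.
    """
    seconds = float(value)  # raises ValueError on non-numeric strings
    if not SNMP_TIMEOUT_MIN <= seconds <= SNMP_TIMEOUT_MAX:
        raise ValueError(
            "SNMP timeout must be between %d and %d seconds, got %r"
            % (SNMP_TIMEOUT_MIN, SNMP_TIMEOUT_MAX, value))
    return seconds
```

Rejecting the value outright, rather than silently clamping it, keeps the behaviour visible to whoever wrote the command definition.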
Re: [Nagios-users] Problem with check_openmanage
"Jens Hyllegaard (Soft Design A/S)" writes:

> I am using version 3.7.6 of check_openmanage.
>
> I have disabled notifications for battery charge events in the call to
> check_openmanage but I still get notifications from Nagios.
>
> This is the command line I use:
>
> $USER1$/check_openmanage -s -p -H $HOSTADDRESS$ -b ps=all -b bat_charge
>
> This is the current output from check_openmanage for one of the servers.
>
> WARNING: Cache Battery 0 in controller 0 is Charging (Ready) [probably
> harmless]

Hello Jens,

There is a slight typo in your command definition. Replace with:

  $USER1$/check_openmanage -s -p -H $HOSTADDRESS$ -b ps=all -b bat_charge=all

..and you should be fine :)

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
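[Editor's sketch] The mistake fixed in this reply (a "-b" value with no "=all" part) is easy to overlook in a long command definition. The Python below is a small lint for that one case. It assumes only the "-b keyword=value" form that appears in this thread; the plugin's manual page is the authority on the full blacklist syntax, and the function name is made up.

```python
def malformed_blacklist_args(argv):
    """Return the '-b' values in `argv` that are not of the form keyword=value.

    Only the '-b keyword=value' form seen in the thread is recognized;
    anything else following '-b' is reported.
    """
    bad = []
    for i, arg in enumerate(argv):
        if arg == "-b" and i + 1 < len(argv):
            value = argv[i + 1]
            keyword, sep, ids = value.partition("=")
            if not (keyword and sep and ids):
                bad.append(value)
    return bad
```

Run against the original command line split on whitespace, this reports "bat_charge"; the corrected command line produces an empty list.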
Re: [Nagios-users] check_openmanage: fix build on SUSE (docbook pkg name)
Andrew Daugherity writes:

> Simple fix -- the package is named 'docbook-xsl-stylesheets' instead
> of 'docbook-style-xsl'. I added a variable for this to the global "if
> suse" section.

Thanks Andrew, applied and pushed to master.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Dell Openmanage
"Sven Dohmen" writes:

> Since several months we are using the Dell Openmanage plugin from
> http://folk.uio.no/trondham/software/check_openmanage.html. This has been
> working fine until the last couple weeks.
>
> For some servers we are getting the following results back:
>
> W: Controller 0 [PERC 6/i Integrated]: Firmware '6.2.0-0013' is out of date
> -- SYSTEM: PowerEdge R710, SN:
> INTERNAL ERROR: Use of uninitialized value within %fw_type in string eq at
> (eval 1) line 4976.
> INTERNAL ERROR: Use of uninitialized value within %fw_type in pattern match
> (m//) at (eval 1) line 4980.
>
> I noticed this only happens when 1 of the drivers is out of date. Is there a
> solution for without directly updating the firmware (which is already planned
> over several weeks).

In case anyone else has this issue: Sven and I worked on this off-list,
and we identified it as an error related to using the '-o' option over
SNMP on servers equipped with iDRAC6 or iDRAC7 management cards.

check_openmanage has been fixed and a new release (version 3.7.6) is
available:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

Note for RHEL and Fedora users: The new release has been submitted as an
update for Fedora and Fedora EPEL. It is currently in testing, and can
be installed with:

  yum --enablerepo=\*testing update nagios-plugins-openmanage

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Warning alert isn't working
Leonardo Bacha Abrantes writes:

> Hi everybody!
>
> I'm using check_openmanage plugin in nagios to monitoring the temperature
> of my dell servers.
> It's working, however, the warning and critical alerts that I configure
> are not working.
>
> [root@monitor:/etc/openmanage]# /usr/lib/nagios/plugins/check_openmanage -w 25
> -c 30 -H 10.11.12.1 -C Test--only temp
> TEMPERATURES OK - 1 temperature probes checked:Temperature Probe 0 [System
> Board Ambient Temp] reads 30 C (min=8/3, max=42/47)
>
> The temperature is 30 and the check should appear WARNING because I used
> -w 25.

Hello Leonardo,

The syntax you're using with the '-w' and '-c' options is wrong. From
the manual page:

  -w, --warning STRING or FILE
      Override the machine-default temperature warning thresholds.
      Syntax is "id1=max[/min],id2=max[/min],...". The following
      example sets warning limits to max 50C for probe 0, and max 45C
      and min 10C for probe 1:

          check_openmanage -w 0=50,1=45/10

      The minimum limit can be omitted, if desired. Most often, you are
      only interested in setting the maximum thresholds.

      This parameter can be either a string with the limits, or a file
      containing the limits string. The option can be specified
      multiple times.

      NOTE: This option should only be used to narrow the field of OK
      temperatures wrt. the OMSA defaults. To expand the field of OK
      temperatures, increase the OMSA thresholds. See the plugin web
      page for more information.

  -c, --critical STRING or FILE
      Override the machine-default temperature critical thresholds.
      Syntax and behaviour is the same as for warning thresholds
      described above.

The reason that you need to specify the ID of the temperature probes is
that there may be more than one, each with its own thresholds. In your
case there is only one probe and its ID is 0, so replace your command
above with:

  check_openmanage -w 0=25 -c 0=30 -H 10.11.12.1 -C Test --only temp

That should do the trick.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
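To make the threshold syntax from the manual page concrete, here is a minimal Python sketch that parses an "id1=max[/min],id2=max[/min],..." string. It is not the plugin's own code (check_openmanage is written in Perl); the function name parse_thresholds and the returned dictionary layout are assumptions made for illustration only.

```python
def parse_thresholds(spec):
    """Parse 'id1=max[/min],id2=max[/min],...' into {probe_id: (max, min)}.

    Hypothetical helper for illustration; min is None when omitted.
    """
    limits = {}
    for item in spec.split(","):
        probe, sep, values = item.partition("=")
        # A bare number such as "25" has no probe ID and is rejected.
        if not sep or not probe.strip().isdigit():
            raise ValueError(f"expected 'id=max[/min]', got {item!r}")
        maximum, _, minimum = values.partition("/")
        limits[int(probe)] = (float(maximum), float(minimum) if minimum else None)
    return limits


if __name__ == "__main__":
    print(parse_thresholds("0=50,1=45/10"))
    try:
        parse_thresholds("25")
    except ValueError as err:
        print("rejected:", err)
```

Under these assumptions, the original "-w 25" is rejected because it names no probe, while "0=25" is accepted as a maximum for probe 0.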
Re: [Nagios-users] check_openmanage: Physical Disk ... Undefined value 4096
Helmut Wollmersdorfer writes:

> Physical Disk 0:0:0 [Dell WDC WD1003FBYX-18Y7B0, 1.0TB] on ctrl 0
> needs attention: Undefined value 4096

Hello Helmut,

The "state" value for physical disks via SNMP is an integer, which is
translated by the plugin. There are a few defined values, and 4096 is
not one of them.

> On the console of the server:
>
> # /opt/dell/srvadmin/bin/omreport storage pdisk controller=0 vdisk=0
> List of Physical Disks belonging to VD10A
>
> Controller PERC H700 Integrated (Slot 4)
>
> Span 0
> ID                        : 0:0:0
> Status                    : Unknown
> Name                      : Physical Disk 0:0:0
> State                     : Unknown
> Power Status              : Spun Up
> Bus Protocol              : SATA
> Media                     : HDD
> Revision                  : 01.01V02
> Failure Predicted         : No
> Certified                 : Yes
> Encryption Capable        : No
> Encrypted                 : Not Applicable
> Progress                  : Not Applicable
> Mirror Set ID             : 0
> Capacity                  : 931.00 GB (999653638144 bytes)
> Used RAID Disk Space      : 931.00 GB (999653638144 bytes)
> Available RAID Disk Space : 0.00 GB (0 bytes)
> Hot Spare                 : No
> Vendor ID                 : DELL
> Product ID                : WDC WD1003FBYX-18Y7B0
> Serial No.                : WD-WCAW3145836558365
> Part Number               : TH0V8FCR1255213BC4RGA00
> Negotiated Speed          : 3.00 Gbps
> Capable Speed             : 3.00 Gbps
> Manufacture Day           : Not Available
> Manufacture Week          : Not Available
> Manufacture Year          : Not Available
> SAS Address               : 443322110700
>
> [same for all 4 disks of the array]
>
> Thus it seems that check_openmanage works correctly. Also the disk-
> array seems to work correctly (no error messages in the logs).
>
> Is this some sort of wrong diagnostic from the firmware/controller?

No, this is not normal behaviour. I've seen this only on disks that were
so damaged that Openmanage failed miserably when attempting to get info
from them. Clearly this is not the case here, as you get the same error
on multiple disks and they otherwise work fine.

If you haven't already, you should try upgrading all BIOS and firmware
on the server, especially the controller firmware. You should also
upgrade Openmanage if you're not running the latest version (6.5.0). If
all else fails, I would contact Dell support and have them look at it.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
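The integer-to-text translation described in the reply can be sketched roughly as follows. The table below is a small, hypothetical subset chosen for illustration; the plugin's actual table (in Perl) is not shown in this thread and is larger. Only the fallback text "Undefined value N" is taken from the plugin output quoted above.

```python
# Hypothetical subset of SNMP physical disk state codes (assumption,
# not copied from the plugin source).
PDISK_STATE = {
    0: "Unknown",
    1: "Ready",
    2: "Failed",
    3: "Online",
    4: "Offline",
}


def translate_state(code):
    """Translate an integer state into text, flagging unmapped values."""
    return PDISK_STATE.get(code, f"Undefined value {code}")


if __name__ == "__main__":
    print(translate_state(3))
    print(translate_state(4096))
```

A value such as 4096 falls through to the fallback, which is why the plugin reports it instead of a named state.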
Re: [Nagios-users] SELinux and RHEL6.2 preventing disk checks via NRPE
Dennis Kuhlmeier writes:

> Geez, there are a lot more contexts set than I thought. I should
> probably remove duplicate entries, right?

The labels in /etc/selinux/targeted/contexts/files/file_contexts are
there by default and should not be touched. The file
/etc/selinux/targeted/contexts/files/file_contexts.local contains local
additions or adjustments. If there are entries there that you think
ought to be removed, you should remove them with:

  semanage fcontext -d ''

Don't edit the file directly :)

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage spec file fixes for SUSE
"Daugherity, Andrew W" writes:

> First of all, thanks for making this plugin. It works well and is
> very handy. As requested in the documentation, I am sending this to
> the nagios-users list rather than directly to the author.

Hello Andrew,

Excellent :) Usually a public forum is better, where everybody can
participate and share their insight.

> With some minor modifications, the package builds properly on SUSE.
> There are two main Nagios packaging differences from RedHat:
>
> 1) All Nagios plugins are installed to /usr/lib/nagios/plugins, even
> on 64-bit (there is no /usr/lib64/nagios directory). This may not
> make the most sense, but it is what is, and being consistent with
> other Nagios packages is good.
>
> 2) Non-binary plugin RPMs (e.g. Perl scripts only) use noarch, while
> binary plugins use the corresponding arch. For examples of both,
> browse the build service repo at
> http://download.opensuse.org/repositories/server:/monitoring/SLE_11.1/
> Being a Perl script, check_openmanage falls under the former.
>
> This is easily solved with an %if block to make a universal RPM spec:
> BEGIN PATCH
> --- nagios-plugins-openmanage.spec.orig 2011-10-05 10:00:18.0 -0500
> +++ nagios-plugins-openmanage.spec 2011-12-01 15:02:10.0 -0600
> @@ -5,6 +5,16 @@
>  # No binaries here, do not build a debuginfo package
>  %global debug_package %{nil}
>
> +# SUSE installs Nagios plugins under /usr/lib, even on 64-bit
> +# It also uses noarch for non-binary Nagios plugins
> +%if %{defined suse_version}
> +%global nagiospluginsdir /usr/lib/nagios/plugins
> +BuildArch: noarch
> +%else
> +%global nagiospluginsdir %{_libdir}/nagios/plugins
> +%endif
> +
> +
>  Name: nagios-plugins-openmanage
>  Version: 3.7.3
>  Release: 1%{?dist}
> END PATCH
>
> I also tested building on CentOS 5 to make sure nothing broke there,
> and indeed, nothing changed there.

Thanks for the patch, applied. However, there have been some changes to
the spec file lately. Among them is an added Requires on the
nagios-plugins package, which owns the /usr/lib(64)?/nagios/plugins
directory. Hopefully SUSE does the same in this respect. The updated
spec file is available here:

  http://folk.uio.no/trondham/software/tmp/nagios-plugins-openmanage.spec

PS. check_openmanage has been added to Fedora and EPEL, but there are
some SELinux issues. Until these are resolved I'll hold off pushing it
to stable, but it is available in testing.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] SELinux and RHEL6.2 preventing disk checks via NRPE
Dennis Kuhlmeier writes:

> Hello,
>
> after upgrading to RHEL6.2 I have problems checking some
> filesystems. Always the same three FS on all hosts, others work fine.
>
> /boot
> /home
> /var/log/audit
>
> $ ./check_nrpe -H backup -c check_fs_boot
> DISK CRITICAL - /boot is not accessible: Permission denied
>
> Now I disable SELinux and it works!
> $ ./check_nrpe -H backup -c check_fs_boot
> DISK OK - free space: /boot 36 MB (39% inode=99%);| /boot=55MB;96;;0;96
>
> Although not a single line is logged on the monitored host, neither
> in messages nor in audit.log
>
> I already had a local policy created for the nrpe daemon when RHEL6
> was introduced, as somehow many checks failed, although the user
> nrpe was running in was allowed to perform all checks, the nrpe
> daemon itself couldn't. I'll attach the policy, although at one
> point I gave up and just set the entire process to permissive mode.
> (note that I tried to extend rights on boot filesystem in this
> policy already, although it would seem to be unnecessary)
>
> Anybody experiencing something alike or any suggestions about how to
> handle nrpe and RHEL6(.2) in a better way than I am?

RHEL6 has the following labels for use with Nagios plugins:

  # grep nagios /etc/selinux/targeted/contexts/files/file_contexts | grep plugin_exec | cut -d: -f3 | sort -u
  nagios_admin_plugin_exec_t
  nagios_checkdisk_plugin_exec_t
  nagios_mail_plugin_exec_t
  nagios_services_plugin_exec_t
  nagios_system_plugin_exec_t
  nagios_unconfined_plugin_exec_t

Try setting the confined types first, e.g.:

  chcon -t nagios_checkdisk_plugin_exec_t /path/to/check_fs_boot

If none of them works properly, you have nagios_unconfined_plugin_exec_t
as a last resort. When you find one that works, make it permanent with:

  semanage fcontext -a -t '/path/to/check_fs_boot'

You may also have to set proper labels on the path leading up to the
actual plugin.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
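The "confined types first, unconfined as a last resort" order from the reply can be written down as a small dry-run helper. This Python sketch only builds and prints the chcon commands; it runs nothing and changes no labels. The type names are the ones listed in the reply; the helper name chcon_candidates and the order among the confined types are assumptions.

```python
import shlex

# Confined types from the reply (order among them is arbitrary here).
CONFINED_TYPES = [
    "nagios_checkdisk_plugin_exec_t",
    "nagios_system_plugin_exec_t",
    "nagios_services_plugin_exec_t",
    "nagios_admin_plugin_exec_t",
    "nagios_mail_plugin_exec_t",
]
LAST_RESORT = "nagios_unconfined_plugin_exec_t"


def chcon_candidates(plugin_path):
    """Return chcon commands to try in order, unconfined type last."""
    quoted = shlex.quote(plugin_path)
    return [f"chcon -t {t} {quoted}" for t in CONFINED_TYPES + [LAST_RESORT]]


if __name__ == "__main__":
    for command in chcon_candidates("/path/to/check_fs_boot"):
        print(command)
```

Each printed command would be tried by hand, re-running the NRPE check after each one, until the check succeeds.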
Re: [Nagios-users] check_openmanage plugin: " Couldn't run command ..."
Corcoran Smith writes:

> First message, so please excuse any failures in format, etc!
>
> Got two issues with two boxes (out of 160!) using check_openmanage:
>
> 1) Couldn't run command 'c:\pro... ' etc
> 2) Unrecognized character xA8: marked by <-- HERE after <-- HERE near
> column 1 at /loader/HASH(0xa7c42c)/UNIVERSAL.pm line 1.
>
> both are using the windows exe

Hi Corcoran,

I'll need more data to debug the first issue, e.g. the full error
message from the plugin. Unless both issues appear on the same
server(?), in which case issue 1 is probably caused by issue 2.

Regarding issue 2, I've seen this once before. A disk was so damaged
that OMSA failed while getting info from it, and gave an error message
like the one above: "unrecognized character...". This output is
something that the plugin doesn't expect and couldn't possibly prepare
for, so it throws an error.

You need to identify the failed component, which probably needs to be
replaced. Try running 'omreport' commands to find it. Start with
'omreport storage pdisk controller=0'.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage on CentOS 5.6 Hosts
the entrox writes:

> i've been using the check_openmanage script to monitor about two dozens
> of dell servers without a hitch (all Windows based) and we just set up
> about 15 or so new servers but this time running CentOS, i of course
> installed the OMSA via Dell's repository and also enabled SNMP but i
> cant seem to get the command to work on those hosts.
>
> i am trying to run the debug command to look at the entire output like
> this:
>
> [root@MONITOR02 plugins]# ./check_openmanage -H HOSTIP -C COMMUNITY -d
> ERROR: (SNMP) OpenManage is not installed or is not working correctly

This error means that the SNMP service on the monitored host is working
and we get a reply, but the OIDs for OMSA are not present.

> i of course checked where the omreport binary was at and its where the
> script is looking for it:
>
> [root@mvarutestvmbase01 ~]# find / -name omreport
> /opt/dell/srvadmin/sbin/omreport
> /opt/dell/srvadmin/bin/omreport
> [root@mvarutestvmbase01 ~]#

When using SNMP, the plugin doesn't utilize the omreport binary in any
way. It doesn't care where it is installed. BTW, the above location is
correct and is the default.

> just to double check i went ahead and looked if the OMSA was working, i
> went via web and the console shows up no problem at all, if i
> authenticate it shows all the information that it should be showing, i
> also restarted all the services on the OMSA just to see if something
> was up but nothing, it still claims its not working:
>
> http://pics.entrox.me/983ygh426g.png

This is interesting. The SNMP service wasn't started. You should see
something like this:

  Starting dsm_sa_snmpd: [ OK ]

The dsm_sa_snmpd service is started by /etc/init.d/dataeng. This script
is also responsible for starting other components such as
dsm_sa_datamgrd, and that seems to work fine. You should also see
dsm_sa_snmpd in the process list if it's running:

  # ps axww | grep dsm_sa_snmpd
   4967 ?        Ssl    0:00 /opt/dell/srvadmin/sbin/dsm_sa_snmpd

From what I can gather from the dataeng init script, it won't start
dsm_sa_snmpd if this file exists:

  /opt/dell/srvadmin/var/lib/srvadmin-deng/dcsnmp.off

If it exists on your system, try removing it and restart OMSA. Also
verify that your /etc/snmp/snmpd.conf contains the following at the
very end:

  # Allow Systems Management Data Engine SNMP to connect to snmpd using SMUX
  smuxpeer .1.3.6.1.4.1.674.10892.1

This should have been added by OMSA at install time.

> i also read on the man page of the script
> (http://folk.uio.no/trondham/software/check_openmanage.html) that i
> could use the --omreport option but no dice with that, even trying the
> bin and sbin omreport binary file i got the exact same message:

This option allows you to specify the location of the omreport
command. It has no effect when using SNMP, and is only really usable on
Windows systems, where OMSA can be installed on drives other than C:.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
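The two file checks suggested in the reply (the dcsnmp.off flag file and the smuxpeer line in snmpd.conf) can be bundled into one small script. This is a sketch in Python, with the paths taken from the message above; the function name omsa_snmp_problems is hypothetical, and the check only looks for the presence of the smuxpeer line, not its position in the file.

```python
from pathlib import Path

SMUX_LINE = "smuxpeer .1.3.6.1.4.1.674.10892.1"


def omsa_snmp_problems(snmpd_conf, off_flag):
    """Return a list of likely reasons why the OMSA OIDs are missing."""
    problems = []
    if Path(off_flag).exists():
        problems.append(f"{off_flag} exists, so dsm_sa_snmpd will not be started")
    conf = Path(snmpd_conf)
    text = conf.read_text() if conf.exists() else ""
    # Compare token lists so comments and extra whitespace are ignored.
    active = [line.split("#", 1)[0].split() for line in text.splitlines()]
    if SMUX_LINE.split() not in active:
        problems.append(f"{snmpd_conf} lacks the line: {SMUX_LINE}")
    return problems


if __name__ == "__main__":
    for problem in omsa_snmp_problems(
        "/etc/snmp/snmpd.conf",
        "/opt/dell/srvadmin/var/lib/srvadmin-deng/dcsnmp.off",
    ):
        print(problem)
```

An empty result does not prove that dsm_sa_snmpd is running; the 'ps' check from the reply is still needed for that.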
Re: [Nagios-users] check_openmanage: OOPS! Something is wrong...
Lois Garcia writes:

> This is the output from "omreport chassis pwrsupplies -fmt ssv":
>
> C:\Users\Administrator>omreport chassis pwrsupplies -fmt ssv
> Power Supplies Information
>
> Power Supply Redundancy
> Redundancy Status;Lost
>
> Individual Power Supply Elements
>
> Index;Status;Location;Type;Rated Input Wattage;Maximum Output Wattage;Online Status;Power Monitoring Capable
> 0;Ok;PS 1 Status;AC;[No Value];[No Value];Presence Detected;Yes
> 1;Ok;PS 2 Status;AC;1080 W;870 W;Presence Detected;Yes

Thanks. This shows that the plugin's behaviour was correct in my
opinion. OMSA states that both PSUs are OK, which is what the plugin
reports.

There is a bug somewhere, but it is probably in OMSA. My guess is that
there is a rare and unknown error condition in PSU1, which OMSA doesn't
handle correctly.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
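For readers who want to inspect this kind of output themselves, the semicolon-separated table in the "-fmt ssv" output can be read with a few lines of Python. This is only a sketch of the general idea, not the plugin's Perl parsing code; the function name parse_ssv_table and the assumption that the table header starts with "Index;" are mine, and the sample is the output quoted above.

```python
def parse_ssv_table(output, header_prefix="Index;"):
    """Return the rows of the first ssv table as a list of dicts."""
    rows, header = [], None
    for line in output.splitlines():
        line = line.strip()
        if header is None:
            if line.startswith(header_prefix):
                header = line.split(";")
        elif line:
            rows.append(dict(zip(header, line.split(";"))))
        else:
            break  # a blank line ends the table
    return rows


SAMPLE = """\
Power Supplies Information

Power Supply Redundancy
Redundancy Status;Lost

Individual Power Supply Elements

Index;Status;Location;Type;Rated Input Wattage;Maximum Output Wattage;Online Status;Power Monitoring Capable
0;Ok;PS 1 Status;AC;[No Value];[No Value];Presence Detected;Yes
1;Ok;PS 2 Status;AC;1080 W;870 W;Presence Detected;Yes
"""

if __name__ == "__main__":
    for psu in parse_ssv_table(SAMPLE):
        print(psu["Index"], psu["Status"], psu["Rated Input Wattage"])
```

Both rows carry Status "Ok", which matches the point made in the reply: the per-PSU status gives no hint of a fault, even though PS 1 reports "[No Value]" for its wattage fields and redundancy is "Lost".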
Re: [Nagios-users] check_openmanage: OOPS! Something is wrong...
Lois Garcia writes:

> Thank you, Trond! It looks like a power supply problem. I will take the
> issue to Dell:
>
> C:\Users\Administrator>omreport system
> Health
>
> SEVERITY : COMPONENT
> Critical : Main System Chassis
>
>
> C:\Users\Administrator>omreport chassis
> Health
>
> Main System Chassis
>
> SEVERITY : COMPONENT
> Ok       : Fans
> Ok       : Intrusion
> Ok       : Memory
> Critical : Power Supplies
> Ok       : Power Management
> Ok       : Processors
> Ok       : Temperatures
> Ok       : Voltages
> Ok       : Hardware Log
> Ok       : Batteries

Hmm... there is obviously something amiss with the power supplies, but
the plugin didn't catch it. I'd like to know why. Can you provide the
output from:

  omreport chassis pwrsupplies -fmt ssv

This is the command that the plugin runs to get the status of the power
supplies.

> Thank you also for putting such a great plugin into the
> community. Without it, monitoring the few Windows machines in our all
> Linux environment would have been a chore I don't care to contemplate.

Thank you, glad you like it :)

> I don't see a donation link on your website at
> http://folk.uio.no/trondham/software/check_openmanage.html - ?

No, there is no donation link, the thought never crossed my mind. I have
benefitted enormously (personally and professionally) from free and open
source software for many years. This is just my way of giving back.
Besides, I've found that creating and maintaining open source software
is by itself rewarding, in many different ways.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: OOPS! Something is wrong...
lois garcia writes:

> I have check_openmanage running successfully on 13 out of 16 Dell R710s.
> I am really puzzled at what is going wrong, as it seems different on each
> machine. I have tried different versions of check_openmanage and
> reinstalling the same version of Dell OMSA.
>
> The first eight servers were built from the same Ghost image, and last
> month, one of those servers started showing the check_openmanage error:
>
> UNKNOWN 09-13-2011 17:04:23 7d 1h 7m 54s 4/4
> UNKNOWN: Storage Error! No controllers found
> UNKNOWN: Problem running 'omreport chassis memory': Error: Memory object not found
> UNKNOWN: Problem running 'omreport chassis fans': Error! No fan probes found on this system.
> UNKNOWN: Problem running 'omreport chassis temps': Error! No temperature probes found on this system.
> UNKNOWN: Problem running 'omreport chassis volts': Error! No voltage probes found on this system.
>
> I reinstalled the Dell software, fixing the UNKNOWN error, and now have
> this error:
>
> OOPS! Something is wrong with this server, but I don't know what. The
> global system health status is CRITICAL, but every component check is
> OK. This may be a bug in the Nagios plugin, please file a bug report.
>
> The server is a Dell R710, running Windows Server 2008 R2 Enterprise.

Hello Lois,

(I shortened the subject)

When the plugin is used in local mode, as in your case, it checks the
global health status using this command:

  # omreport system
  Health

  SEVERITY : COMPONENT
  Ok       : Main System Chassis

  For further help, type the command followed by -?

If everything is OK you'll get the output above. What do you get when
running this command on the troubled server?

Does the ESM log contain any clues? Try running 'omreport system esmlog'
and see. Try running 'omreport chassis' as well.

There are two possible causes for the oops error. Either Openmanage
isn't behaving properly, or your server has an error that the plugin
doesn't catch.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
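The condition behind the "OOPS!" message, as described in this thread, is that the global health is not OK while every individual component check is. A rough Python sketch of that comparison follows. The function name and the data layout are assumptions; the real plugin is written in Perl and its exact logic and exit state are not shown here.

```python
def oops_message(global_health, component_states):
    """Return an OOPS message when global health and components disagree.

    Hypothetical helper: component_states maps a component name to the
    status string reported for it, e.g. {"PS 1 Status": "Ok"}.
    """
    all_ok = all(state == "Ok" for state in component_states.values())
    if global_health != "Ok" and all_ok:
        return (f"OOPS! Global system health is {global_health}, "
                "but every component check is Ok")
    return None


if __name__ == "__main__":
    # Values as the plugin would see them in the case discussed here.
    print(oops_message("Critical", {"PS 1 Status": "Ok", "PS 2 Status": "Ok"}))
```

When a component check does report a problem, the function returns None, since the ordinary component alert already explains the bad global state.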
Re: [Nagios-users] check_openmanage - Feature request
Russell Kackley writes:

> I recently downloaded and started using the Nagios plugin
> check_openmanage to provide information on our Dell PowerEdge servers to
> Nagios. check_openmanage works very well for us, but there is one thing
> that I would like to see added. Another possibility is that other
> check_openmanage users could point me to a way to accomplish what I want
> to do using the existing code.
>
> We have two PowerEdge 2950 servers, named s1 and s2. Each server has a
> PERC 6/E card installed in them. We also have a PowerVault MD1000
> storage unit. Both servers are connected to the MD1000. s1 is our
> primary server so we boot it first and Dell OMSA reports that the
> physical disks are "Online", which is what we want. s2 is our backup
> server and we boot that second. For this server, Dell OMSA reports that
> the physical disk status is "Non-Critical" and the state is "Foreign".
> This is ok for us, but the problem is that check_openmanage sees the
> "Non-Critical" status and reports a Warning for the physical disks. I
> would like check_openmanage to ignore the "Non-Critical" status when the
> state is "Foreign", preferably via a blacklist option, e.g.,
> pdisk_foreign. I think that this is similar to the blacklist option
> pdisk_cert, in which check_openmanage ignores the "Non-Critical" status
> when a disk is not certified by Dell. Note that I did investigate the
> blacklist item pdisk and the "--check storage=0" option, but my
> understanding of those options is that they suppress all checks of the
> disks, which is not what I want.
>
> Do the users of check_openmanage 1) have any suggestions for how I can
> tell check_openmanage to ignore the "Non-Critical"/"Foreign" state of
> the disks, or 2) think that this would be a useful feature to add to
> check_openmanage?

Hi Russell,

This would be a nice feature to add. Please try the latest development
version (3.7.1-beta2), it includes the new blacklisting keyword
'pdisk_foreign' as you suggest:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] home made php script
"Erik Olsen" writes:

> I've been trying to make my own script now for a few hours but im not
> getting it to work with nagios.
> Im most familiar with php so I used that to make the script.
>
> My setup:
> Ubuntu 11.4 server
> Nagios 3.2.3
>
> The host/command/and service are all in the same .cfg file.
>
> define command{
>     command_name check_ups_temprature2
>     command_line $USER$/check_ups_temp.php
> }
>
> define service{
>     use generic-service
>     host_name ups1
>     service_description Temp ups env sensor
>     check_command eaton_ups_temp
> }
>
> Status Information (Return code of 127 is out of bounds - plugin may be
> missing)

Hi Erik,

There is a typo on the "command_line" line. The $USER$ macro doesn't
exist. There are 32 possible user macros, named $USER1$ through
$USER32$. Try replacing $USER$ with $USER1$, or simply the actual path
leading up to the plugin.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] omreport and check_openmanage
Emilio Bruna writes:

> Thanks a lot for your hints Trond,
> check_openmanage is already at latest version.
>
> We will try with an OMSA update first and then (if the issue persist)
> we will update BIOS too.

If all else fails, you have the option of disabling the power management
check completely, by using '--check amperage=0':

  check_openmanage --check amperage=0

By using this option you're telling the plugin that it shouldn't even
attempt to run 'omreport chassis pwrmonitoring'.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] omreport and check_openmanage
Emilio Bruna writes:

> Omsa version is 6.2.0.1
> so: windows 2008 storage server SP2
> Hardware is Dell NX 300 Storage server (a derivate of R410 or R310 i think)

This combination should be ok. I don't know the NX300, but if it's based
on the R310 or R410 it shouldn't be a problem. There was a bug in
check_openmanage related to power monitoring on the R410, but this was
fixed in version 3.6.5 of the plugin. Are you using the latest version
of check_openmanage, which is 3.6.8? Also, would it be possible for you
to upgrade OMSA to the latest version, 6.5.0?

This really is an OMSA issue. If the power supplies don't support power
monitoring, omreport should just say that, and check_openmanage is
happy. But in your case, OMSA is responding with an error.

One last tip. In some cases I've seen that certain capabilities in OMSA
depend on BIOS and/or firmware versions. You should verify that the
BIOS and firmware are relatively up-to-date.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage error on W2k8r2 Dell R900
Jay Wahl writes:

> Love check_openmanage plugin for Nagios! It has been a great help for
> monitoring our Dell hardware. I recently built 3 Dell 900s with W2K8r2 with
> check_openmanage (v 3.6.8) and Dell OMSA (v 6.5.0).

Hi Jay,

Are you completely sure that you're using version 3.6.8? My reason for
asking is that the errors you get don't make sense (details below).

> I am getting the following errors:
> C:\Program Files\NSClient++\scripts>check_openmanage
> Problem running 'omreport chassis memory': Error Correction;Multibit ECC

This was fixed a while back (version 3.6.3 IIRC). The "Error Correction"
field appeared in OMSA 6.4.0 and check_openmanage triggers on strings
containing "Error". The particular string above obviously does not
indicate an actual error and was put in the whitelist for errors shortly
after OMSA 6.4.0 was released.

> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or string at
> script/check_openmanage line 1650.
> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or string at
> script/check_openmanage line 1650.

These two don't make any sense, since line 1650 only contains a comment.
They are also probably not related to the memory check.

Please verify the version of check_openmanage. The plugin will output
its version number with either of these options:

  check_openmanage -V
  check_openmanage -d

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
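Once the plugin has printed its version, the comparison against the release that carried the fix can be scripted. The sketch below compares two version strings only; it does not parse the output of 'check_openmanage -V', whose exact format is not shown in this thread. The helper name `version_ge` is made up for the example, and it relies on 'sort -V' as found in GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on 'sort -V' (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 3.6.3 is the release recalled above ("IIRC") as containing the
# "Error Correction" whitelist fix.
if version_ge "3.6.8" "3.6.3"; then
    echo "plugin is new enough"
else
    echo "plugin predates the fix"
fi
```

A version sort is used rather than a plain string comparison so that a release such as 3.6.10 is ordered after 3.6.3.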
Re: [Nagios-users] omreport and check_openmanage
Emilio Bruna writes:

> Hello all,
> i'm monitoring several Dell windows servers with nagios and NSClient++
> and OMSA + check_openmanage. On one of these, i'm getting a problem
> monitoring the redundant power supplies.
>
> Running the command below LOCALLY on the machine being monitored i got
> the right data from omreport.exe:
>
> c:\Program Files (x86)\Dell\SysMgt\oma\bin>omreport.exe chassis pwrsupplies
> Power Supplies Information
>
> ---
> Main System Chassis Power Supplies : Ok
> ---
>
> Power Supply Redundancy : Ok
> Attribute : Redundancy Status
> Value : Full
> Individual Power Supply Elements
> Index : 0
> Status : Ok
> Location : PS 1 Status
> Type : AC
> Rated Input Wattage : 680 W
> Maximum Output Wattage : 500 W
> Online Status : Presence Detected
> Power Monitoring Capable : Yes
>
> Index : 1
> Status : Ok
> Location : PS 2 Status
> Type : AC
> Rated Input Wattage : 680 W
> Maximum Output Wattage : 500 W
> Online Status : Presence Detected
> Power Monitoring Capable : Yes
>
> running the below command (the ones needed to check_openmanage):
>
> c:\Program Files
> (x86)\Dell\SysMgt\oma\bin>c:\Users\administrator.CMVC\Desktop\
> check_openmanage.exe --omreport "c:\Program Files (x86)\Dell\SysMg
> mreport.exe"
> Problem running 'omreport chassis pwrmonitoring': Error: Current probes not
> found
>
> i've noticed that the switches coming from check_openmanage are
> slightly different from the ones passed from omreport.exe ("omreport
> chassis pwrmonitoring" instead of "omreport chassis pwrsupplies")
>
> so it seems that check_openmanage has the wrong switches regard to the
> powermonitoring check status; or maybe the omsa version i'm using is
> not at the correct version to work in the right way with
> check_openmanage.

Hi Emilio,

Don't confuse the two arguments 'pwrsupplies' and 'pwrmonitoring'. They
do different things, and check_openmanage uses both of them. It runs
'omreport chassis pwrsupplies' to get the status of the power supplies,
and it runs 'omreport chassis pwrmonitoring' to get the status and value
of the amperage probes. The latter includes the overall power
consumption of the server.

In your case, it's the 'pwrmonitoring' command that fails. This is a
known problem with some older versions of OMSA. Which version of OMSA
are you running, and on what kind of PowerEdge server?

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
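To tell the two omreport commands apart when debugging, it can help to check what 'pwrsupplies' says about monitoring capability before looking at 'pwrmonitoring'. The sketch below counts the supplies that report "Power Monitoring Capable : Yes". The helper name `count_monitoring_capable` is invented here, and the field layout is assumed to match the output quoted in this message; other OMSA versions may format it differently.

```shell
#!/bin/sh
# Hypothetical helper: count power supplies that advertise power
# monitoring in 'omreport chassis pwrsupplies' output read from stdin.
count_monitoring_capable() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c 'Power Monitoring Capable *: *Yes' || true
}

# Trimmed sample taken from the output quoted in this message.
sample='Index : 0
Power Monitoring Capable : Yes
Index : 1
Power Monitoring Capable : Yes'
printf '%s\n' "$sample" | count_monitoring_capable   # prints 2
```

On a live system the input would come from 'omreport chassis pwrsupplies' instead of the sample. A non-zero count combined with a failing 'pwrmonitoring' command is consistent with the OMSA-side problem described above.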
Re: [Nagios-users] Check_Openmanage configuration question
Daniel Ceola writes:

> Hello all!

Hi Daniel,

> I have a question regarding the initial configuration of
> check_openmanage. I downloaded the version of the script dated Feb 9
> (I don't see a version number in the script) and am attempting to use
> the script through SNMP.

Tip: Run the plugin with the '-V' or '--version' switch to view the
version number.

> I'm attempting to begin using check_openmanage with our Dell servers.
> I have installed the Dell OMSA software on one server and it seems to
> be working just fine. I configured my command definition in a simple
> fashion, according to the installation guide:
>
> # Dell Check openmanage
>
> define command{
> command_name check_openmanage
> command_line $USER1$/check_openmanage -H $HOSTADDRESS$
> }
>
> I also configured my service definition in a simple fashion, according
> to the installation guide:
>
> define service{
> use generic-service
> host_name Server_Name
> service_description Dell OMSA
> check_command check_openmanage
> }

This looks correct to me.

> However - my Nagios console is reporting the status as (null). Also,
> when I attempt to run the script from the command line (note the file
> is saved as check_openmanage with no file extension, I also tried
> check_openmanage.pl and receive the same results), I receive a few
> errors
>
> nagios@UbuntuTest:/usr/local/nagios/libexec$ ./check_openmanage 192.168.1.5
> ./check_openmanage: line 27: require: command not found
> ./check_openmanage: line 28: use: command not found
> ./check_openmanage: line 29: use: command not found
> ./check_openmanage: line 30: syntax error near unexpected token `('
> ./check_openmanage: line 30: `use POSIX qw(isatty ceil);'

Weird. Your system seems to be running the plugin through a shell. The
output above is exactly what you'll get if you run

  sh ./check_openmanage

To specify perl as interpreter, run:

  perl ./check_openmanage

However, this should not be needed. The system should identify it as a
perl script and use perl to execute it by default. Have you edited the
plugin in some way? Check that the md5sum is correct:

  $ md5sum check_openmanage
  5281718fe9e5c4b9570fe76f0fb424ec  check_openmanage

The above sum is correct for version 3.6.6. You should verify that you
get the same (if running 3.6.6). The latest version and its md5sum are
available here:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

PS. In your example above you have forgotten the '-H' switch.

PPS. The file extension (or the name itself) is unimportant.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
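The two checks suggested above (is the interpreter line intact, and does the checksum match) can be wrapped in a small script. This is only a sketch: the helper names `first_line` and `md5_matches` are invented for the example, and the only checksum used is the one quoted above for version 3.6.6.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking a downloaded plugin file.

# Print the first line of a script, where the '#!' interpreter line lives.
first_line() {
    head -n 1 "$1"
}

# Succeed if the file's MD5 checksum equals the expected value.
md5_matches() {
    # $1: file, $2: expected checksum
    actual=$(md5sum "$1" | awk '{print $1}')
    [ "$actual" = "$2" ]
}

# Example (checksum quoted above for version 3.6.6):
#   first_line ./check_openmanage    # should start with '#!' and name perl
#   md5_matches ./check_openmanage 5281718fe9e5c4b9570fe76f0fb424ec \
#       && echo "checksum ok" || echo "checksum mismatch"
```

If the first line no longer starts with '#!', or names something other than perl, the system may fall back to running the file through a shell, which would produce errors like the ones quoted above.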
[Nagios-users] check_openmanage PNP template (Was: check_openmanage errors)
"Randal, Phil" writes:

> Is the beta of check_openmanage.php available for testing?

Sure, I put it here:

  http://folk.uio.no/trondham/software/beta/

Highlights of the template are:

- works with the plugin's new perfdata API
- removed unnecessary dependence on PHP >= 5.2 (good for rhel/centos 5 users)
- calculate power usage for the selected time period, in Watt hours and BTU

> I'm currently using a slightly modified version of the one in the latest PNP
> release.
>
> Two cosmetic issues came to mind:
>
> 1: Temperature is measured in Celsius, not Celcius

Yep, I know. That typo was the first thing I fixed :)

> 2: Formatting when reporting multiple sensors in one graph is irksome
> - the values don't align in a nice column (e.g. temperatures). I
> 'solve' this by a judicious use of substr() and str_pad() to normalise
> the length of reported sensor names.

Hm... this could be tricky to do in a consistent and general manner (at
least the substr() part). The sensor names are as reported by OMSA.
Perhaps this could be accomplished with some RRD magic instead? Tips and
hints are welcome, since I'm neither a PHP expert nor an RRD ninja :)

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage errors
Steve Glasser writes:

>> That combination should work just fine. Please try either of the beta
>> versions, as I suggested in my previous email. The issue you're having
>> may very well be fixed in the betas.
>
> Tried check_openmanage-3.7.0-beta2.0-beta2, problem solved.

Excellent, thanks for testing and reporting back. I've just released
version 3.6.6, which contains the same bugfixes as the 3.7 beta, but not
the (unfinished) new features :)

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage errors
Steve Glasser writes:

> D'oh. We are using check_openmanage with NRPE. The host o/s is CentOS
> release 5.5. Perl is perl-5.8.8 (from rpm).

That combination should work just fine. Please try either of the beta
versions, as I suggested in my previous email. The issue you're having
may very well be fixed in the betas.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check-openmanage errors after upgrade of openmanage
Ashcor Technologies writes:

> I ran check_openmanage.exe --only storage locally and it worked fine.
>
> I then changed the NSC.ini to have:
>
> command[check_openmanage]=check_openmanage.exe --only storage
>
> and restarted the NSCLient++ (x64) service in test mode.
>
> the results:
>
> d NSClient++.cpp(1106) Injecting: check_openmanage:
> d NSClient++.cpp(1142) Injected Result: WARNING 'Problem running
> 'omreport chassis fans': Error! No fan probes found on this
> system.Problem running 'omreport chassis temps': Error! No
> temperature probes found on this system.Problem running 'omreport
> chassis volts': Error! No voltage probes found on this system.'

Ok, this actually clarifies things. Clearly, NSClient++ ignores
everything after 'check_openmanage.exe' in your NSC.ini. There is no way
that check_openmanage would complain about fans etc. when the option
'--only storage' is specified. Since it works from command line we can
safely assume that NSClient++ is the problem. This explains your issues
with the timeout option as well.

> I've looked on your site for the dev versions and am happy to try them
> but don't see a zip with the .exe. Is there an .exe available for the
> dev? also, which dev version would you prefer I try, 3.6 or 3.7?

I could make a PE32 executable for the dev versions, but in your case it
won't help, so there is really no point. Your problem is that NSClient++
ignores the plugin options. Since I don't use NSClient++ I can't offer
any insight into how it should be configured, but my first attempt at a
fix would be to put the entire command in quotes in NSC.ini.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check-openmanage errors after upgrade of openmanage
Trond Hasle Amundsen writes:

> Are you using check_openmanage with NRPE or similar in local mode, or
> checking via SNMP?

I have an idea of what the problem might be. Can you try either of the
development versions of check_openmanage available here:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies writes:

> ok, talked to dell, there is no hardware on the T105 that will allow
> monitoring of the fan, voltage, etc.. basically the only thing you can
> monitor is the raid array which is fine as that's all I really want to
> check with nagios.

Ok. I don't know the 100 series, but from what I understand they are
entry-level servers with limited capabilities and a low price tag.

The plugin will barf at servers that don't have the basic monitoring
probes, unless they are absent for obvious reasons (e.g. blades don't
have fans). I still think this behaviour is a good idea, as I've seen
plenty of instances where OMSA malfunctions in such a way that it will
say a probe doesn't exist when it actually does. I'm reluctant to change
that policy, so users of the 100 series will have to exclude certain
checks in the plugin. It is not ideal, but I believe the problem to be
limited since most would go for servers with better monitoring
capabilities (i.e. 200 series and beyond).

> Still have that pesky timeout after 30 seconds error though. tried
> with --timeout 60 and with -t 60 and nothing seems to change the
> behavior.

Still weird. Did you try running the plugin manually with the timeout
option? Try 'check_openmanage.exe -t 60 [other options]'

Perhaps OMSA on the T105 hangs on some probe that doesn't exist. If
you're only interested in monitoring storage, you could try:

  check_openmanage.exe --only storage

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check-openmanage errors after upgrade of openmanage
Steve Glasser writes:

> Since upgrading dell openmanage from v 6.3 to 6.5 we have errors using
> the check-openmanage plugin. The errors are:
>
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in length at
> /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or
> string at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in length at
> /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or
> string at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in length at
> /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in concatenation (.) or
> string at /usr/lib64/nagios/plugins/check_openmanage line 4599.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4601.
> INTERNAL ERROR: Use of uninitialized value in hash element at
> /usr/lib64/nagios/plugins/check_openmanage line 4601.
>
> The plugin reports "status unknown".
>
> Openmanage is version check-openmanage-3.6.5-1.el5 installed from rpm.
> The host is an dell 2950. Please let me know if I can provide any
> additional information.

Hi Steve,

Are you using check_openmanage with NRPE or similar in local mode, or
checking via SNMP?

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies writes:

> the server is a PowerEdge T105. It IS running slow but I'll be damned
> if I can figure out why, I'm beggining to suspect bad ram as the
> performance meter reports minimal load.

One thing to check is the power management setting in the BIOS. We set
up a few blade servers recently that had set this to "active power
controller", and this caused the server to be extremely sluggish.
Setting this to "OS Control" or "Maximum Performance" solved the issue.
Try:

  # omreport chassis pwrmanagement config=profile
  Power Profiles
  Maximum Performance     : Not Selected
  Active Power Controller : Not Selected
  OS Control              : Selected
  Custom                  : Not Selected

You can set the profile to max performance with:

  omconfig chassis pwrmanagement config=profile profile=maxperformance

Just a tip, but worth checking.

> here is the command line in the NSC.ini
>
> [modules]
> command[check_openmanage]=check_openmanage.exe -t 60 --check
> fans=0,volt=0
>
> on the nagios server:
>
> /usr/lib/nagios/plugins/check_nrpe -H $hostname$ -p 5666 -c
> check_openmanage -t 60
>
> I'm pretty sure it's not the Check_nrpe command line as this works fine
> on several other servers. it's def something on the client server
> itself so this points to the NSClient++ setup.

Can't see anything wrong with these definitions..

> note I have been testing by running NSClient++.exe /test so i can watch
> the client server and it is getting the injection command and reporting
> the timeout locally.

Good. But it's still weird that you get a timeout after 30 seconds even
when you specify a 60 sec timeout. Try running check_openmanage.exe
manually on the server with the same options and see if it then behaves
in the same way. If so there is some sort of bug in the plugin that only
affects the .exe version.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
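When several servers need the same BIOS power profile check, the omreport output shown above can be reduced to the name of the active profile. The following sketch assumes the "name : value" layout from the example in this message; `selected_profile` is a hypothetical helper name, not an OMSA or check_openmanage command.

```shell
#!/bin/sh
# Hypothetical helper: print the power profile marked "Selected" in
# 'omreport chassis pwrmanagement config=profile' output (stdin).
selected_profile() {
    # Split each line on the colon and any spaces around it, then keep
    # the name whose value is exactly "Selected".
    awk -F ' *: *' '$2 == "Selected" { print $1 }'
}

# Sample taken from the output quoted in this message.
sample='Power Profiles
Maximum Performance     : Not Selected
Active Power Controller : Not Selected
OS Control              : Selected
Custom                  : Not Selected'
printf '%s\n' "$sample" | selected_profile   # prints: OS Control
```

On a live system the input would come from 'omreport chassis pwrmanagement config=profile'. Output with trailing whitespace or carriage returns (for example when captured on Windows) would need trimming first.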
Re: [Nagios-users] check_openmanage
Ashcor Technologies writes:

> Ok, now new and exciting changes... no matter what I do I get: WARNING
> PLUGIN TIMEOUT: check_openmanage timed out after 30 seconds.
>
> I have -t 60 set on the check_openmanage command and also on the NRPE
> check command line and in the NSC.ini. nothing seems to change the
> timout beyond 30 seconds.

I forgot to mention that since you get that particular error it's the
plugin that times out, not NRPE or NSClient++. The fact that you're
unable to change that behaviour with the '-t' or '--timeout' option is
strange, but it would usually indicate a configuration error on your
part. You'll have to post the command definition etc. for me (and others
on this list) to be able to spot the error.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies writes:

> Ok, now new and exciting changes... no matter what I do I get: WARNING
> PLUGIN TIMEOUT: check_openmanage timed out after 30 seconds.
>
> I have -t 60 set on the check_openmanage command and also on the NRPE
> check command line and in the NSC.ini. nothing seems to change the
> timout beyond 30 seconds.
>
> (yes, I've restarted the nsclient++.exe on the remote server).

Hmm.. Unless the server is under very heavy load you're still having
OMSA problems. I'm guessing that some probe doesn't respond properly and
just hangs. If the problem is load related, you should consider checking
via SNMP instead.

What model PowerEdge is this btw?

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies writes:

> now my problem is this...
>
> Problem running 'omreport chassis fans': Error! No fan probes found on
> this system.<br/>Problem running 'omreport chassis temps': Error! No
> temperature probes found on this system.<br/>Problem running 'omreport
> chassis volts': Error! No voltage probes found on this system.
>
> on the NSC.ini i have the following line added and I restarted the
> NSClient++ service
>
> command[check_openmanage]=check_openmanage.exe -b fan=all
>
> even tried
>
> command[check_openmanage]=check_openmanage.exe -b fan=0
>
> however it still tries to check the fan. I suppose i have a syntax
> error?

No, that is the correct syntax. Blacklisting won't prevent the component
class from being checked in the first place, it will only suppress any
info about blacklisted components in the output and plugin return value.
To skip fans altogether use the '--check' option like this:
'--check fans=0'.

However, unless this is a blade system and the plugin is unable to
identify it as such for some reason, your server HAS fan probes and
you're having an OMSA problem. The fact that you get errors for other
probes such as temperature and voltage confirms this. You need to
recheck that OMSA works, that all relevant OMSA components are installed
and running etc. It may be as simple as restarting OMSA, but it could
also be more complex (e.g. BIOS/firmware upgrade needed). These errors
are pretty generic, but the problem is that OMSA isn't working properly
on that server.

PS. See this URL about configuring Nagios to not escape HTML code in the
plugin output (to avoid the literal '<br/>'):

  http://folk.uio.no/trondham/software/check_openmanage.html#multiline-output

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies writes:

> Thanks for the reply. I just realized from your question that I'm using
> a pre-compiled .exe version of your check_openmanage from here:
>
> https://www.monitoringexchange.org/inventory/Check-Plugins/Hardware/check_openmanage-exe
>
> which was probably created from an older version...

Yeah I think it's pretty old. A PE32 executable for Windows is available
in the zip and tar.gz archives, and as a single file download:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

Upgrading to the latest version will probably solve your problem. Let me
know if it doesn't.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage
Ashcor Technologies writes:

> on two of my dell servers check_openmanage (via nsclient++ and nrpe)
> return the same error:
>
> "Use of uninitialized value in concatenation (.) or string at
> script/check_openmanage.pl line 1386."
>
> both dell systems are running the latest OpenManage version 6.5.0.

Hi Jeff,

Which version of check_openmanage is this?

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?
"C. Bensend" writes:

> Is there anything in OMSA that tells how *long* a battery has
> been charging? I simply got so tired of the charging warnings
> that I blacklisted the bat_charge totally, but I'd still like to
> detect that type of error - where the battery never finishes
> charging.
>
> If OMSA has it, it would be great to have the option within
> check_openmanage to specify a length of time threshold for battery
> charging. :)

Hi Benny,

Unfortunately OMSA has no info on when the charge cycle is expected to be finished, or how long it has been in its current learn/charge state:

    # omreport storage battery controller=1
    Battery 0 on Controller PERC 6/E Adapter (Slot 1)

    Controller PERC 6/E Adapter (Slot 1)
    ID                        : 0
    Status                    : Non-Critical
    Name                      : Battery 0
    State                     : Charging
    Recharge Count            : Not Applicable
    Max Recharge Count        : Not Applicable
    Predicted Capacity Status : Ready
    Learn State               : Requested
    Next Learn Time           : 0 hours
    Maximum Learn Delay       : 7 days 0 hours
    Learn Mode                : Auto

I could make the plugin record it, but then I would violate my principle that the plugin should be stateless... Introducing state in the plugin complicates things.

There is another reason that you would want to know that the battery is charging, and I suspect that this is also why Dell has OMSA report it as a non-critical (warning) status. During (some of) the charge cycle, write-back for vdisks (i.e. use of the cache) is disabled. This means that the RAID performance is degraded, and depending on the nature of your disk usage you'll want to know about this when it happens. OMSA also lets you delay the charge cycle for up to seven days.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
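Because the plugin stays stateless, the time a battery has spent charging would have to be tracked outside it. The following is only a rough sketch of that idea in Python, assuming the `omreport storage battery` field layout quoted in the message above. The function names, and the idea that the caller stores `first_seen` between runs (for example in a small file), are illustrative assumptions and not part of check_openmanage or OMSA.

```python
import re
import time


def battery_state(omreport_output):
    """Return the value of the 'State' field, or None if it is absent.

    The pattern is anchored at the start of a line so that the separate
    'Learn State' field is not picked up by mistake.
    """
    match = re.search(r"^[ \t]*State[ \t]*:[ \t]*(.*?)[ \t]*$",
                      omreport_output, re.MULTILINE)
    return match.group(1) if match else None


def charging_seconds(state, first_seen, now=None):
    """Return (first_seen, elapsed_seconds) for the current charge cycle.

    `first_seen` is the timestamp remembered from the previous run, or None.
    When the battery is not charging, the remembered timestamp is cleared.
    """
    now = time.time() if now is None else now
    if state != "Charging":
        return None, 0.0
    if first_seen is None:
        first_seen = now
    return first_seen, now - first_seen
```

A wrapper could then raise its own alert only when the elapsed time exceeds a locally chosen threshold, which keeps the state handling out of the plugin itself.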
Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?
Helmut Wollmersdorfer writes:

> Another question:
>
> I always get on all of the R510s (few days old):
>
> root@xen11:~# /usr/lib/nagios/plugins/check_openmanage
> Cache Battery 0 in controller 0 is Charging (Ready) [probably harmless]
> root@xen11:~# uptime
> 12:08:35 up 2 days, 1:22, 1 user, load average: 0.00, 0.00, 0.00
>
> I wonder a little bit that the batteries are not full after some days powered,
> or if the information is wrong.

The plugin simply reports what OMSA says, so if the info is wrong, the error would have to be at the hardware or OMSA level. However, I don't think that this is the case. Batteries take a long time to charge on new servers, i.e. if the battery is brand new and hasn't been charged before. At one time we had a battery that didn't finish charging for a week; we called Dell and got a replacement battery. This was during a regular charge cycle. In your case I would give it a few more days.

> Also I tried to '--blacklist bat_charge=0,0' (and other combinations), but
> blacklisting does not work.

Look in the debug output for the battery ID, which consists of the controller number and battery number with a colon as delimiter. In your case it would be

    --blacklist bat_charge=0:0

or simply use 'all':

    --blacklist bat_charge=all

But since we did in fact experience a case where the battery never finished charging, I would advise against this. We just ignore the battery charge warnings unless they persist for days. It can be annoying, but we decided that we can live with it :)

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Why is check_openmanage so slow on PowerEdge R510?
Helmut Wollmersdorfer writes:

> new to this architecture I installed the monitoring plugin
> check-openmanage and was surprised about the performance:
>
> root@xen10:~# time perl /usr/lib/nagios/plugins/check_openmanage -d | head -n 3
> sh: /bin/rpm: not found
> System: PowerEdge R510 II OMSA version: 6.5.0
> ServiceTag: 1Z7215J Plugin version: 3.6.5
> BIOS/date: 1.6.3 02/01/2011 Checking mode: local
>
> real 0m3.426s
> user 0m2.456s
> sys 0m0.544s
>
> OS: Debian
> root@xen10:~# uname -a
> Linux xen10 2.6.32-5-xen-amd64 #1 SMP Tue Mar 8 00:01:30 UTC 2011
> x86_64 GNU/Linux
>
> Most calls of check_openmanage (from the shell) take 3 - 4 seconds,
> some with '--only' are faster, but not as fast as omreport:
>
> root@xen10:~# time perl /usr/lib/nagios/plugins/check_openmanage --only fans
> FANS OK - 5 fan probes checked
>
> real 0m0.716s
>
>
> root@xen10:~# time /opt/dell/srvadmin/bin/omreport chassis fans
> Fan Probes Information
>
> Fan Redundancy
> Redundancy Status : Full
> [...]
>
> real 0m0.037s
>
> In comparison called with the option --help (does nearly nothing) the
> execution time is as expected for loading the perl interpreter and
> compiling the source:
>
> root@xen10:~# time perl /usr/lib/nagios/plugins/check_openmanage -h
> [...]
> real 0m0.064s
>
> What can be the reason?

Hi Helmut,

The simple answer is that omreport commands take time. They account for the vast majority of the plugin execution time.

The reason that 'check_openmanage --only fans' takes significantly more time than the corresponding omreport command is that the plugin first runs 'omreport -?' to determine if this is a blade or not. If you add the time it takes to run 'omreport -?', the omreport fans command and the perl interpreter startup, you should arrive at about the time it takes 'check_openmanage --only fans' to finish.

Note that storage takes time to check, since the omreport commands for storage are slow. This is especially true if you have a lot of storage (e.g. an R510).

Also note that if you use the '-d' option, check_openmanage will run 'omreport about' to determine the OMSA version. This is a slow command and adds to the overall execution time.

The plugin is much faster if used in SNMP mode, especially if you have lots of storage. Example from a 2950 with a couple of MD1000 shelves of extra storage:

    $ time ./check_openmanage -H foo
    OK - System: 'PowerEdge 2950 III', SN: 'XXX', 16 GB ram (8 dimms), 3 logical drives, 32 physical drives

    real 0m1.725s
    user 0m0.397s
    sys  0m0.013s

    foo /# time /usr/lib64/nagios/plugins/check_openmanage
    OK - System: 'PowerEdge 2950 III', SN: 'XXX', 16 GB ram (8 dimms), 3 logical drives, 32 physical drives

    real 0m4.188s
    user 0m2.997s
    sys  0m0.821s

As you can see the footprint is significantly smaller with SNMP, so if this is a concern then SNMP should be your weapon of choice :)

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
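The breakdown described above (blade detection plus the fan query plus interpreter startup) can be checked by timing the individual omreport calls and adding them up. This is a minimal sketch in Python, not something the plugin does itself. The helper names and the `runner` parameter are assumptions introduced here so the timing logic can be exercised without OMSA installed; the two omreport commands are the ones named in the message.

```python
import subprocess
import time


def time_command(argv, runner=subprocess.run):
    """Run one command, discard its output, and return wall-clock seconds."""
    start = time.perf_counter()
    runner(argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start


def time_steps(steps, runner=subprocess.run):
    """Time each named command; return ({name: seconds}, total_seconds)."""
    timings = {name: time_command(argv, runner) for name, argv in steps.items()}
    return timings, sum(timings.values())


# Commands named in the message above; they only exist on a host with OMSA.
OMSA_STEPS = {
    "blade detection": ["omreport", "-?"],
    "fan probes": ["omreport", "chassis", "fans"],
}
```

On a monitored server, `time_steps(OMSA_STEPS)` should give a total that, together with the perl startup time, lands near the runtime of 'check_openmanage --only fans'. On a machine without omreport, `subprocess.run` raises `FileNotFoundError`.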
Re: [Nagios-users] check_openmanage internal error
Adam Caines writes:

> Looks like it's reporting "path health". The 6e has both sas ports
> connected to redundant controllers in the MD1120. It's strange on
> another server, I also have a PERC H700 connect to a MD1220 with
> redundant links and it does not output the "path health" section.

[snip]

> ID : 0
> Status : Ok
> Name : Logical Connector
> State : Ready
> Connector Type : SAS Port RAID Mode
> Termination : Not Applicable
> SCSI Rate : Not Applicable
>
> Path Health
> Status : Ok
> Name : Connector 0
> State : Available
>
> Status : Ok
> Name : Connector 1
> State : Available

Yes, so this is the culprit... check_openmanage did not expect this output. According to the OMSA documentation[1], it looks like the controller is connected to the enclosure in redundant path mode. I really need to see how this looks in SSV format. Can you provide the output from this command:

    omreport storage connector controller=1 -fmt ssv

In case of redundant path mode, the plugin should check the path health and report on it, in addition to the connector health. This functionality must be added to the plugin. Is it possible for you to check how check_openmanage handles this when checking via SNMP as well?

[1] http://support.euro.dell.com/support/edocs/software/svradmin/6.4/en/CLI/HTML/reportst.htm#wp1077100

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
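To illustrate why the extra "Path Health" section trips up a parser that expects only connector records, here is a small Python sketch that separates the two kinds of record. It works on the plain-text layout quoted above, not on the SSV format the plugin actually parses, and the function name and return shape are assumptions made for this example.

```python
SAMPLE = """\
ID : 0
Status : Ok
Name : Logical Connector
State : Ready
Connector Type : SAS Port RAID Mode
Termination : Not Applicable
SCSI Rate : Not Applicable

Path Health
Status : Ok
Name : Connector 0
State : Available

Status : Ok
Name : Connector 1
State : Available
"""


def parse_connector_report(text):
    """Split omreport-style 'key : value' text into (connectors, paths).

    Records that follow a bare 'Path Health' heading are collected as paths;
    a record starting with an 'ID' field is treated as a new connector.
    """
    connectors, paths = [], []
    section, fields = connectors, {}
    for raw in text.splitlines() + [""]:  # trailing "" flushes the last record
        line = raw.strip()
        key, sep, value = (part.strip() for part in line.partition(":"))
        starts_record = bool(sep) and key == "ID"
        if (not sep or starts_record) and fields:
            section.append(fields)
            fields = {}
        if starts_record:
            section = connectors
        elif line == "Path Health":
            section = paths
        if sep:
            fields[key] = value
    return connectors, paths
```

With the sample above this yields one connector record and two path records, so the path statuses can be checked separately from the connector status.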
Re: [Nagios-users] check_openmanage internal error
Adam Caines writes:

> Looks like some strange output on the lines for controller 1? The
> formatting is breaking there. I checked omreport storage controller
> and didn't see anything that stood out as being strange.

[snip]

> OK | 0:0 | Connector 0 [SAS Port RAID Mode] on controller 0 is Ready
> OK | 0:1 | Connector 1 [SAS Port RAID Mode] on controller 0 is Ready
> OK | 1:0 | Logical Connector [SAS Port RAID Mode] on controller 1 is Ready
> | 1:Status | State [Name] on controller 1 is Status
> | 1:Ok | Available [Unknown type] on controller 1 is Unknown state
> | 1:Ok | Available [Unknown type] on controller 1 is Unknown state

Ok, something strange is going on here. This seems to be a parsing error in the plugin, related to the connectors. As I don't have any MD1120 enclosures, I'm curious whether these errors are related to the MD1120 being different somehow. Can you send the output from these commands:

    omreport storage connector controller=0
    omreport storage connector controller=1

and also:

    omreport storage connector controller=0 -fmt ssv
    omreport storage connector controller=1 -fmt ssv

The latter is what the plugin is using, as it is easier to parse.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage internal error
Adam Caines writes:

> All of the status lines appear to be Ok.

Indeed they do. This appears to be trickier than I initially thought. Perhaps the debug output from the plugin has some clues? Try 'check_openmanage -d --only storage'. It will attempt to print the status of all monitored storage components.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage internal error
Adam Caines writes:

> Having a strange problem with check_openmanage. Use it without error on many
> other systems. Any help would be appreciated.
>
> check_openmanage version: 3.6.5 (.exe version)
> Dell OMSA version: 6.4.0
> OS: Windows Server 2008 R2
> Hardware: Poweredge 1950 with PERC 6/i and PERC 6/e connected to MD1120
>
> check_openmanage output:
>
> OK - System: 'PowerEdge 1950 III', SN: 'XXX', 8 GB ram (4 dimms), 2 logical
> drives, 28 physical drives
> INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value in numeric lt (<) at script/check_openmanage line 4634.
> INTERNAL ERROR: Use of uninitialized value $level in numeric eq (==) at script/check_openmanage line 4637.
> INTERNAL ERROR: Use of uninitialized value $level in numeric eq (==) at script/check_openmanage line 4637.
> INTERNAL ERROR: Use of uninitialized value $level in numeric eq (==) at
>
> If I run check_openmanage --no-storage the errors are not present:

Hi Adam,

Interesting. It is the status of the device (as reported by omreport) that is garbled somehow. The plugin will set the status to 'Unknown' if the field is missing or empty, so this means that omreport is reporting the status as something new that check_openmanage doesn't recognize.

That you're getting so many of them (and you have established that it's a storage issue) makes me think that it is related to physical disks. We need to see what omreport says about storage, in particular the disk drives. Can you send the output from

    omreport storage pdisk controller=X

where 'X' is the controller number (0,1), for each of the controllers. If the Status field is 'Ok' for all the disks, we need to look further.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
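The failure mode described above, a status string with no known severity, can be shown with a small lookup that falls back to UNKNOWN instead of leaving the level undefined. This is only a sketch in Python. The status names in the table are OMSA values assumed for illustration, and the mapping is not taken from the check_openmanage source; the numeric levels are the standard Nagios plugin return codes.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed mapping for illustration; the plugin's real table may differ.
STATUS_TO_LEVEL = {
    "Ok": OK,
    "Non-Critical": WARNING,
    "Critical": CRITICAL,
    "Unknown": UNKNOWN,
}


def nagios_level(status):
    """Map a reported status to a Nagios level, defaulting to UNKNOWN.

    Missing, empty, and unrecognized values all end up as UNKNOWN, so later
    numeric comparisons never see an undefined level.
    """
    if status is None or not status.strip():
        return UNKNOWN
    return STATUS_TO_LEVEL.get(status.strip(), UNKNOWN)
```

With a fallback like this, a value such as 'Available' landing in the status column would surface as an UNKNOWN result rather than as a string of "uninitialized value" warnings.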
Re: [Nagios-users] Error in performance-data-output
"Lichterfeld, Dirk" writes:

> I compare the response time of the nagios check and I see, that the DELL
> Server R710 needs over 10 seconds to answer. Another server (DELL R310)
> answer in 8 seconds (the check of this server is ok.)
>
> The response time depends on various Dell hardware.

Yes, this is expected when using the win32 binary file. It contains a perl interpreter and is slow to start up and execute. When monitoring Windows machines, SNMP is preferable unless your security policies prohibit this.

> What I do? I expanded the check-command of the check_openmange from
> "check_nrpe -H $HOSTADDRESS$ -c Check_Openmanage" with the parameter "-t
> 30" to extend the time for this check.

30 seconds is the default timeout for check_openmanage. I would set the check_nrpe timeout to slightly more than the check_openmanage timeout. If you do that, you'll get a meaningful error message from check_openmanage instead of a cryptic one from NSClient++ if check_openmanage times out for some reason. Anyway, the '-t 30' parameter to check_nrpe should work...

> Is there another way to set the timeout?

I'm not familiar with NSClient++, perhaps it has its own timeout?

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Error in performance-data-output
"Lichterfeld, Dirk" writes:

> Hi Trond,
>
> I'm sorry, at my company we use Outlook, so the highlighted text is
> distinctly and visibly.
>
> I will try to specify the problem I mean.
>
> If I run NSClient++ in testmode I will get the follow output:
>
> d NSClient++.cpp(1106) Injecting: Check_OpenManage:
> d NSClient++.cpp(1142) Injected Result: OK 'OK - System: 'PowerEdge
> R710 II', SN: 'XXX', 4 GB ra
> m (2 dimms), 1 logical drives, 4 physical drives'
> d NSClient++.cpp(1143) Injected Performance Result:
> 'fan_0_system_board_fan_1_rpm=3600;0;0 fan_1_sys
> tem_board_fan_2_rpm=3600;0;0 fan_2_system_board_fan_3_rpm=3600;0;0
> fan_3_system_board_fan_4_rpm=3600
> ;0;0 fan_4_system_board_fan_5_rpm=3600;0;0
> pwr_mon_0_ps_1_current=0.4;0;0 pwr_mon_1_ps_2_current=0.4
> ;0;0 pwr_mon_2_system_board_system_level=175;917;966
> temp_0_system_board_ambient=20;42;47
> '
>
> You can see, the injected perfomance result beginns and ends with a '.

Yes, but I think that NSClient++ is responsible for that, putting everything inside single quotes. As you can see it does that for the plugin output as well.

> 1. I mean, that every description and only the description must be inside of
> the signs '
> our output: fan_2_system_board_fan_3_rpm
> must be:'fan_2_system_board_fan_3_rpm'
> 2. At the end is no special sign approved.
>
> You can read this in "chapter 2.6 Performance data" at
> http://nagiosplug.sourceforge.net/developer-guidelines.html
>
> I hope I could describe the problem well enough.

Yes, thank you, this was much clearer :) However, the quotes are not needed according to the guidelines for performance data[1]:

    3. the single quotes for the label are optional. Required if
       spaces, = or ' are in the label

The perfdata labels don't contain any of the offending characters. Could it be that this is a Windows issue, or perhaps NSClient++? Any NSClient++ users here who can confirm if this is the case? I'm thinking that perhaps the underscore character '_' is throwing off Windows or NSClient++.

[1] http://nagiosplug.sourceforge.net/developer-guidelines.html#AEN201

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
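The quoting rule cited from the guidelines can be written down in a few lines: a label is quoted only when it contains a space, an equals sign or a single quote. This Python sketch is an illustration of that rule, not code from check_openmanage. The helper names are made up for the example, and doubling an embedded single quote is an assumption about how such labels are usually escaped.

```python
def format_label(label):
    """Quote a perfdata label only if it contains a space, '=' or "'"."""
    if any(ch in label for ch in " ='"):
        # Assumption: an embedded single quote is written as two quotes.
        return "'" + label.replace("'", "''") + "'"
    return label


def format_perfdata(label, value, warn="", crit=""):
    """Build one label=value;warn;crit perfdata item."""
    return f"{format_label(label)}={value};{warn};{crit}"
```

Labels such as `fan_2_system_board_fan_3_rpm` pass through unquoted, which matches the plugin output shown in the thread, while a label like `fan 2 rpm` would come out quoted.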
Re: [Nagios-users] Error in performance-data-output
"Lichterfeld, Dirk" writes:

> I want to use "check_openmanage" to monitor some Dell servers with Nagios. At
> the most of them we use as OS Windows Server 2003.
>
> I have a problem to get the performance results from the machines. With the
> command "check_openmanage -p" I get a failure in Nagios like "CHECK_NRPE:
> Socket Time Out"
>
> I checked the result of the Check an see what is happend. In the return of the
> performance-data fails some signs. I highlight the missing signs:
>
>
> OUTPUT with NSClient++ in TestMode (where the highlighted signs are not in the
> output):
>
> d NSClient++.cpp(1106) Injecting: Check_OpenManage:
> d NSClient++.cpp(1142) Injected Result: OK 'OK - System: 'PowerEdge R710 II',
> SN: 'XXX', 4 GB ra
> m (2 dimms), 1 logical drives, 4 physical drives'
> d NSClient++.cpp(1143) Injected Performance Result:
> 'fan_0_system_board_fan_1_rpm'=3600;0;0 'fan_1_sys
> tem_board_fan_2_rpm'=3600;0;0 'fan_2_system_board_fan_3_rpm'=3600;0;0 '
> fan_3_system_board_fan_4_rpm'=3600
> ;0;0 'fan_4_system_board_fan_5_rpm'=3600;0;0 'pwr_mon_0_ps_1_current'=0.4;0;0
> '
> pwr_mon_1_ps_2_current'=0.4
> ;0;0 'pwr_mon_2_system_board_system_level'=175;917;966 '
> temp_0_system_board_ambient'=20;42;47
> '
>
> OUTPUT with MS-DOS-Window (the highlighted signs are not in the output):
>
> C:\Programme\check_openmanage-3.6.5>check_openmanage.exe -p
> OK - System: 'PowerEdge R710 II', SN: 'XXX', 4 GB ram (2 dimms), 1 logical
> drives, 4 physical drives|'fan_0_system_board_fan_1_rpm'=3600;0;0 '
> fan_1_system_board_fan_2_rpm'=3600;0;0 'fan_2_system_board_fan_3_rpm'=3600;0;0
> 'fan_3_system_board_fan_4_rpm'=3600;0;0
> 'fan_4_system_board_fan_5_rpm'=3600;0;0
> 'pwr_mon_0_ps_1_current'=0.4;0;0 'pwr_mon_1_ps_2_current'=0.4;0;0 '
> pwr_mon_2_system_board_system_level'=175;917;966
> 'temp_0_system_board_ambient'=
> 20;42;47

Hi Dirk,

I'm having a hard time understanding what you mean. Perhaps my mail client is playing tricks on me, but I can't see anything highlighted. Except for weird line breaks the perfdata looks OK to me in both examples. Can you be more specific and pinpoint exactly where the problem is?

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_openmanage-- Current probes not found
Joe Beck writes:

> Yes, just after sending this post I did the things you identified.
> Verifed model vs others where this issue was not happening
> We have several r610's & this is only one with the issue.
> Then I went & looked at the omsa version & found this one was running 5.9
> where the others had 6.4
> I removed & installed 6.4 but same result.
> I also had some question/confusion about best way to identify the version;
> in fact it may have already been running 6.4.
>
> I'm grep'ing for version; tried running cmds with -v & --version, etc but no
> luck in seeing which version via the cmds

This command will tell you which version of OMSA you're running:

    omreport about

There are other ways as well:

    http://folk.uio.no/trondham/software/check_openmanage.html#how-can-i-find-out-which-version-of-omsa-my-server-is-running

I'm not sure if you understood my question about the servers being identical. I didn't mean the model (I assumed the model would be the same), but hardware-wise. Specifically, are they alike with respect to number of power supplies?

In any case, the next step will be to examine the installed OMSA software components. On RHEL and derivatives such as CentOS, you can do this by comparing the output from 'rpm -qa|grep srvadmin' on healthy boxes with the output on the failing one. Also check that the running OMSA services are the same.

Since this is happening on only one server, and you have probably installed OMSA in exactly the same way on all the servers, you may have a real hardware problem. If all else fails, you should contact Dell support and have them look at it.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
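Comparing the 'rpm -qa|grep srvadmin' output from a healthy box with that from the failing one amounts to a set difference over the two package lists. A minimal Python sketch, assuming each list has been saved as plain text with one package per line; the function name is an assumption made for this example.

```python
def compare_packages(healthy, failing):
    """Return (missing_on_failing, extra_on_failing) as sorted lists.

    Both arguments are multi-line strings with one package name per line,
    e.g. the saved output of 'rpm -qa|grep srvadmin' from each host.
    """
    healthy_set = {line.strip() for line in healthy.splitlines() if line.strip()}
    failing_set = {line.strip() for line in failing.splitlines() if line.strip()}
    return sorted(healthy_set - failing_set), sorted(failing_set - healthy_set)
```

Anything in the first list is installed on the healthy server but absent on the failing one, which is the more likely place to look when a probe type is not being reported.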
Re: [Nagios-users] Check_openmanage-- Current probes not found
Joe Beck writes:

> I have a couple R610’s
> Some run omreport chassis pwrmonitoring & return output
> I also have 1 that returns:
> # omreport chassis pwrmonitoring
> Power Consumption Information
>
> Error : Current probes not found
>
> Does this mean that this module just isn’t installed or ???
>
> At this point, do I just alter the nagios service to exclude pwrmonitoring?

Hi Joe,

I think the next step should be to investigate why OMSA behaves like this. I've seen this error before, but on older servers with old OMSA versions (5.4.0). A simple restart of OMSA (srvadmin-services.sh restart) may be the solution and should be attempted first. The next step would be to reinstall OMSA and verify that everything gets installed.

Usually, if power monitoring information is not available, OMSA should say something else and more informative. Is the problematic machine identical to the ones that work?

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage SNMP Error
Shawn Green writes:

> I'm in the process of rolling out check_openmanage to monitor a variety of
> hardware including R510s, M600s, and M610s. I'm running into an interesting
> issue where the alert is reporting back:
>
> SNMP ERROR [cooling]: The requested entries are empty or do not exist.
>
> I understand this is an SNMP error (not check_openmanage), but what's baffling
> me is how to work around it. My Net::SNMP module is up to date (v6.0.1) as are
> net-snmp packages on all hosts.
>
> A good majority of hosts that are getting this error are M600/M610 blades, yet
> other blades in the same chassis do not get this error. I'm also seeing these
> on several R510s, yet other R510s have no problems.
>
> All hosts are Centos 5.5 64 bit with OMSA 6.2.0.

Hi Shawn,

One thing that is really peculiar is that you're getting this error from blade servers. The plugin should identify blades and ignore the fact that they don't have cooling devices (i.e. fans). You should never get this error from blades. Are you really sure that the error from your blades is with cooling and not something else? (If so, we'll need to investigate why the plugin doesn't identify the blade servers correctly.)

Your Net::SNMP version is fine and not to blame. The error lies with OMSA and/or the SNMP service. Try running this on the servers:

    omreport chassis fans

On the blades, you should get an error saying that no fan probes were found, which is normal. But the R510s should display fan info. If they don't, the problem is not SNMP related but lies with OMSA itself.

If you haven't already done so, try restarting OMSA (i.e. run 'srvadmin-services.sh restart') on the servers. Reinstalling OMSA (or better yet: reinstalling with version 6.4.0) is the logical next step. Make sure that there are no errors during installation and that everything gets installed.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage and linebreaks
"Bryan O'Shea" writes:

> check_openmanage and linebreaks not working in $SERVICEOUTPUT$ emails.
>
> When using the either of the following options the linebreaks seem to be
> broken:
> -e or --postmsg
>
> This is what i get in my service notification emails instead of the
> desired output of seperate lines.
>
> "Power Supply 1 [AC] needs attention: Presence detected, Failure
> detected, AC lost<br/>NOTE: PowerEdge 2950 III 437RQH1 -
> 555-1212"
>
> It puts a <br/> in instead of a "\n".

Hi Bryan,

The default behaviour of check_openmanage is to use HTML linebreaks when run from Nagios, NRPE etc., and regular linebreaks in a console which has a TTY. The reason for this is that the plugin monitors several things, and in case of multiple alerts it's practical to display each of them on a separate line. However, since this behaviour doesn't fit everyone, you can modify it with the '--linebreak' switch. To switch to regular (\n) linebreaks:

    check_openmanage --linebreak=REG

You can also specify any string as a custom linebreak:

    check_openmanage --linebreak=' -- '

If you choose regular linebreaks, the first line will be put in the SERVICEOUTPUT macro, while any subsequent lines will be put in the LONGSERVICEOUTPUT macro. This is how Nagios 3.x handles multiline output from plugins.

PS. In order for the default HTML linebreaks to work as intended in the web frontend, you should set "escape_html_tags=0" in the Nagios config.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
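The split between the two macros described above can be pictured as cutting the plugin output at the first newline. The Python sketch below is a simplification for illustration only. It ignores performance data after a '|' character and any escaping Nagios applies to the long output, and the function name is an assumption.

```python
def split_plugin_output(output):
    """Split multiline plugin output into (first_line, remaining_lines).

    The first element corresponds to what ends up in SERVICEOUTPUT and the
    second to LONGSERVICEOUTPUT, in the simplified model used here.
    """
    first, _, rest = output.partition("\n")
    return first, rest.rstrip("\n")
```

This is why a notification that only uses $SERVICEOUTPUT$ would show just the first alert line when regular linebreaks are selected, and why a template may need $LONGSERVICEOUTPUT$ as well.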
Re: [Nagios-users] check_openmanage: 'Amperage probe 0 [System Board System Level] reads 0 W'
"Tom Sommer" writes:

>>> After upgrading OpenManage to version 6.4.0 on a DELL R410,
>>> check_openmanage 3.6.4 returns
>>>
>>> CRITICAL: Amperage probe 0 [System Board System Level] reads 0 W
>>>
>>>
>>> Is this due to OpenManage changing behavior (bug), or is the hardware
>>> really faulty? (doubtful) :)
>
>> Most likely this is some sort of bug in OpenManage, or something went
>> wrong during upgrade. You should confirm the fault by running
>>
>> omreport chassis pwrmonitoring
>
> # omreport chassis pwrmonitoring
>
> Power Consumption Information is not available on this system because all
> the Power Supply units on your system do not support PMBus or the firmware
> on your system does not support power monitoring.

Strange.. if the system doesn't support power monitoring, the plugin shouldn't complain about it. Are you using check_openmanage via SNMP or locally? (I'm guessing SNMP, and if so there are obvious inconsistencies between what OMSA displays through omreport and what is available via SNMP.) Did power monitoring work at all before upgrading OMSA?

>>> Anyone else seen this?
>>
>> Sorry, no. Very often these problems are resolved simply by restarting
>> OpenManage on the monitored server, or a reboot. The next step is to
>> re-install OpenManage in case something was missed during install/upgrade.
>> If all else fails, contact Dell support.
>
> Tried all but the latter - guess it's a DELL bug.

I forgot one other possible cause: old BIOS and/or firmware. Newer versions of OMSA often need relatively up-to-date BIOS and firmware versions to function normally. You should upgrade all BIOS and firmware on the server.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: "Amperage probe 0 [System Board System Level] reads 0 W"
"Tom Sommer" writes: > After upgrading OpenManage to version 6.4.0 on a DELL R410, > check_openmanage 3.6.4 returns > > CRITICAL: Amperage probe 0 [System Board System Level] reads 0 W > > Is this due to OpenManage changing behavior (bug), or is the hardware > really faulty? (doubtful) :) Hi Tom, Most likely this is some sort of bug in OpenManage, or something went wrong during upgrade. You should confirm the fault by running omreport chassis pwrmonitoring Investigate the "Status" field. The only accepted value is "Ok". > I know I could just disable amperage checks, but I'd like not to. > > Anyone else seen this? Sorry, no. Very often these problems are resolved simply by restarting OpenManage on the monitored server, or a reboot. The next step is to re-install OpenManage in case something was missed during install/upgrade. If all else fails, contact Dell support. Cheers, -- Trond H. Amundsen Center for Information Technology Services, University of Oslo -- Special Offer-- Download ArcSight Logger for FREE (a $49 USD value)! Finally, a world-class log management solution at an even better price-free! Download using promo code Free_Logger_4_Dev2Dev. Offer expires February 28th, so secure your free ArcSight Logger TODAY! http://p.sf.net/sfu/arcsight-sfd2d ___ Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] check_openmanage showing 0 logical drives with OMSA 6.4 and PERC4
Steve Jenkins writes:

> On Tue, Jan 25, 2011 at 3:41 AM, Trond Hasle Amundsen wrote:
>> Interesting.. OMSA is obviously aware of the logical drives, but what
>> does omreport actually say about them? Try running 'omreport storage
>> vdisk controller='.
>
> Looks like omreport sees the controller, but not the VDisk:
>
> # omreport storage vdisk controller=0
> No virtual disks found

Ok, so that is the reason check_openmanage doesn't display any virtual disks. It relies on OMSA for the information, specifically omreport when used in local mode. Based on the issue at hand and your reports about OMSA 6.4 and PERC4 controllers on the linux-poweredge list, it seems that the latest OMSA has serious issues with 8th gen Dell servers.

PS. You may have noticed that the plugin doesn't issue an alert when virtual disks are missing. The reason is that it's perfectly legal and plausible for a system to have no virtual disks. This is the downside of a plugin that both discovers the components and monitors them at the same time: it can't give alerts on missing components unless they should always be present in all servers. A notable exception is controllers, since being unable to display controllers is a common OMSA problem. check_openmanage will complain about missing controllers even though controller-less systems are possible.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
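Since the plugin cannot know how many virtual disks a given server ought to have, a site that does know could enforce a minimum in a small wrapper. The sketch below is not part of check_openmanage. The function names and the "expected minimum" policy are hypothetical, and it assumes the summary line keeps the "N logical drives" wording seen in the plugin output quoted in this thread.

```shell
#!/bin/sh
# Sketch only: a hypothetical wrapper policy, not a check_openmanage feature.

# Extract the logical drive count from a check_openmanage summary line.
logical_drive_count() {
    printf '%s\n' "$1" | sed -n 's/.* \([0-9][0-9]*\) logical drives.*/\1/p'
}

# Compare the count against an expected minimum and return a
# Nagios-style exit code (0 = OK, 1 = WARNING, 3 = UNKNOWN).
check_min_vdisks() {
    summary=$1
    expected=$2
    count=$(logical_drive_count "$summary")
    if [ -z "$count" ]; then
        echo "UNKNOWN - no logical drive count in plugin output"
        return 3
    fi
    if [ "$count" -lt "$expected" ]; then
        echo "WARNING - $count logical drives found, expected at least $expected"
        return 1
    fi
    echo "OK - $count logical drives"
    return 0
}

# Example (plugin options omitted):
#   check_min_vdisks "$(check_openmanage | head -1)" 1
```

Keeping the expectation outside the plugin preserves its discover-and-monitor design while still catching a server whose RAID volume has disappeared from OMSA's view.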
Re: [Nagios-users] check_openmanage showing 0 logical drives with OMSA 6.4 and PERC4
Steve Jenkins writes:

> After upgrading three of the 1850s to Dell OMSA 6.4 today, I noticed
> something strange. The three of them now display in Nagios:
>
> OK - System: 'PowerEdge 1850', SN: '', 3 GB ram (6 dimms), 0
> logical drives, 2 physical drives
>
> OK - System: 'PowerEdge 1850', SN: 'XXX', 12 GB ram (6 dimms), 0
> logical drives, 2 physical drives
>
> OK - System: 'PowerEdge 1850', SN: 'XXX', 4 GB ram (6 dimms), 0
> logical drives, 2 physical drives
>
> All three display 0 logical drives, even though they all have a
> working RAID array.
[snip]
> The strange part is that OMSA 6.4 on the 1850s is clearly aware that
> there's a logical drive, because the GUI shows "Virtual Disk 0 RAID-1"
> in the Storage Dashboard.

Hi Steve,

Interesting.. OMSA is obviously aware of the logical drives, but what does omreport actually say about them? Try running 'omreport storage vdisk controller='. You seem to be running check_openmanage in local mode, so the output from omreport is what matters.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage
Jeffrey Watts writes:

> Thanks Trond! That seems to have fixed it. Here's what I see now:
>
> ./check_openmanage -H pkc-search28 -C tomgeco
> Power Supply 0 [AC] needs attention: Presence detected, Failure detected, AC
> lost
> Voltage sensor 14 [PS 2 Voltage 2] is Unknown reading
>
> It comes up correctly now as a CRIT, too.

Good, thanks for reporting back. I'll include this fix in the next release.

The problem was that when the reading is not available, the plugin assumes that the reading is discrete (i.e. not a number but "good", "bad" etc.). This assumption is wrong in cases where the reading is NOT discrete and simply not available via SNMP. The fixed version will set the reading to "Unknown reading" when the reading can't be obtained.

(However, this situation shouldn't occur at all if OMSA is behaving as it should. Pulling the cable on one power supply would normally lead to a reading of 0 volts for that voltage probe.)

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
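The fallback described above boils down to "never format a reading that was not obtained". The plugin itself is written in Perl. The shell function below only illustrates that idea and is not the plugin's code: the function name `describe_probe` and its arguments are invented for the example, and the output wording copies the plugin messages quoted in these threads.

```shell
#!/bin/sh
# Illustration only: NOT check_openmanage code.

# describe_probe NAME READING UNIT
# Prints "<name> reads <reading> <unit>" for an available reading and
# "<name> is Unknown reading" when the reading is empty (not obtained).
describe_probe() {
    name=$1
    reading=$2
    unit=$3
    if [ -z "$reading" ]; then
        printf '%s is Unknown reading\n' "$name"
    else
        printf '%s reads %s %s\n' "$name" "$reading" "$unit"
    fi
}

describe_probe "Voltage sensor 14 [PS 2 Voltage 2]" "" "V"
describe_probe "Amperage probe 0 [System Board System Level]" "0" "W"
```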
Re: [Nagios-users] Problem with check_openmanage
Jeffrey Watts writes:

> Hello, I'm using Mr. Amundsen's excellent check_openmanage plugin, and I'm
> getting an odd error:
>
> $ check_openmanage -H myserver -C public
> Power Supply 0 [AC] needs attention: Presence detected, Failure detected, AC
> lost
> Voltage sensor 14 [PS 2 Voltage 2] is
> INTERNAL ERROR: Use of uninitialized value $reading in sprintf at
> /usr/lib/nagios/plugins/check_openmanage line 3565.
>
> Has anyone else seen this error? I'm running version 3.6.4. Please let me
> know what additional information is needed.

Hi Jeffrey,

This shouldn't happen, and I think I see where the problem is. Please try the version available here, and let me know if it performs any better:

http://folk.uio.no/trondham/software/test/

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_OpenManage error
"Jeffrey C. Veatch" writes: > To whom it may concern: > > I have been trying to use check_openmanage in my Nagios configuration, but no > matter what I do I get a list of Internal Errors at the end of the returned > test. The only way I can avoid it is by using the debug mode and only > returning the first 80 lines. This however does not warn me of any issues the > server is having. > > Here are some details. The server running OMSA is an R710 running VMware ESX > 4.0.0 Update 2. OMSA version is 6.4. > The nagios server is in a virtual machine running OpenSUSE 11.3. The Nagios > version is 3.2.3. > > If there are other packages that you need to know the version, let me know. > The following is an example of the results that I get. Oh, and in nagios this > ends up being an unknown state for the check. > > VLinux:/usr/local/nagios/libexec # ./check_openmanage -H 192.168.10.21 > OK - System: 'PowerEdge R710', SN: '5QTMZK1', 72 GB ram (18 dimms), 1 logical > drives, 2 physical drives > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 588. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 655. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 708. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 764. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 869. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 952. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1028. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1103. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1168. 
> INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1325. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1531. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1549. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1563. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1577. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1591. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1613. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1633. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1653. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1674. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1702. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1737. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1846. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1968. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1973. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1978. > INTERNAL ERROR: Use of :locked is deprecated at /usr/lib/perl5/vendor_perl/ > 5.12.1/Net/SNMP.pm line 1983. > Thanks for any help you can give me. 

Hi Jeffrey,

Interesting error, never seen this one before :)

check_openmanage will print any perl warnings that occur during execution as internal errors. This is done to avoid situations where the plugin stops working due to perl incompatibilities etc. without your knowledge, as Nagios completely ignores any plugin output to STDERR.

Which version of Net::SNMP are you using? Try 'rpm -q perl-Net-SNMP' to find out. Perl 5.12 deprecated the "locked" attribute, and this was fixed in Net::SNMP version 6.0.1, i.e. the latest release. The changelog for Net::SNMP 6.0.1 has the following:

  - Removed all occurrences of the "locked" attribute that was deprecated in Perl 5.12.0.

I believe this to be a problem with your distribution using an old/incompatible version of Net::SNMP. It seems that for perl 5.12.x you need Net::SNMP 6.0.1 (or any later version).

PS. I found this in the OpenSUSE bugzilla:

https://bugzilla.novell.com/show_bug.cgi?id=629698

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
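To check many hosts for the "Net::SNMP 6.0.1 or later" requirement, the version comparison can be scripted. A minimal sketch, assuming the installed version is a plain dotted number: the helper name `version_ge` is made up for this example, and the `rpm` query in the usage comment assumes an RPM-based system with the package name used above.

```shell
#!/bin/sh
# Sketch only: handles purely numeric dotted versions such as 6.0.1.

# version_ge A B: succeed if dotted version A is greater than or equal to B.
# Missing fields count as 0, so "6.0" is treated as "6.0.0".
version_ge() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa=${a%%.*}
        fb=${b%%.*}
        # Drop the field just read; an exhausted version string stays empty.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
        fa=${fa:-0}
        fb=${fb:-0}
        [ "$fa" -gt "$fb" ] && return 0
        [ "$fa" -lt "$fb" ] && return 1
    done
    return 0
}

# Possible use on an RPM-based system:
#   installed=$(rpm -q --qf '%{VERSION}' perl-Net-SNMP)
#   version_ge "$installed" 6.0.1 || echo "Net::SNMP $installed is too old for perl 5.12"
```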
Re: [Nagios-users] check_openmanage plugin reporting Firmware out of date
Trond Hasle Amundsen writes:

> "Surangiwala, Asif " writes:
>
>> Can we update the check_openmanage script to parse the "Minimum
>> Required Firmware Version" and compare it with the current "Firmware
>> Version" to overcome the OMSA bug?
>
> It is entirely possible to mitigate this bug within the plugin, but I
> don't think that it's a good idea to let the plugin do all version
> parsings and ignore OMSA on a general basis. I have created a version
> that works around this particular bug (version 3.6.2-p1) and made it
> available here:
>
> http://folk.uio.no/trondham/software/omsa-fw-bug/
>
> It simply ignores out-of-date firmware if the firmware and minimum
> firmware versions match those in question. But in order for this to
> work, I also had to turn off checking the global health status, which
> inherits the non-critical status of the controller.
>
> DISCLAIMER: This version is only intended as a temporary solution for
> users of OMSA 6.3.0 who struggle with the recent firmware bug and
> don't want to use blacklisting as a workaround. When OMSA 6.4.0 becomes
> available, you should upgrade OMSA and revert to a regular release of
> check_openmanage.

Hi Asif,

Dell has released OMSA 6.4.0, which fixes the firmware version parsing issue. I have also released a new version of check_openmanage that contains a few compatibility fixes for OMSA 6.4.0:

http://folk.uio.no/trondham/software/check_openmanage.html#download

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage plugin reporting Firmware out of date
"Surangiwala, Asif " writes: > Can we update the check_openmanage script to parse the "Minimum > Required Firmware Version" and compare it with the current "Firmware > Version" to overcome the OMSA bug? It is entirely possible to mitigate this bug within the plugin, but I don't think that it's a good idea to let the plugin do all version parsings and ignore OMSA on a general basis. I have created a version that works around this particular bug (version 3.6.2-p1) and made it available here: http://folk.uio.no/trondham/software/omsa-fw-bug/ It simply ignores out-of-date firmware if the firmware and minimum firmware versions match those in question. But in order for this to work, I also had to turn off checking the global health status, which inherits the non-critical status of the controller. DISCLAIMER: This version is only intended as a temporary solution for users of OMSA 6.3.0 that struggles with the recent firmware bug, and don't want to use blacklisting as a workaround. When OMSA 6.4.0 becomes available, you should upgrade OMSA and revert to a regular release of check_openmanage. Regards, -- Trond H. Amundsen Center for Information Technology Services, University of Oslo -- Increase Visibility of Your 3D Game App & Earn a Chance To Win $500! Tap into the largest installed PC base & get more eyes on your game by optimizing for Intel(R) Graphics Technology. Get started today with the Intel(R) Software Partner Program. Five $500 cash prizes are up for grabs. http://p.sf.net/sfu/intelisp-dev2dev ___ Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] check_openmanage plugin reporting Firmware out of date
"Surangiwala, Asif " writes: > I have Dell Open Manage Server Administrator 6.3.0 installed on some Dell > R710’s with PERC H700 controller. When I run the Nagios plugin > check_openmanage, it reports the following: > > Controller 0 [PERC H700 Integrated]: Firmware '12.10.0-0025' is out of date > > The H700 is running the latest firmware 12.10.0-0025, check_openmanage plugin > is v3.6.2 by Trond H. Amundsen. OMSA is running fine and is not complaining > about any firmware issues. > > The same ‘Firmware out of date’ warning is also given for H800 controllers on > the R710’s having it. > > Is there an issue with the plugin’s interaction with OMSA? Hi Asif, This is a bug in OMSA, not check_openmanage. OMSA is reporting that the firmware is too old while clearly it is not. Dell has stated that the bug will be fixed in the next version of OMSA. For more information, see the following thread on the Linux-Poweredge mailing list: http://lists.us.dell.com/pipermail/linux-poweredge/2010-December/043713.html As a workaround, I suggest using blacklisting to suppress the false warnings until OMSA 6.4.0 is released and deployed on your systems: check_openmanage -b ctrl_fw=all [..other options..] Regards, -- Trond H. Amundsen Center for Information Technology Services, University of Oslo -- Increase Visibility of Your 3D Game App & Earn a Chance To Win $500! Tap into the largest installed PC base & get more eyes on your game by optimizing for Intel(R) Graphics Technology. Get started today with the Intel(R) Software Partner Program. Five $500 cash prizes are up for grabs. http://p.sf.net/sfu/intelisp-dev2dev ___ Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] Check_OpenManage INTERNAL ERROR
Benny Somali writes:

> Ignore my previous question.

Too late, but no problem. My one-line patch is easily reversed :)

> It worked fine now.
> I used a batch script and didn't add a line to turn the echo off so it
> returned special characters.
> So I added @echo off and the Status Information displayed.

Good. Thanks again for reporting this and for testing the beta version.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_OpenManage INTERNAL ERROR
Benny Somali writes:

> Works fine now.

Good, thanks for testing.

> By the way, the Status Information field is blank, is it related to
> the max length of 1023 chars?

Probably not. You shouldn't run into problems with the silly nrpe limit except on large servers with lots of performance data, and then only the perfdata should be affected.

My guess is that the "State" field is also empty for the failed disk. I have an updated beta for you here:

http://folk.uio.no/trondham/software/beta/

It should now report that the disk is "Unknown State".

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_OpenManage INTERNAL ERROR
Benny Somali writes:

> Yes, you are right.
> There is pdisk #1 that has empty vendor ID field.
> The disk in question was original Dell disk, however, it seemed to be
> bad now.
> We have an opened trouble ticket with Dell and expect to get a
> replacement disk.

Ah.. it makes sense that in some circumstances, if the disk is sufficiently bad, Openmanage can't report the vendor. I went ahead and patched this in the plugin. There is a beta version (win32 binary) available here:

http://folk.uio.no/trondham/software/beta/check_openmanage.exe

Please give it a try and let me know if it resolved this issue.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_OpenManage INTERNAL ERROR
Benny Somali writes:

> INTERNAL ERROR: substr outside of string at script/check_openmanage line 1502.
> INTERNAL ERROR: Use of uninitialized value in lc at script/check_openmanage
> line 1502.

Hi Benny,

Thanks for reporting this. The error is related to the vendor of physical disks as reported by omreport. What does 'omreport storage pdisk controller=0' say? I'm guessing that the Vendor field is empty or missing for one of the disks.

Finding the root cause would be interesting. Can you tell if the disk in question is an original disk supplied by Dell? If it isn't, this could be the reason that the vendor field is empty/missing, i.e. Openmanage doesn't recognize it. If it is a Dell drive, we're probably dealing with a rare Openmanage oddity.

In any case, check_openmanage should handle this situation more gracefully. I'll provide a patched version for you to test on Monday.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Question on setting up my own check
Marc Powell writes:

> On Oct 19, 2010, at 2:20 PM, steve f wrote:
>
>> Hello All,
>>
>> I have the following script created to check free space on a remote legacy
>> box via rsh.
>>
>> used=`sudo rsh $1 df -v |grep starlite6 | head -1 | awk '{print $4}'`
>> free=`sudo rsh $1 df -v |grep starlite6 | head -1 | awk '{print $5}'`
>
> Beyond just good programming practice, always use full paths to external
> programs within your scripts. $PATH may not be what you expect it to be,
> especially when being run by the nagios daemon which has a more restrictive
> environment.
>
> # (paths may be different on your system)
> used=`/usr/bin/sudo /usr/bin/rsh $1 /bin/df -v | /bin/grep starlite |
> /usr/bin/head -1 | /usr/bin/awk '{print $4}'`

Or... set PATH before doing anything else, e.g.

    #!/bin/bash
    PATH=/bin:/sbin:/usr/bin:/usr/sbin
    export PATH
    [...rest of script...]

This keeps the script more readable than using full paths everywhere.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
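Putting the two suggestions together, the script could set PATH once and also fetch both columns from a single `df` call instead of running rsh twice. This is only a sketch under assumptions: the helper name `used_and_free` is invented, and columns 4 and 5 are taken from the quoted script, so they depend on the `df -v` layout of that particular legacy box.

```shell
#!/bin/bash
# Set a known PATH first, so the rest of the script does not depend on
# the environment that the nagios daemon happens to provide.
PATH=/bin:/sbin:/usr/bin:/usr/sbin
export PATH

# Reads "df -v" output on stdin and prints "<used> <free>" for the first
# line that mentions the given filesystem name. Columns 4 and 5 follow the
# quoted script and are an assumption about the remote df layout.
used_and_free() {
    awk -v name="$1" 'index($0, name) { print $4, $5; exit }'
}

# Possible use, with the remote host in $1 as in the quoted script:
#   read -r used free < <(sudo rsh "$1" df -v | used_and_free starlite6)
```

Parsing the output once also avoids the small race where the two separate rsh calls see different disk usage.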
Re: [Nagios-users] Bug in check_openmanage ?
rb...@free.fr writes:

> omreport chassis pwrmanagement
> Power Budget Information is not available on this system.
>
> In fact, i solve the problem by updating/resetting the idrac.

Ok, good to know. I'm still a little concerned that there was a hardware problem that check_openmanage didn't identify properly. Please let me know if this happens again.

> But the plugins nagios is always ko and i don't know why ...
>
> ./tmp/check_openmanage -H 10.1.19.193
> SNMP ERROR [cooling]: Requested entries are empty or do not exist.

This is a completely different problem. Cooling devices (i.e. fans) should exist in all servers except blades. Which type of server is this, and do you know if it has fans or not?

The error above is from the Net::SNMP perl module. If the plugin doesn't get the data it expects when polling via SNMP, it will forward the error message from Net::SNMP.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Bug in check_openmanage ?
rb...@free.fr writes:

> Hi Trond,
>
> You are right ...
>
> # omreport chassis
>
> Health
>
> Main System Chassis
>
> SEVERITY : COMPONENT
> Ok : Memory
> Critical : Power Management
> Ok : Processors
> Ok : Temperatures
> Ok : Voltages
> Ok : Hardware Log
> Ok : Batteries
>
> For further help, type the command followed by -?
>
> On the IDRAC i have the message "System Board Current Latch"

This is interesting.. Have you configured power budgeting on this server? What does this command say:

    omreport chassis pwrmanagement

On a regular R805 here it just says:

    Power Budget Information is not available on this system.

but we've never configured or used this feature, so I don't know anything about it. I'm thinking that perhaps check_openmanage should support these and similar configurable OMSA features.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
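When the global health status is bad but every checked component looks fine, listing the non-Ok rows of `omreport chassis` narrows things down quickly. A minimal sketch, assuming the "SEVERITY : COMPONENT" layout shown in the quoted output with no leading indentation; the function name `non_ok_components` is made up for this example.

```shell
#!/bin/sh
# Sketch only: relies on the "<severity> : <component>" rows quoted above.

# Reads "omreport chassis" output on stdin and prints one line per
# component whose severity is not "Ok", as "<component> (<severity>)".
non_ok_components() {
    awk -F' *: *' 'NF == 2 && $1 != "SEVERITY" && $1 != "Ok" { print $2 " (" $1 ")" }'
}

# On the monitored server one might run:
#   omreport chassis | non_ok_components
```

Fed the output quoted above, this prints the single line "Power Management (Critical)", which is the component that the plugin did not check.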
Re: [Nagios-users] Bug in check_openmanage ?
rb...@free.fr writes:

> OOPS! Something is wrong with this server, but I don't know what. The
> global system health status is CRITICAL, but every component check is
> OK. This may be a bug in the Nagios plugin, please file a bug report.
>
> The status change from OK to Unknown...
>
> Is anybody can help me to debbug ?

Hi Rémi,

Thanks for reporting this. As an extra precaution, check_openmanage checks the global health status in addition to each of the components, provided you don't use blacklisting and/or check control, in which case the global check could give a false positive.

This case seems to be a real issue where a component is bad and the global health status reflects this. The component in question is not checked by the plugin for some reason. I'd like to narrow down the suspect pool. If you have login access to this server, can you send the output from the following command:

    omreport chassis

If this command reports that everything is OK, we're probably dealing with a storage problem.

Just to rule out blacklisting bugs etc., what is the command definition for check_openmanage in your Nagios config?

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage and Open Manage 6.3
Luca Olivotto writes:

> that is the output:
> SNMPv2-SMI::enterprises.674.10893.1.20.130.1 = No Such Object available on
> this agent at this OID

OK, this confirms that the problem lies with OMSA, specifically its SNMP functionality. I'm afraid I can't offer many clues about how to fix this. I would try restarting the OMSA and SNMP services, and if that doesn't work, reinstall OMSA completely.

Best of luck,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage and Open Manage 6.3
Luca Olivotto writes:

> yes, i see the perc 6i controller.

OK, thanks. I then suspect that the problem lies with the SNMP part of OMSA. Can you run the following command from your Nagios server to confirm:

  snmpwalk -v2c -c <community> <hostname> 1.3.6.1.4.1.674.10893.1.20.130.1

The result should look something like this:

  $ snmpwalk -v2c -c public foobar 1.3.6.1.4.1.674.10893.1.20.130.1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.1.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.2.1 = STRING: "PERC 6/i Integrated"
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.3.1 = STRING: "DELL"
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.4.1 = INTEGER: 6
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.5.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.7.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.8.1 = STRING: "6.2.0-0013"
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.9.1 = INTEGER: 256
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.10.1 = INTEGER: 0
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.11.1 = INTEGER: 6
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.12.1 = INTEGER: 2
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.37.1 = INTEGER: 3
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.38.1 = INTEGER: 3
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.39.1 = STRING: "\\0"
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.40.1 = INTEGER: 3
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.41.1 = STRING: "00.00.04.17-RH1 "
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.42.1 = STRING: "embedded"
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.43.1 = INTEGER: 99
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.47.1 = INTEGER: 2
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.48.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.49.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.50.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.51.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.52.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.53.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.54.1 = INTEGER: 32
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.57.1 = INTEGER: 99
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.58.1 = INTEGER: 99

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage and Open Manage 6.3
Luca Olivotto writes:

> Hello,
> i have a problem with the plugin check_openmanage .
>
> if i use this command:
> ./check_openmanage -H xx.xx.xx.xx
>
> i get this result:
> OOPS! Something is wrong with this server, but I don't know what. The global
> system health status is WARNING, but every component check is OK. This may be
> a bug in the Nagios plugin, please file a bug report.
>
> The server that i'm checking is a PowerEdge 2950 and i suppose that the
> problem is the version of OpenManage installed on the server. The version is
> 6.3 and the only warning shown via the webinterface are the old version of
> the firmware/driver/storeDriver of the controller.
> If i try that command
>
> check_openmanage -H 10.10.10.6 -b ctrl_fw=all/ctrl_driver=all/ctrl_stdr=all -s
> -e
> the output is:
> OK - System: 'PowerEdge 2950', SN: 'xx', 16 GB ram (4 dimms), 0 logical
> drives, 0 physical drives
>
> as you can see the disk are not checked(that server has a broked mirror).
>
> the version of check_openmanage is 3.6.0

Hi Luca,

Your analysis is correct. OMSA doesn't display storage info via SNMP, but there is something wrong with a storage component. For some reason, OMSA senses the storage failure and the global health status inherits this failure status, but OMSA doesn't display the storage. This condition triggers the behaviour you are seeing.

The plugin searches for storage controllers. If it doesn't find any controllers, it concludes that there is no storage at all and skips the subsequent checks of disk drives etc. Do you see any controllers when you run this command on the server:

  omreport storage controller

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage ignores blacklist directive
"C. Bensend" writes: >> Despite of giving it the parameter to ignore Warnings about the >> controller firmware, it still gives a Warning Status: >> >> /usr/lib/nagios/plugins/check_openmanage -b ctrl_fw -s -H 192.168.2.137 > > 'ctrl_fw' isn't the complete option you need to give there - you > also need to specify the ID per: > > http://folk.uio.no/trondham/software/check_openmanage.8.html > > Try 'ctrl_fw=0,1' Yes, or: ctrl_fw=all ..if you wish to blacklist this for all controllers and aren't interested in specifying controller IDs. Cheers, -- Trond H. Amundsen Center for Information Technology Services, University of Oslo -- This SF.net email is sponsored by Make an app they can't live without Enter the BlackBerry Developer Challenge http://p.sf.net/sfu/RIM-dev2dev ___ Nagios-users mailing list Nagios-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nagios-users ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. ::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage
Max Williams writes:

> Excellent, sorted, everything reports as OK now.

Good. I'll try to make a release with these changes in the next couple of days.

> Thanks so much Trond, amazing support and an amazingly useful plugin!

Glad you like it, Max. Thanks for reporting this issue :)

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage
Max Williams writes:

> Here is the output, the inactive temperature probe is sorted but the
> missing EMM still produces an alert:
>
> OK | 1:1:0:1 | Temperature Probe 1 in enclosure 3 [MD1000] is Inactive

This one works as expected :)

> OK | 1:1:0:2 | Temperature Probe 2 in enclosure 3 [MD1000]: C ( max)
> OK | 1:1:0:3 | Temperature Probe 3 in enclosure 3 [MD1000]: C ( max)

Hmm... something strange is going on here. I wonder why this happens, since the values are there in the SNMP output you attached previously. Anyway, I've added some extra checking in the code to make it report better if the reading is unavailable for some reason. It should now report simply:

  Temperature Probe 0 in enclosure 2:0:0 [MD1000] is Ready

if the temp reading is not an integer and OMSA reports the status as OK.

> CRITICAL | 1:1:0:1 | EMM 1 in enclosure 3 [MD1000] needs attention: Not
> Installed

Ah, I misread the SNMP output. The status is "Unknown" when reported by omreport, but "Other" when reported with SNMP. One little annoying difference between the two. The output should be:

  EMM 0 in enclosure 2:0:0 [MD1000] is Not Installed

with an OK state. I've created a second test version:

  http://folk.uio.no/trondham/software/beta/check_openmanage

Please give this one a try and see if it performs better.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage
Max Williams writes:

> Both of the new enclosures show the same output so perhaps these just
> have a different configuration to the others we have here.

Yes. I suspect that this is related to one EMM not being installed. My guess is that the inactive temperature sensor is located in the EMM, but there is no way to tell since neither the omreport output nor the SNMP output reveals the location of the temperature sensors. Or perhaps the EMM is needed to activate the sensor. We always order our MD1000s with 2 EMMs, so this is something that I haven't had the opportunity to test.

I have created a test version for you to try. This version should:

  * report inactive temperature sensors as OK
  * report EMMs with state "Not Installed" as OK

In addition, it checks that the readings from the sensors are in fact digits before attempting to print the values. The test version is located here:

  http://folk.uio.no/trondham/software/beta/

Try it with the '-d' option to see that it reports these things properly.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
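The checks listed in this message can be sketched as below. This is a rough Python illustration rather than the plugin's Perl code: the function names, the set `NON_ALERTING_STATES` and the exact "reads N C" wording are assumptions made for the example; only the "Inactive" and "Not Installed" states and the digits check come from the thread.

```python
import re

# States that this thread says should be reported as OK (assumed set name).
NON_ALERTING_STATES = {"Inactive", "Not Installed"}


def needs_attention(omsa_reports_ok, state_text):
    """Decide whether a probe or EMM should raise an alert (hypothetical rule)."""
    if state_text in NON_ALERTING_STATES:
        return False
    return not omsa_reports_ok


def temp_probe_message(label, state_text, reading):
    """Build the message for one temperature probe without formatting a missing value."""
    if state_text == "Inactive":
        return f"{label} is Inactive"
    text = "" if reading is None else str(reading).strip()
    if not re.fullmatch(r"-?\d+", text):
        # No usable number: fall back to the state text instead of printing a blank.
        return f"{label} is {state_text}"
    return f"{label} reads {int(text)} C"


label = "Temperature Probe 1 in enclosure 3 [MD1000]"
print(temp_probe_message(label, "Inactive", None))
print(temp_probe_message(label, "Ready", None))
print(temp_probe_message(label, "Ready", "27"))
print(needs_attention(False, "Not Installed"))
```

Validating the reading before formatting it is what avoids the "uninitialized value in sprintf" class of error reported in this thread.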
Re: [Nagios-users] check_openmanage: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/check_openmanage
Max Williams writes:

> Hi,
>
> After adding more storage to a couple of our servers we are getting this
> error:
>
> [r...@host ~]# /usr/lib64/nagios/plugins/check_openmanage -C password -b
> ctrl_driver=0,1,2 -b ctrl_fw=0,1,2 -b intr=0 -H host2
>
> Temperature Probe 1 in enclosure 3 [MD1000] is Inactive C at ( max)
>
> EMM 1 in enclosure 3 [MD1000] needs attention: Not Installed
>
> INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
> plugins/check_openmanage line 2312.
>
> INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
> plugins/check_openmanage line 2312.
>
> INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
> plugins/check_openmanage line 2318.
>
> INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
> plugins/check_openmanage line 2318.
>
> INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
> plugins/check_openmanage line 2318.
>
> INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/
> plugins/check_openmanage line 2318.
>
> [r...@host ~]#
>
> We didn't get this error before adding a new cabinet of disks which now brings
> the total up to 47 (2x internal disk and 3x full MD1000s).
>
> Has any one else come across this error? I am not perl literate so not sure
> how to debug or fix this.

Hi Max,

This is interesting. I've never seen "Inactive" temperature sensors in external enclosures. Also, the fact that the plugin reports missing EMMs seems like a misfeature. Can you post the output from the following commands?

On the monitored host:

  omreport storage enclosure controller=<id> enclosure=<id> info=temps
  omreport storage enclosure controller=<id> enclosure=<id> info=emms

Fill in the IDs for each controller/enclosure pair. You'll get the enclosure and controller IDs with the commands

  omreport storage controller
  omreport storage enclosure

Also, since you're checking with SNMP, I'll need the output from an snmpwalk of the enclosures with regard to temperatures and EMMs. From the Nagios server:

  snmpwalk -v2c -c <community> <hostname> 1.3.6.1.4.1.674.10893.1.20.130.11
  snmpwalk -v2c -c <community> <hostname> 1.3.6.1.4.1.674.10893.1.20.130.13

If you are uncomfortable with posting this information on the mailing list, feel free to email me directly.

Debug output from the plugin could also be useful:

  check_openmanage -H <hostname> -C <community> -d

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage plugin error
Andrea Ballarati writes:

> Nagios reports error from the plugin in subject, we have another Dell
> PowerEdge 1950 for which no errors are reported.
> This is the output of check_openmanage -d
>
>    System:      PowerEdge 1800
>    ServiceTag:                      OMSA version:    4.5.0
>    BIOS/date:   A05 09/21/2005      Plugin version:  3.5.7
> -----------------------------------------------------------------
>    Storage Components
> =================================================================
>   STATE  |  ID   | MESSAGE TEXT
> ---------+-------+-----------------------------------------------
>  WARNING | 0     | Controller 0 [CERC SATA 1.5/2s] needs attention: Degraded
>       OK | 0:0:0 | Array Disk 0:0 [1.0TB] on ctrl 0 is Online
>       OK | 0:0:1 | Array Disk 0:1 [1.0TB] on ctrl 0 is Online
>       OK | 0:0   | Logical drive 0 'Windows Disk 0' [RAID-1, 931.48 GB] on ctrl 0 is Ready
>       OK | 0:0   | Channel 0 [] on controller 0 is Ready
> -----------------------------------------------------------------
>    Chassis Components
> =================================================================
>   STATE  |  ID   | MESSAGE TEXT
> ---------+-------+-----------------------------------------------
>       OK | 1     | Memory module 1 [DIMM1_A, 512 MB] is Ok
>       OK | 2     | Memory module 2 [DIMM1_B, 512 MB] is Ok
>       OK | 1     | Chassis fan 1 [BMC Fan 1]: 1500
>       OK | 2     | Chassis fan 2 [BMC Fan 2]: 1500
>       OK | 0     | Power Supply 0 [VRM]: Presence detected
>       OK | 1     | Power Supply 1 [VRM]: Presence detected
>       OK | 0     | Temperature Probe 0 [PROC_1 Temp] reads 38 C (max=120/125)
>       OK | 1     | Temperature Probe 1 [BMC Ambient Temp] reads 22 C (min=8/3, max=40/45)
>       OK | 2     | Temperature Probe 2 [BMC Planar Temp] reads 33 C (min=8/3, max=62/67)
>       OK | 3     | Temperature Probe 3 [BMC VRD 0 Temp] reads 31 C (min=8/3, max=70/75)
>       OK | 4     | Temperature Probe 4 [BMC VRD 1 Temp] reads 27 C (min=8/3, max=70/75)
>       OK | 0     | Processor 0 [Intel Xeon 3.00GHz] is Present
>       OK | 0     | Voltage sensor 0 [BMC CMOS Battery] is 3.070 V
>       OK | 1     | Voltage sensor 1 [PROC_1 VCORE] is Good
>       OK | 2     | Voltage sensor 2 [BMC PROC VTT] is Good
>       OK | 3     | Voltage sensor 3 [BMC 1.5V PG] is Good
>       OK | 4     | Voltage sensor 4 [BMC 1.8V PG] is Good
>       OK | 5     | Voltage sensor 5 [BMC 3.3V PG] is Good
>       OK | 6     | Voltage sensor 6 [BMC 5V PG] is Good
>       OK | 0     | Chassis intrusion 0 detection: Ok (Not Breached)
> -----------------------------------------------------------------
>    Other messages
> =================================================================
>   STATE  | MESSAGE TEXT
> ---------+-------------------------------------------------------
>       OK | ESM log health is Ok (less than 80% full)
>
> INTERNAL ERROR: Use of uninitialized value in numeric eq (==) at
> /usr/lib/nagios/plugins/check_openmanage line 1380.
> INTERNAL ERROR: Use of uninitialized value in numeric eq (==) at
> /usr/lib/nagios/plugins/check_openmanage line 1380.
> INTERNAL ERROR: Use of uninitialized value in sprintf at
> /usr/lib/nagios/plugins

Hi Andrea,

check_openmanage is designed to work with relatively recent OMSA versions. You are using OMSA version 4.5.0, which is very old. The server in question (PowerEdge 1800) is supported by newer OMSA, so the solution is an OMSA upgrade to the latest version (6.2.0).

OMSA versions 5.3.0 and later are OK to use with check_openmanage, and I've had reports that 5.1.0 and 5.2.0 work as well (but no guarantee). Anything older will yield strange results or will simply not work.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
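The version guidance above can be expressed as a small comparison. This is a sketch in Python, not something the plugin is known to do; `omsa_support` and its three labels are hypothetical, and treating everything from 5.1.0 up to (but not including) 5.3.0 as "unverified" is an assumption that extends the two versions actually named in the reply.

```python
def parse_version(text):
    """Turn a dotted version string such as "6.2.0" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def omsa_support(version):
    """Classify an OMSA version according to the guidance in this thread.

    Hypothetical helper; the labels are illustrative only.
    """
    v = parse_version(version)
    if v >= (5, 3, 0):
        return "supported"
    if (5, 1, 0) <= v < (5, 3, 0):
        # 5.1.0 and 5.2.0 were reported to work, without guarantee.
        return "unverified"
    return "unsupported"


for candidate in ("4.5.0", "5.2.0", "5.3.0", "6.2.0"):
    print(candidate, omsa_support(candidate))
```

Comparing tuples of integers rather than strings avoids the usual pitfall where "10.0.0" sorts before "5.3.0".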
Re: [Nagios-users] check_openmanage weirdness
Greg Etling writes:

> Trond, thanks for your quick reply. Unfortunately it does appear we have
> a disconnect between OMSA and SNMP:

[snip]

> [r...@nagios ~]# snmpwalk -v2c -c * testserver
> 1.3.6.1.4.1.674.10893.1.20.130.1
> SNMPv2-SMI::enterprises.674.10893.1.20.130.1 = No Such Object available
> on this agent at this OID

Hmm, you should see output like:

  $ snmpwalk -v2c -c community hostname 1.3.6.1.4.1.674.10893.1.20.130.1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.1.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.2.1 = STRING: "PERC 6/i Integrated"
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.3.1 = STRING: "DELL"
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.4.1 = INTEGER: 6
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.5.1 = INTEGER: 1
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.7.1 = INTEGER: 30
  SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.8.1 = STRING: "6.2.0-0013"
  [...]

> It appears to only have data under the 1.3.6.1.4.1.674.10892 and
> 1.3.6.1.4.1.674.10899 trees. Thoughts?

Unfortunately my Windows knowledge is rather limited. I have never installed OMSA on Windows, but I suspect that there are options to choose from during the install. The first thing I would do is to re-install OMSA step by step and try to figure out what I might have missed.

On Linux, the install procedure and packaging of the OMSA components changed with version 6.2.0. This may very well be the case with the Windows version as well.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
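The difference between the two snmpwalk results quoted above is easy to test for mechanically. The following Python sketch assumes the snmpwalk output has already been captured as text; `storage_table_present` is a hypothetical helper name, and the only string it looks for is the "No Such Object" message that appears in this thread.

```python
def storage_table_present(snmpwalk_output):
    """Return True if the walk of the storage controller subtree returned real rows.

    Hypothetical helper: an empty result or a "No Such Object" answer is taken
    to mean that the agent does not expose the storage subtree.
    """
    lines = [line for line in snmpwalk_output.splitlines() if line.strip()]
    if not lines:
        return False
    return not any("No Such Object" in line for line in lines)


broken = (
    "SNMPv2-SMI::enterprises.674.10893.1.20.130.1 = "
    "No Such Object available on this agent at this OID"
)
working = (
    'SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.2.1 = STRING: "PERC 6/i Integrated"\n'
    'SNMPv2-SMI::enterprises.674.10893.1.20.130.1.1.3.1 = STRING: "DELL"'
)
print(storage_table_present(broken))
print(storage_table_present(working))
```

A check like this only tells you whether the subtree is exposed over SNMP; it says nothing about why OMSA is not publishing it.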
Re: [Nagios-users] check_openmanage weirdness
Greg Etling writes:

> I have just started implementing some check_openmanage checks on my
> servers, and have run into some odd behavior with the combination of
> Windows 2003, OM 6.2 and the SNMP check. It appears that this
> combination is having issues with the drive/controller reporting.
> Initially things worked fine under OM 5.4, until the SNMP service would
> die (other than that, Mrs. Lincoln...) - so i upgraded to OM 6.2, when I
> observed the following behaviour.
>
> When the check is run without any blacklisting, the plugin reports that
> there is a global status WARNING, but all components are OK - the
> WARNING is coming from out of date Firmware/Driver versions as listed below:
>
> --
> Firmware/Driver Information for Controller PERC 6/i Integrated
> Firmware Version                            6.0.3-0002
> Minimum Required Firmware Version           6.2.0-0012
> Driver Version                              2.14.00.32
> Minimum Required Driver Version             2.23.00.32
> Storport Driver Version                     5.2.3790.3959
> Minimum Required Storport Driver Version    5.2.3790.4173
> --
>
> Now when run in debug mode, I noticed that it had no information about
> the drives at all (note the beta version - same output as plugin v3.5.7):

[snip]

This is the key to this problem. There are warnings associated with the storage subsystem, but that information is not available via SNMP for some reason. The global status of the server inherits these warnings, however, so the plugin reports this as some unknown error.

Does omreport report anything on storage? Try:

  omreport storage controller

If that works, try getting the same information via SNMP:

  snmpwalk -v2c -c <community> <hostname> 1.3.6.1.4.1.674.10893.1.20.130.1

Usually the problem is that the storage components of OMSA are not installed, in which case neither command will work.

> And the Status as reported to Nagios believes that there are no disks
> whatsoever on the server:
> --
> OK - System: 'PowerEdge 2950', SN: 'XXX', hardware working fine, 0
> logical drives, 0 physical drives
> --

Yes, that is the normal behaviour when the plugin doesn't find any storage components. The plugin can't report this as a problem, since it's OK for a server not to have storage reported by OMSA (which only reports on supported storage), or any storage at all for that matter (diskless servers).

> This has been replicated on several identical systems.
>
> I'm a bit stumped as to where the problem lies. Please let me know if
> you need further information from me.

You should check your OMSA install. The storage parts of it were probably not installed. It may also be that there is something wrong with the OMSA+SNMP integration, which prevents storage information from being presented. That would be trickier to debug.

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Internal error
Richard Hagen writes:

> I recently installed a new DELL Poweredge 2970 with W2k8 and installed also
> DELL OMSA.
>
> When i read the status from nagios i get the following error:
>
> Amperage probe 0 [PS 1 Current 1] reads 0 A
> Amperage probe 1 [PS 2 Current 2] reads 0 A
> INTERNAL ERROR: Use of uninitialized value in division (/) at /usr/lib/nagios/
> plugins/check_openmanage line 3536.
> INTERNAL ERROR: Use of uninitialized value in division (/) at /usr/lib/nagios/
> plugins/check_openmanage line 3536.
> INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib/nagios/
> plugins/check_openmanage line 3562.

Hi Richard,

This happens because the values (i.e. the readings from the amperage probes) are not reported by SNMP, while the rest of the data about the probes is reported (status, type, name etc.). There is something wrong with OpenManage on this server. What is the output from this command:

  omreport chassis pwrmonitoring

That being said, the plugin could handle this better. Please try the beta version available here:

  http://folk.uio.no/trondham/tmp/

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Check_multipath
Brian O'Mahony writes:

> It works locally though, and I have
>
> Cmnd_Alias MULTIPATH=/sbin/multipath -l
>
> nagios ALL= NOPASSWD: MULTIPATH

My money is on "requiretty". Locally you have a TTY, while NRPE does not. The "requiretty" setting in /etc/sudoers must be turned off. Comment out this line in /etc/sudoers:

  Defaults    requiretty

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage 3.5.6
Nicole Hähnel writes:

>>> CRITICAL: [xxx] Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0
>>> needs attention:
>>> -- SYSTEM: PowerEdge 830, SN: xxx
>>> INTERNAL ERROR: Use of uninitialized value in string eq at
>>> /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1432.
>>> INTERNAL ERROR: Use of uninitialized value in sprintf at
>>> /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1445.

Mostly for the list archive: We took this off the list to do some back-and-forth debugging and testing, and the issue is now resolved. A new version of check_openmanage has been released, which prints the above correctly as:

  CRITICAL: Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 needs attention: Undefined value 4096

This relates to SNMP returning values that are not defined in the MIBs. Such values are now reported as "Undefined value" followed by the raw value that was returned.

Regards,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
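The fix described above amounts to a table lookup with a fallback. A minimal Python sketch of that idea follows; `status_text` is a hypothetical name, and the two entries in `EXAMPLE_STATES` are illustrative placeholders rather than values copied from the Dell MIB.

```python
# Illustrative placeholder table; NOT copied from the Dell storage MIB.
EXAMPLE_STATES = {
    1: "Ready",
    3: "Online",
}


def status_text(value, table):
    """Translate a numeric SNMP state, falling back for values the table lacks."""
    return table.get(value, f"Undefined value {value}")


print(status_text(3, EXAMPLE_STATES))     # known value
print(status_text(4096, EXAMPLE_STATES))  # value missing from the table
```

The second call prints "Undefined value 4096", the same text as in the corrected plugin output, instead of leaving an empty string that later triggers an uninitialized-value warning.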
Re: [Nagios-users] Keeping the Nagios Configuration Sane
David Wallis writes:

> Matt Simmons wrote:
>> Hi All,
>>
>> I'm attending the 2010 Professional IT Community Conference
>> (http://www.picconf.org) being held in New Brunswick, NJ, and I'm
>> giving a talk about staying sane while working with the Nagios
>> configuration.
>>
>> The talk will be 45 minutes long, and will primarily be an outshoot
>> from this article that I wrote on my blog:
>> http://www.standalone-sysadmin.com/blog/2009/07/nagios-config/
>>
>> I could talk about that and some other things that I've been figuring
>> out, but I was wondering if anyone had any tricks or tips for dealing
>> with the Nagios config? Is there anything special that you do to keep
>> things straight?
>>
>> I'm going to be putting my slides and any additional material online
>> following the conference, so hopefully someone else can get some use
>> from it.
>>
>> By the way, if anyone on this list is in the north east of the US, you
>> should come visit the conference. Without training, it's only $275 for
>> 2 days. With a full day and a half of training, it's still only $400
>> for the whole shebang. Anyway, this isn't a sales email.
>>
>> I'm looking forward to any tips you would want to share. Thanks in advance!
>>
>> --Matt
>>
>
> I manage the Nagios installation for 3 different domains at work, each
> domain with several hundred servers and clients. I quickly reached the
> "There's got to be a better way!" point when trying to maintain
> configuration files that were getting pretty big. I was using all the
> tricks listed in the Nagios docs, but it was still pretty crazy.
>
> The approach I took was to write a configuration generator program that
> uses a meta-config file to generate the hosts.cfg, hostgroups.cfg and
> services.cfg config files. The meta-config file allows one to set up
> cascading configuration variables, and then has one line per monitored
> host, that includes things like host groups, parents, etc, and then a
> list of services to monitor.
>
> I also created the idea of "meta-services" that allow the program to
> generate configuration data for any number of related services with a
> single service name in the meta-config file. For instance, including the
> service "weball" will cause the configuration generator to create
> service entries for every plumbed interface on the web server, checks
> for every virtual server (http and https), and checks for every SSL cert
> that it finds. In one domain, a 400 line meta-config file generates a
> 20,000 line services.cfg file.
>
> Rather than updating individual config files, I just update the
> meta-config file and then regenerate all of the *.cfg files. I've been
> using this for several years with very good results.

That's an interesting approach, and we do something similar. It goes without saying that when the number of hosts grows to several hundred, maintaining the Nagios config for hosts, hostgroups etc. the regular way becomes an arduous task. This is especially true if your environment is largely heterogeneous.

We have a list of our servers maintained in a homegrown application using a topic map as base. Large parts of the Nagios config are generated from this. I think this is an important point. Usually, you already have a list of your servers, and you can use this list as a base for the Nagios config as well. The format of the host list is not important, but deciding that this is the starting point for the Nagios hosts config is. When a host is added to or removed from the list, it is added to or removed from Nagios. This is very much like David's approach, i.e. a list of hosts in a format that is easier to handle and maintain.

In addition, we have defined several "roles" that a server may have, such as dell-hardware, hp-hardware, mail-mx-server, web-server, dns-server etc. A simple perl script runs every day on each host and determines its roles. This information is collected and kept centrally. Parts of the Nagios config (hostgroups, servicegroups) are generated based on these roles.

NRPE config is the same on all hosts. It is maintained centrally and distributed to each host daily. Adding stuff to the sudoers file (needed for some plugins) is done automatically based on the host's roles.

Another point: We generally don't use plugins that require us to configure the plugin and tailor it for each individual host. For example, for filesystem monitoring we have created a custom plugin that monitors all partitions by default. It has an optional configuration file locally on each host where we can set individual thresholds if needed.

Thinking like this should come easy to system administrators who are used to dealing with large installations. It's all about automation :)

Cheers,
--
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
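The idea of generating host and hostgroup definitions from an existing host list with roles can be sketched as follows. This is a minimal Python illustration, not the tooling described in this thread: the `HOSTS` data, the function names and the "generic-host" template name are all assumptions; only the `define host` and `define hostgroup` object syntax is standard Nagios.

```python
# Hypothetical host list; in practice this would come from your own inventory.
HOSTS = [
    {"name": "web01", "address": "192.0.2.10", "roles": ["dell-hardware", "web-server"]},
    {"name": "db01", "address": "192.0.2.20", "roles": ["dell-hardware"]},
]


def host_block(host, template="generic-host"):
    """Render one Nagios host definition (template name is an assumption)."""
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {host['name']}\n"
        f"    address    {host['address']}\n"
        "}\n"
    )


def hostgroup_blocks(hosts):
    """Render one hostgroup per role, with the hosts that have that role as members."""
    members = {}
    for host in hosts:
        for role in host["roles"]:
            members.setdefault(role, []).append(host["name"])
    return "".join(
        "define hostgroup {\n"
        f"    hostgroup_name  {role}\n"
        f"    members         {','.join(sorted(names))}\n"
        "}\n"
        for role, names in sorted(members.items())
    )


print("".join(host_block(h) for h in HOSTS))
print(hostgroup_blocks(HOSTS))
```

Because the output is regenerated from the list every time, removing a host from the list is enough to remove it from every hostgroup as well.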
Re: [Nagios-users] check_openmanage: "Failed to load external entity" with OMSA v5.3.0 on 2003
"C. Bensend" writes:

> I am working my way through some more Windows NSClient++
> installs, and I've hit a 2003 server that is not happy with
> check_openmanage (please forgive the horrible formatting):
>
> C:\Program Files\NSClient++>check_openmanage -e -p -b bat_charge=ALL/ctrl_fw=ALL/ctrl_driver=ALL --omreport F:\dellopenmanage\oma\bin\omreport.exe
>
> I/O warning : failed to load external entity "E:/dellopenmanage/xslroot/oma/cli/omclpr.xsl"
> error
> xsltParseStylesheetFile : cannot parse E:/dellopenmanage/xslroot/oma/cli/omclpr.xsl
> I/O warning : failed to load external entity "E:/dellopenmanage/xslroot/oma/cli/omclpr.xsl"
> error
> xsltParseStylesheetFile : cannot parse E:/dellopenmanage/xslroot/oma/cli/omclpr.xsl
> I/O warning : failed to load external entity "E:/dellopenmanage/xslroot/oma/cli/omclpr.xsl"
> error
> xsltParseStylesheetFile : cannot parse E:/dellopenmanage/xslroot/oma/cli/omclpr.xsl
> Couldn't close filehandle for command '"F:\dellopenmanage\oma\bin\omreport.exe" -? 2>&1':
> Problem running 'omreport storage controller': Error! XML Transformation failed
> Problem running 'omreport chassis memory': Error! XML Transformation failed
> Problem running 'omreport chassis fans': Error! XML Transformation failed
> Problem running 'omreport chassis pwrsupplies': Error! XML Transformation failed
>
> Problem running 'omreport chassis temps': Error! XML Transformation failed
> Problem running 'omreport chassis processors': Error! XML Transformation failed
> Problem running 'omreport chassis volts': Error! XML Transformation failed
> Problem running 'omreport chassis batteries': Error! XML Transformation failed
> Problem running 'omreport chassis pwrmonitoring': Error! XML Transformation failed
> Couldn't close filehandle for command '"F:\dellopenmanage\oma\bin\omreport.exe" chassis pwrmonitoring -fmt ssv':
> Problem running 'omreport chassis intrusion': Error! XML Transformation failed
> Couldn't close filehandle for command '"F:\dellopenmanage\oma\bin\omreport.exe" system esmlog -fmt ssv':
> -- SYSTEM: N/A, SN: N/A
>
> *** Note that I specified F: for the path to OMSA, and the output
> is complaining about an external entity in E:. That is not a typo.
>
> This is OMSA version 5.3.0 which I have dozens of, and this is
> the first time I've seen this.
>
> This host is a DC, so while I believe it's probably a case of
> "uninstall OMSA and re-install OMSA", I'm hoping someone has seen
> this before and I can avoid the reboot.
>
> Trond? Anyone?

I've seen XML errors from OpenManage when probing disks with broken
firmware, but this seems to be something else. Do you get the same XML
errors when you run the failing commands manually on the server? If so,
I'm afraid there isn't much check_openmanage can do about it. A
reinstall of OpenManage is probably the next logical step..

Cheers,
-- 
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
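
Running the failing commands manually, as suggested above, can be scripted if there are many of them. The following is only a sketch under stated assumptions: the subcommand list and the "-fmt ssv" argument are copied from the error output quoted in this thread, while the function names and the injectable `runner` parameter are invented here for illustration. It simply looks for the literal string "XML Transformation failed" in whatever omreport prints.

```python
import subprocess

# Subcommands taken from the "Problem running ..." lines quoted above.
SUBCOMMANDS = [
    "storage controller",
    "chassis memory",
    "chassis fans",
    "chassis pwrsupplies",
    "chassis temps",
]


def xml_transform_failed(output):
    """True if the omreport output contains the XML transformation error."""
    return "XML Transformation failed" in output


def probe(omreport, subcommands=SUBCOMMANDS, runner=subprocess.run):
    """Run each subcommand and map it to True if it shows the XML error.

    `runner` defaults to subprocess.run and exists only so the function
    can be exercised without a real omreport binary.
    """
    results = {}
    for sub in subcommands:
        proc = runner([omreport, *sub.split(), "-fmt", "ssv"],
                      capture_output=True, text=True)
        results[sub] = xml_transform_failed(proc.stdout + proc.stderr)
    return results


# Example call on the affected server (path as given in the report):
#   probe(r"F:\dellopenmanage\oma\bin\omreport.exe")
```

If every subcommand reports the error when run this way, the problem lies in the OpenManage installation rather than in the plugin, which matches the conclusion in the reply above.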
Re: [Nagios-users] Problem with check_openmanage 3.5.6
Nicole Hähnel writes:

> I tested the new version:
>
> CRITICAL: [xxx] Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 needs attention:
> -- SYSTEM: PowerEdge 830, SN: xxx
> INTERNAL ERROR: Use of uninitialized value in string eq at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1432.
> INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1445.

Hmm.. OK, new test:

  http://folk.uio.no/trondham/tmp/check_openmanage-3.5.7-beta2

Regards,
-- 
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] Problem with check_openmanage 3.5.6
Nicole Hähnel writes:

> it's a windows server.
> So I'm using check_openmanage with snmp.
>
> check_openmanage -s -C $ARG1$ -H $HOSTADDRESS$ -e -i -p --state --check intrusion=1,alertlog=1,esmlog=1 -o 3 --htmlinfo de
>
> List of Physical Disks on Controller CERC SATA 1.5/6ch (Slot 4)
>
> Controller CERC SATA 1.5/6ch (Slot 4)
> ID                        : 0:0
> Status                    : Unknown
> Name                      : Physical Disk 0:0
> State                     : Unknown
> Failure Predicted         : No
> Progress                  : Not Applicable
> Bus Protocol              : SATA
> Media                     : HDD
> Capacity                  : 149.05 GB (160040681472 bytes)
> Used RAID Disk Space      : 0.00 GB (0 bytes)
> Available RAID Disk Space : 0.00 GB (0 bytes)
> Hot Spare                 : No
> Vendor ID                 : WDC
> Product ID                : WD1600JS-55MHB0
> Revision                  : 02.0
> Serial No.                : WD-WCANM3083963
> Negotiated Speed          : Not Available
> Capable Speed             : Not Available
> Manufacture Day           : Not Available
> Manufacture Week          : Not Available
> Manufacture Year          : Not Available
> SAS Address               : Not Available

Ok, so the status and state are both "Unknown". I'm guessing that these
values are completely missing in the SNMP output, which is why perl
chokes on it. I've added some robustness in the code that should handle
this case properly. Please try the beta version (3.5.7-beta1) available
here:

  http://folk.uio.no/trondham/tmp/check_openmanage-3.5.7-beta1

The plugin will give an alert on the drive, which in my opinion is the
correct thing to do. You can always blacklist the drive. The cause of
the error is obviously that this is a non-Dell drive, which Openmanage
doesn't know how to handle.

BTW, you can reduce your command definition to this:

  check_openmanage -s -C $ARG1$ -H $HOSTADDRESS$ -e -i -p -a -o 3 --htmlinfo de

The effect will be the same. You probably defined the command a while
ago, and there have been some changes to options since then.

Cheers,
-- 
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
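
The kind of robustness described in the message above (treating values that are absent from the SNMP output as "Unknown" instead of tripping over them) can be illustrated with a short sketch. This is not the plugin's actual code, which is written in Perl and is not shown in the thread. The function name, the dictionary keys and the choice of CRITICAL as the alert level are assumptions made for this example; the blacklist parameter mirrors the remark that the drive can always be blacklisted.

```python
def check_pdisk(entry, blacklist=frozenset()):
    """Return (level, message) for one physical disk, or None if skipped.

    `entry` is a dict built from SNMP output. The 'status' and 'state'
    keys may be missing or None when the agent does not report them.
    """
    disk_id = entry.get("id", "?")
    if disk_id in blacklist:
        return None  # blacklisted components are not checked at all

    # Fall back to "Unknown" rather than comparing or formatting a
    # missing value, which is what triggers the warnings in Perl.
    status = entry.get("status") or "Unknown"
    state = entry.get("state") or "Unknown"

    if status == "Ok":
        return ("OK", f"Physical Disk {disk_id} is {state}")
    return ("CRITICAL", f"Physical Disk {disk_id} needs attention: {state}")
```

With this shape, a disk whose status is missing still produces an alert, and the only way to silence it is an explicit blacklist entry.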
Re: [Nagios-users] Problem with check_openmanage 3.5.6
Nicole Hähnel writes:

> Hi
>
> I get this message on one pe830 (OM 6.1.0) :
>
> CRITICAL: [ xxx] Physical Disk 0:0 [Wdc WD1600JS-55MHB0, 160GB] on ctrl 0 needs attention:
> -- SYSTEM: PowerEdge 830, SN: xxx
> INTERNAL ERROR: Use of uninitialized value in string eq at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1428.
> INTERNAL ERROR: Use of uninitialized value in sprintf at /usr/lib64/nagios/plugins/grontmij/check_openmanage line 1441.
>
> Is this a problem of check_openmanage or the disk?
> It's a non dell sata disk.

Hi Nicole,

Can you provide the output of the following command, executed on the
monitored host:

  omreport storage pdisk controller=0

Also, are you using check_openmanage in SNMP or local context?

Cheers,
-- 
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage and net-snmp v3
Hi all,

Just to bring this thread to a conclusion...

I have released a new version of check_openmanage that adds a new option
'--use-get_table', which is to be used as a workaround for issues with
SNMPv3 on Windows using net-snmp. There are a few other minor fixes and
feature enhancements as well. Downloads and changelog:

  http://folk.uio.no/trondham/software/check_openmanage.html#download

(Also available on Nagios Exchange and Monitoring Exchange.)

Cheers,
-- 
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
Re: [Nagios-users] check_openmanage and net-snmp v3
"Verhaeghe, Koen" writes:

> The script is working, at least, it does not give any errors anymore.
> I even get "Physical Disk 0:1 [Ata WDC WD800JD-75MSA3, 0GB] on ctrl 0
> needs attention: Failure Predicted" as expected. I was expecting also an
> errormessage from the Virtual disks, as they are degraded, but that's
> not there.

If the error is just "Failure Predicted", it means that the disk is
working fine for the time being and the virtual drive status is not
affected. When/if the drive eventually fails the virtual drive will be
degraded.

> Moreover, I know some of our servers have problems with power supplies
> or memory, so I changed a section in the below mentioned script like you
> did for the disks and others, just to test:
>
>     #my $result = $snmp_session->get_entries(-columns => [keys %ps_oid]);
>
>     ##
>     # SNMPv3 test
>     ##
>     my $result = q{};
>     if ($opt{protocol} == 3) {
>         my $powerDeviceTable = '1.3.6.1.4.1.674.10892.1.600.12.1';
>         $result = $snmp_session->get_table(-baseoid => $powerDeviceTable);
>     }
>     else {
>         $result = $snmp_session->get_entries(-columns => [keys %ps_oid]);
>     }
>     ##
>     ##
>
> And now I do get the expected error:
> "Power Supply 1 [AC] needs attention: Presence detected, Failure
> detected, AC lost"
>
> I think it is safe to say that, when using net-snmp v3, the get_entries
> method is not giving the expected result.

The complete picture is still a little unclear to me. Do these problems
occur only when you use net-snmp instead of Windows' native snmp agent?
(I'm assuming that "net-snmp" refers to
http://freshmeat.net/projects/net-snmp). I would be interested in any
test results you might have using the native Windows snmp agent with
SNMPv3.

Cheers,
-- 
Trond H. Amundsen
Center for Information Technology Services, University of Oslo
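
One practical difference behind the workaround quoted above: a table walk with get_table returns every OID under the base OID, whereas get_entries returns only the requested columns. Code that switches to the table walk may therefore want to filter the result back down to the columns it cares about. The sketch below shows that filtering step on a plain dictionary, without any SNMP library. The function name and the column numbers used in the usage example (.5, .7) are made up for illustration; only the base OID comes from the quoted snippet.

```python
def select_columns(table, columns):
    """Keep only the entries of `table` that belong to one of `columns`.

    `table` maps full OID strings to values, as a table walk would
    return them. `columns` is an iterable of column OIDs.
    """
    # A trailing dot prevents column ".5" from also matching ".50".
    prefixes = tuple(col.rstrip(".") + "." for col in columns)
    return {oid: val for oid, val in table.items()
            if oid.startswith(prefixes)}
```

For example, with `base = "1.3.6.1.4.1.674.10892.1.600.12.1"` and a walk result containing `base + ".5.1.1"` and `base + ".7.1.1"`, calling `select_columns(walk, [base + ".5"])` keeps only the first entry.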