Re: [Nagios-users] NSCA and long output
Hi Aaron,

You wrote:
> I've been trying to figure out if this is possible for a while. I'm
> using NRPE and $LONGHOSTOUTPUT$ for a number of tests, which is great,
> except for passive monitoring. We have several data centers that run
> their own Nagios boxes and then ship the data back to the master
> Nagios server via NSCA. The problem is that I can't get NSCA to utilize
> the $LONGHOSTOUTPUT$ - this is kind of critical for things like log file
> checks, etc. With NSCA this data doesn't get passed.

Looking at the NSCA sources, common.h has:

#define MAX_PLUGINOUTPUT_LENGTH 512

I'm guessing that's the issue right there. The first thing I'd try is to bump that up to 4096 and recompile send_nsca and nsca. I haven't looked very carefully at the source or tried this myself, but it seems like a good place to start.

Mike

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] Antwort: Re: Fwd: plugin for iostat readings?
On Mon, 25 Feb 2008 [EMAIL PROTECTED] wrote:
> You need to call iostat with multiple checks - the longer the better -
> but then it means you would have to run iostat for like 30 seconds or
> so -> plugin runtime is 30 seconds too then! That means that check
> would have a high delay/latency, which is overall a bad idea. My
> solution so far is to run the plugin via cron and report output via
> nsca, this gave me the best results.

One thing you might want to consider is using sadc/sar for this job. The sadc(8) program collects all kinds of stats on OS resource usage, and if you explicitly ask it to, it will capture some interesting disk i/o stats, including read/write requests per second.

One statistic that would be handy to alert on is the i/o wait figure available from sar(1). The man page defines it as:

"Percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request."

So basically the plugin could just read the most recent data from sadc by using sar and alert if the i/o wait figure reached a certain threshold. You would want to adjust the sadc cron entry to run about as often as your check interval.

Mike
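The sar-based check described in this message could look roughly like the following. This is only a sketch: it assumes the text comes from something like `sar -u`, that the header row contains a `%iowait` column, and that the last non-"Average" row is the most recent sample. Column layout varies between sysstat versions and locales, and the function names and thresholds here are illustrative, not part of any existing plugin.

```python
# Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def latest_iowait(sar_text):
    """Return %iowait from the most recent sample row, or None if not found."""
    column = None
    value = None
    for line in sar_text.splitlines():
        fields = line.split()
        if "%iowait" in fields:
            # Header row: remember where the %iowait column sits.
            column = fields.index("%iowait")
            continue
        if column is None or not fields or fields[0].startswith("Average"):
            continue
        if len(fields) > column:
            try:
                value = float(fields[column])
            except ValueError:
                pass  # not a data row
    return value


def check_iowait(sar_text, warn, crit):
    """Map the latest %iowait sample to a (exit_code, message) pair."""
    value = latest_iowait(sar_text)
    if value is None:
        return UNKNOWN, "UNKNOWN - no %iowait sample found"
    if value >= crit:
        return CRITICAL, "CRITICAL - iowait %.2f%%" % value
    if value >= warn:
        return WARNING, "WARNING - iowait %.2f%%" % value
    return OK, "OK - iowait %.2f%%" % value


# Hypothetical sar output used only to exercise the parser.
SAMPLE = """\
Linux 2.6.18 (host)  02/25/2008

12:00:01 AM  CPU  %user  %nice  %system  %iowait  %steal  %idle
12:10:01 AM  all  1.20  0.00  0.40  3.10  0.00  95.30
12:20:01 AM  all  2.00  0.00  0.60  27.50  0.00  69.90
Average:     all  1.60  0.00  0.50  15.30  0.00  82.60
"""
```

A real plugin would obtain the text by running sar, print the message, and exit with the returned code. Because sar only reads what sadc has already collected, the check itself returns quickly.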
Re: [Nagios-users] State Stalking and notifications
On Feb 20, 2008, at 7:46 AM, Frost, Mark {PBG} wrote:
> I had thought about writing a custom check for each line
> of output that this command generates, but that seems needlessly
> painful.

You could write one active check that parses the output, figures out what's gone wrong, and then submits passive results for the specific services that have errors.

Mike
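A minimal sketch of that idea follows. The original command's output format is not shown in the thread, so this assumes a hypothetical "service name: status text" line format and treats anything not starting with "OK" as an error. The PROCESS_SERVICE_CHECK_RESULT external command is standard Nagios; the function name and parsing rules are made up for illustration.

```python
import time


def passive_result_lines(host, command_output, now=None):
    """Turn 'service: status text' lines into Nagios external command lines.

    The input format is an assumption; adapt the parsing to the real command.
    """
    now = int(time.time()) if now is None else int(now)
    lines = []
    for raw in command_output.splitlines():
        if ":" not in raw:
            continue  # skip lines that do not look like 'service: status'
        service, _, detail = raw.partition(":")
        detail = detail.strip()
        # 0 = OK, 2 = CRITICAL in Nagios service return codes.
        code = 0 if detail.upper().startswith("OK") else 2
        lines.append(
            "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s"
            % (now, host, service.strip(), code, detail)
        )
    return lines
```

Each returned line would be written, followed by a newline, to the Nagios command pipe. This sketch emits a result for every parsed line rather than only the failing ones, so that a service which has recovered is also reported as OK.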
[Nagios-users] Announce: Monocle Oracle Monitoring Package
Hello,

Blue Gecko Inc. is proud to announce the first release of Monocle, our open source (GPLv2) Oracle 10g database monitoring package!

Monocle mostly consists of a body of PL/SQL code that runs inside the database as scheduled Oracle jobs. When events occur inside the database that are significant enough to alert on, the PL/SQL monitoring code notes the problems in a monitoring events table. The check_monocle script, which is used as a connector between Monocle and Nagios, reads the data from the events table and reports the information to Nagios using the command pipe.

Requirements:
* Oracle 10g or greater
* Nagios 2.x
* Oracle Instant Client (sqlplus)

Features:
* Alert Log Monitor - Scans the alert log and reports on exceptions
* Backup Monitor - Monitors the health of Oracle RMAN backups
* Job Monitor - Monitors Oracle DBA and Scheduler jobs
* Lock Monitor - Monitors and records info about blocking locks
* Resource Monitor - Monitors resource consumption (cursors/processes)
* Space Monitor - Monitors tablespace/archive space consumption
* Standby Monitor - Monitors physical/logical standby databases

To find out more about Monocle, visit our open source development site at:
http://code.bluegecko.net

Direct download link:
http://files.bluegecko.net/code/Monocle-1.0.tar.gz

About Blue Gecko:
What Blue Gecko does is simple: We provide database administration support services for Oracle and MySQL. The Blue Gecko team proactively monitors, administers, and tunes Oracle and MySQL Server database environments, either on our own Database Hosting Services platform, or at our Client's site with Remote DBA Services.

To find out more about Blue Gecko, visit our services site at:
http://www.bluegecko.net
Re: [Nagios-users] linux kernel instrumentation + Nagios?
Roger wrote:
> I'm looking for tools that will give Nagios some visibility inside the
> Linux kernel.

What are you trying to learn from the kernel?

I think it'd be handy to have a monitor that would alert if a process started doing more than a certain amount of block i/o operations. Or perhaps a monitor that alerted on disk i/o bandwidth utilization (iostat -x %util) could use SystemTap to show you which processes would be likely culprits.

Mike
Re: [Nagios-users] Passive host results & soft states?
Marco wrote:
> What I did is to send the passive host check through NSCA only if its
> in hard state, soft states are ignored, what script do you use to call
> send_nsca ?

Just a simple script that pipes "$HOST\t$RESULT\t$OUTPUT\n" into send_nsca. I'll also need to pass in $HOSTSTATETYPE$ and exit if I don't see a HARD state.

Thanks for the advice, this is a much nicer way of dealing with this problem.

Mike
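The script described in this message might be sketched as below. It assumes Nagios passes the host name, return code, plugin output and $HOSTSTATETYPE$ to the script as arguments. The function names and the default config path are illustrative assumptions; `-H` and `-c` are the usual send_nsca options for the target server and config file.

```python
import subprocess


def nsca_host_line(host, return_code, output, state_type):
    """Build the tab-separated host result for send_nsca, or None for SOFT states."""
    if state_type != "HARD":
        return None  # ignore SOFT states so the master never sees them
    # Collapse tabs/newlines so the output cannot break the field layout.
    flat = " ".join(output.split())
    return "%s\t%d\t%s\n" % (host, return_code, flat)


def forward(line, server, config="/etc/nagios/send_nsca.cfg"):
    """Pipe one result line into send_nsca (config path is an assumption)."""
    subprocess.run(
        ["send_nsca", "-H", server, "-c", config],
        input=line,
        universal_newlines=True,
        check=True,
    )
```

A wrapper would call `nsca_host_line(...)` with the macro values and only invoke `forward(...)` when the result is not `None`.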
Re: [Nagios-users] Passive host results & soft states?
Thanks Marc,

I found this answer from Ethan Galstad in the thread you posted:
> Nagios 2 doesn't support a max_attempts directive for hosts and all
> passive host check results will immediately force the host into a HARD
> state. This has changed a bit in Nagios 3 - hosts do have a
> max_attempts directive, but passive results still put the host into a
> HARD state.

I think the solution for me is to change the check-host-alive command to send more pings; that way one dropped packet on a remotely monitored box won't cause me to get woken up at some ungodly hour. The command_line I had for check-host-alive (not sure where I got it) seems somewhat silly to me:

Original:
$USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1

My Change:
$USER1$/check_ping -H $HOSTADDRESS$ -w 1000.0,40% -c 5000.0,100% -p 5

Mike
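To see why the packet count matters, here is a simplified model of how the two threshold sets react to a single dropped packet. It assumes a state is raised when the loss percentage or round-trip average reaches the configured value; this is a sketch of the arithmetic, not the check_ping source.

```python
def ping_state(packets_sent, packets_lost, rta_ms, warn, crit):
    """Evaluate (rta_ms, loss_percent) thresholds like -w 1000.0,40% -c 5000.0,100%."""
    loss = 100.0 * packets_lost / packets_sent
    if loss >= crit[1] or rta_ms >= crit[0]:
        return "CRITICAL"
    if loss >= warn[1] or rta_ms >= warn[0]:
        return "WARNING"
    return "OK"


# Original command: one packet, so a single drop is 100% loss.
original = ping_state(1, 1, 0.0, warn=(3000.0, 80.0), crit=(5000.0, 100.0))

# Changed command: five packets, so a single drop is only 20% loss.
changed = ping_state(5, 1, 20.0, warn=(1000.0, 40.0), crit=(5000.0, 100.0))
```

With `-p 1` the only possible loss values are 0% and 100%, so the 80% warning threshold can never trigger on its own. With `-p 5`, one lost packet stays below the 40% warning level and two lost packets reach it.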
[Nagios-users] Passive host results & soft states?
Howdy,

Host A is a server which sends passive host/service results to host B via NSCA. When a single host check fails on host C (a machine monitored by host A), host A considers host C to be in a SOFT state, whereas host B (the one that actually sends notifications) considers host C to be in a HARD state. This causes me a lot of problems because often just one host check will fail, yet I still get a notification.

Here are some log entries that illustrate this:

Log entry from host A:
Wed Mar 21 02:29:02 2007 HOST ALERT: prod-mysql-1a;DOWN;SOFT;1;CRITICAL - Host Unreachable (prod-mysql-1a)

Log entry from host B:
Wed Mar 21 02:29:04 2007 EXTERNAL COMMAND: PROCESS_HOST_CHECK_RESULT;prod-mysql-1a;1;CRITICAL - Host Unreachable (prod-mysql-1a)
Wed Mar 21 02:29:04 2007 HOST ALERT: prod-mysql-1a;DOWN;HARD;1;CRITICAL - Host Unreachable (prod-mysql-1a)

Host A defines its hosts using this template:

define host {
    name                        server
    check_command               check-host-alive
    failure_prediction_enabled  1
    max_check_attempts          4
    notification_period         24x7
    freshness_threshold         250s
    contact_groups              admins
    notification_period         24x7
    notification_interval       0
    notification_options        d,u,r
    check_interval              4
    register                    0
}

Host B defines its hosts using this template:

define host {
    name                        remote-host-template
    active_checks_enabled       0
    check_period                24x7
    max_check_attempts          4
    notification_period         24x7
    notification_interval       5
    notification_options        d,r
    failure_prediction_enabled  1
    check_command               service_is_stale
    register                    0
}

Thanks for any help!

Mike
Re: [Nagios-users] Snmptrap with Nagios
> I have to monitor a "thing" that works with snmptraps, but I don`t know
> what I have to do.

You need to have a machine that listens for SNMP traps. The snmptrapd program does this; it comes with the net-snmp package. This daemon writes the trap info to the system log, or alternatively runs a program and passes it the trap information. That program can then submit a check to nagios using NSCA or by writing to the nagios.cmd pipe.

I found this article to be very helpful:
http://www.samag.com/documents/s=9559/sam0503g/

Mike
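A trap handler along those lines might be sketched as follows. It assumes the handler is run by snmptrapd's traphandle mechanism, which supplies the sender's host name on the first line of stdin, its address on the second, and one "OID value" pair per following line. The service name "SNMP Traps" and the choice to treat every trap as CRITICAL are illustrative assumptions.

```python
import time


def parse_trap(stdin_text):
    """Split traphandle input into (hostname, address, [(oid, value), ...])."""
    lines = [line.strip() for line in stdin_text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected at least hostname and address lines")
    varbinds = []
    for line in lines[2:]:
        oid, _, value = line.partition(" ")
        varbinds.append((oid, value.strip()))
    return lines[0], lines[1], varbinds


def trap_to_command(stdin_text, service="SNMP Traps", now=None):
    """Format a trap as a passive service result for the nagios.cmd pipe."""
    now = int(time.time()) if now is None else int(now)
    host, _address, varbinds = parse_trap(stdin_text)
    # Semicolons separate fields in external commands, so keep them out of the text.
    summary = ", ".join("%s=%s" % pair for pair in varbinds).replace(";", ",")
    # Assumption: every received trap is reported as CRITICAL (2).
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;2;%s\n" % (
        now, host, service, summary)
```

The handler would read `sys.stdin.read()`, pass it to `trap_to_command`, and append the result to the command pipe (or hand the equivalent fields to send_nsca). The passive service must already be defined in Nagios for the host name that snmptrapd reports.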
Re: [Nagios-users] Which tool is best for me: Nagios, OpenNMS, or something else?
I can't speak for OpenNMS, but I think for Nagios the answer for a lot of your questions is going to be: "There isn't a way of doing this with the standard nagios plugin package, but someone has probably written a plugin that does this, check the Nagios Exchange site."

> % Confirm each machine is up/pingable/reachable [obviously!]

Obviously.

> % nmap each machine to make sure correct ports (varies by machine) and
> no others are open

This isn't a standard nagios plugin, however somebody has a plugin that does this. A quick google search found:
http://ubermonkey.wordpress.com/2006/09/28/nagios-nmap-plugin/

> % Not all tests all the time: some tests should run less frequently
> (reduce the load);

You can define the check_interval on a service by service basis.

> % For machines running httpd, download several pages, diff to last
> copies of these pages, report "big" differences...

I'm guessing you'll have to code this plugin yourself.

> % For machines running sendmail, send a test email to one of the other
> machines running sendmail, which then confirms receipt; alert if not
> received. Also do other mail routing/delivery tests.

This is becoming a frequently asked question on this list. Various people have written plugins to do this, but it's been my experience that most people who need this end up writing their own.

> % For machines running popd/imapd, simulate login to confirm
> authentication is working (popd/imapd auth isn't always local for us)

See default answer. A quick google search found this page, which confirms authentication on pop/imap:
http://www.jhweiss.de/software/nagios.html

> % Monitor files in /etc (eg, passwd, shadow, crontab) for changes.

You could do this with tripwire and then write a plugin that reads the snmp trap, or trap logfile.

> % Ideally, the "something bad has happened" reporting can be
> configured-- it may be OK for "mailq -v" to be large for 10-15
> minutes, but not for 30 minutes (for example).

You can do this with nagios. You can check every five minutes and not go to a hard failure state until the check has failed six times.

Mike
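For the httpd item in this message (diff a page against its last copy and report "big" differences), the comparison at the heart of a home-grown plugin could be sketched like this. It uses Python's standard difflib to score line-level similarity. The function name and the 0.9 and 0.5 thresholds are arbitrary assumptions, and fetching and storing the pages is left out.

```python
import difflib


def page_change_state(previous, current, warn_ratio=0.9, crit_ratio=0.5):
    """Compare two page bodies line by line and map similarity to a Nagios code.

    Returns (exit_code, similarity), where similarity is 1.0 for identical
    pages and 0.0 when no lines match.
    """
    matcher = difflib.SequenceMatcher(
        None, previous.splitlines(), current.splitlines())
    similarity = matcher.ratio()
    if similarity < crit_ratio:
        return 2, similarity  # CRITICAL: page is mostly different
    if similarity < warn_ratio:
        return 1, similarity  # WARNING: noticeable change
    return 0, similarity      # OK
```

Pages with content that legitimately changes on every request (timestamps, session tokens) would need those lines filtered out before comparison, otherwise the check will flap.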
Re: [Nagios-users] Monitoring disk bandwidth utilization?
Hugo van der Kooij wrote:
> I'm puzzled by this term of 'disk bandwitdh'. I am not quit sure we are on
> the same wavelenght here. But I could imagine digging up the absolute
> counters and using rrd to build the usual graphs out of them.

Sorry for not making this clear. The iostat and sar utilities (part of the RH sysstat package) will show you a statistic called %util for a given block device, which the manpage defines as:

"Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%."

> The issue I think is getting very frequent measurements and normalizing
> them in some sort so you can obtain average and maximum values out of
> them.

By default the sar utility will show you 10 minute averages, and iostat can show you current utilizations.

Mike
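On the point about absolute counters: on Linux, a utilization figure of this kind can be derived from two readings of /proc/diskstats. The sketch below assumes the usual layout in which the 13th whitespace-separated column is the cumulative number of milliseconds the device has spent doing I/O. The function names are illustrative, and sampling and state storage are left to the caller.

```python
def io_ticks_ms(diskstats_text, device):
    """Return the cumulative 'ms spent doing I/O' counter for one device."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Columns: major, minor, name, then the per-device statistics.
        if len(fields) >= 13 and fields[2] == device:
            return int(fields[12])
    raise KeyError(device)


def percent_util(before_ms, after_ms, elapsed_s):
    """Busy time between two samples as a percentage of wall-clock time."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    busy = 100.0 * (after_ms - before_ms) / (elapsed_s * 1000.0)
    return min(100.0, busy)


# Hypothetical /proc/diskstats lines taken ten seconds apart.
BEFORE = "   8       0 sda 1000 50 80000 4000 2000 100 160000 9000 0 6000 13000"
AFTER = "   8       0 sda 1200 60 90000 4800 2600 120 200000 11500 0 9000 17000"
```

A plugin run from cron could store the previous counter and timestamp in a small state file, which gives an average over the whole check interval without keeping a check running for 30 seconds.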
[Nagios-users] Monitoring disk bandwidth utilization?
Hi!

I've been unable to find a nagios plugin that monitors disk bandwidth utilization. Does anybody know of one? It seems like it would be relatively straightforward to wrap a nagios plugin around a utility like iostat or sar, but I thought I'd ask if anyone had done this before I dive in.

Thanks!

mikeh
[Nagios-users] Separate notification_interval for warnings?
Howdy,

I have nagios set up to send notifications every five minutes. This makes sense when a service is CRITICAL, but makes less sense when it is simply WARNING. Warnings go to a separate email alias... every five minutes. Normally during the day I acknowledge them, but during the evening they can generate quite a lot of spam.

I couldn't figure out a good way to solve this problem, so I ended up adding a new variable to nagios 2.6 called warn_notification_interval which only gets applied to services/hosts in the WARNING state.

My question is, is this useful or is there an easier way to solve this problem I just couldn't think of? I'd be willing to create a patch for my changes if anyone is interested.

Mike
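The patch itself is not shown in this message, so the following is only a toy model of the behaviour described: a separate re-notification interval that applies while an object is in the WARNING state. The function names are hypothetical, and treating an interval of 0 as "notify once, never repeat" follows the usual meaning of notification_interval.

```python
def effective_interval(state, notification_interval, warn_notification_interval=None):
    """Pick the re-notification interval (in minutes) for the current state."""
    if state == "WARNING" and warn_notification_interval is not None:
        return warn_notification_interval
    return notification_interval


def should_renotify(state, minutes_since_last, notification_interval,
                    warn_notification_interval=None):
    """True when enough time has passed to send another notification."""
    interval = effective_interval(
        state, notification_interval, warn_notification_interval)
    if interval == 0:
        return False  # 0 means no repeat notifications
    return minutes_since_last >= interval
```

With `notification_interval=5` and `warn_notification_interval=60`, a CRITICAL service would re-notify every five minutes while a WARNING service would re-notify hourly.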