[Nagios-users] Problem with avail.cgi
Hello,

I have an issue with avail.cgi and wanted to know if anyone else has encountered this behaviour.

I ask Nagios to produce a report for a service covering the last 7 days, but the log entries table in the output contains entries that are more than a week old. For example, the report header says:

01-08-2010 00:00:00 to 08-08-2010 00:00:00
Duration: 7d 0h 0m 0s
[ Availability report completed in 0 min 13 sec ]

but the summary table starts on 28-07-2010:

Service Log Entries: [ View full log entries ]

http://10.0.6.149/nagios/cgi-bin/avail.cgi?host=gbc1-ms-06&service=dispacth.lovefilm.com+check&t1=1280617200&t2=1281222000&backtrack=4&assumestateretention=yes&assumeinitialstates=yes&assumestatesduringnotrunning=yes&initialassumedhoststate=0&initialassumedservicestate=0&show_log_entries&full_log_entries&showscheduleddowntime=yes

Event Start Time     Event End Time       Event Duration    Event/State Type   Event/State Information
28-07-2010 00:00:00  28-07-2010 09:45:29  0d 9h 45m 29s     SERVICE OK (HARD)  HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.020 second response time
29-07-2010 00:00:00  29-07-2010 10:20:56  0d 10h 20m 56s    SERVICE OK (HARD)  HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.022 second response time
30-07-2010 00:00:00  31-07-2010 00:00:00  1d 0h 0m 0s       SERVICE OK (HARD)  HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.016 second response time
31-07-2010 00:00:00  01-08-2010 00:00:00  1d 0h 0m 0s       SERVICE OK (HARD)  HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.030 second response time
01-08-2010 00:00:00  02-08-2010 00:00:00  1d 0h 0m 0s       SERVICE OK (HARD)  HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.021 second response time
02-08-2010 00:00:00  02-08-2010 14:38:47  0d 14h 38m 47s    SERVICE OK (HARD)  HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.027 second response time
03-08-2010 00:00:00  03-08-2010 11:51:23  0d 11h 51m 23s    SERVICE OK (HARD)  HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.021 second response time
04-08-2010 00:00:00  04-08-2010 08:41:24  0d 8h 41m 24s     SERVICE OK (HARD)  HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.031 second response time

As you can see, the report does not cover the period I asked for (and that it claims to cover), and that skews my statistics.

Has anyone encountered this before? Can you recommend a way to fix it?

I am using Nagios 3.2.0, built from source.

Thanks

--
Never, Ever Cut A Deal With a Dragon

Next year I will be doing the London to Paris bike ride to raise money for the DogTrust (www.dogtrust.co.uk).
Please sponsor me at http://www.justgiving.com/Assaf-Flatto

--
This SF.net email is sponsored by
Make an app they can't live without
Enter the BlackBerry Developer Challenge
http://p.sf.net/sfu/RIM-dev2dev
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
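One quick sanity check on a report like this is to decode the `t1` and `t2` values from the avail.cgi URL and confirm the requested window. The sketch below does that in Python. The two timestamps are copied from the URL in the post; the function name is made up for illustration, and the remark about the server's timezone is an assumption.

```python
from datetime import datetime, timezone

# t1 and t2 as they appear in the avail.cgi URL quoted in the post.
T1 = 1280617200
T2 = 1281222000


def describe_window(t1: int, t2: int) -> dict:
    """Return start, end (both UTC) and length in days of an epoch-second window."""
    start = datetime.fromtimestamp(t1, tz=timezone.utc)
    end = datetime.fromtimestamp(t2, tz=timezone.utc)
    return {
        "start_utc": start.isoformat(),
        "end_utc": end.isoformat(),
        "days": (t2 - t1) / 86400,
    }


window = describe_window(T1, T2)
print(window)
```

The window is exactly seven days long. It starts at 23:00 UTC on 31 July, which would be midnight on 1 August for a server running one hour ahead of UTC (an assumption about the poster's setup). So the request itself covers the right period, and the older rows must come from something other than `t1` and `t2`.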
Re: [Nagios-users] Problem with avail.cgi
On Aug 9, 2010, at 4:17 AM, Assaf Flatto wrote:

> As you can see the report does not provide the details I asked for and it claims to provide, and that subverts my statistics.
> Has anyone encountered this before? Can you recommend a way to fix this?

I suspect this is because you've specified 4 backtracked archives when running the report (4 days if you have daily log rotation set up). Try specifying 0.

--
Marc
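To try the suggestion without clicking through the form, the report URL can be rebuilt with `backtrack=0`. This is only a sketch: the parameter names and values are copied from the URL in the original post, the helper name `avail_url` is invented here, and the two value-less flags from the original URL (`show_log_entries`, `full_log_entries`) are left out for simplicity.

```python
from urllib.parse import urlencode

# Base URL taken from the original post.
BASE_URL = "http://10.0.6.149/nagios/cgi-bin/avail.cgi"


def avail_url(host: str, service: str, t1: int, t2: int, backtrack: int = 0) -> str:
    """Build an avail.cgi URL; backtrack is the number of archived logs to scan."""
    params = {
        "host": host,
        "service": service,
        "t1": t1,
        "t2": t2,
        "backtrack": backtrack,
        "assumestateretention": "yes",
        "assumeinitialstates": "yes",
        "assumestatesduringnotrunning": "yes",
        "initialassumedhoststate": 0,
        "initialassumedservicestate": 0,
        "showscheduleddowntime": "yes",
    }
    return BASE_URL + "?" + urlencode(params)


url = avail_url("gbc1-ms-06", "dispacth.lovefilm.com check", 1280617200, 1281222000)
print(url)
```

`urlencode` writes the space in the service description as `+`, which matches the form of the URL in the post.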
Re: [Nagios-users] Problem with avail.cgi
Marc Powell wrote:

> I suspect this is because you've specified 4 backtracked archives when running the report (4 days if you have daily log rotation set up). Try specifying 0.
>
> --
> Marc

Marc,

I did as you suggested and indeed the "extra" days no longer appear in the report. However, I am seeing another odd occurrence: the report summary gives one figure, and when I select that service in the report I get a different one.

Servicegroup 'Weekly-report' Service State Breakdowns:

Host  Service         % Time OK          % Time Warning   % Time Unknown   % Time Critical  % Time Undetermined
      dispacth.check  99.797% (99.797%)  0.000% (0.000%)  0.000% (0.000%)  0.203% (0.203%)  0.000%

Then, selecting it in the report:

Service 'dispacth. check' On Host ''
02-08-2010 13:19:12 to 09-08-2010 13:19:12
Duration: 7d 0h 0m 0s
[ Availability report completed in 0 min 9 sec ]

Service State Breakdowns:

State  Type / Reason  Time            % Total Time  % Known Time
OK     Unscheduled    6d 23h 54m 50s  99.949%       99.949%
       Scheduled      0d 0h 0m 0s     0.000%        0.000%
       Total          6d 23h 54m 50s  99.949%       99.949%

Can you assist with this?

Thanks

--
Never, Ever Cut A Deal With a Dragon

Next year I will be doing the London to Paris bike ride to raise money for the DogTrust (www.dogtrust.co.uk).
Please sponsor me at http://www.justgiving.com/Assaf-Flatto
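It may help to turn both figures into seconds so the size of the disagreement is visible. The sketch below is plain arithmetic on the numbers quoted in the message; the helper name is invented, and it does not attempt to explain why the two views differ.

```python
import re


def duration_seconds(text: str) -> int:
    """Convert a Nagios-style duration such as '6d 23h 54m 50s' to seconds."""
    match = re.fullmatch(r"(\d+)d (\d+)h (\d+)m (\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    days, hours, minutes, seconds = map(int, match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


period = duration_seconds("7d 0h 0m 0s")          # report length in both views
ok_detail = duration_seconds("6d 23h 54m 50s")    # OK time in the per-service view

pct_detail = 100.0 * ok_detail / period           # recompute the per-service percentage
not_ok_detail = period - ok_detail                # non-OK seconds, per-service view
not_ok_summary = round(period * 0.203 / 100)      # non-OK seconds implied by 0.203% critical

print(f"{pct_detail:.3f}% OK, {not_ok_detail}s vs {not_ok_summary}s not OK")
```

The per-service view implies roughly 5 minutes of non-OK time, while the servicegroup summary implies roughly 20 minutes over a 7-day period. This assumes the summary also covers a 7-day window, which the message does not state; if the two reports were run over different windows, that alone could account for the gap.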
Re: [Nagios-users] Passive freshness checks - active checks
On 6 August 2010 17:02, Charlie Reddington charlie.redding...@gmail.com wrote:

> Hi All,
>
> I'm having a bit of a problem with my Nagios setup. I'm trying to move toward passive checks, with the failover being an active check. For now, my failover check command is just a one-liner that returns critical with a message. It looks like the active check is being run even when I see the corresponding passive check coming in. I suspect it may be in my configs somewhere, but I'm not sure what is wrong yet.
>
> The big kicker is that it's not all of my checks, only some of them. They all have different freshness thresholds, but that doesn't seem to be the common factor. Their configs are the same, but in a different order, and that doesn't seem like the problem either, as it's affecting some of one and not of the other.
>
> Any thoughts on what I may be doing wrong?
>
> Charlie

I can't see any problem with the config below.

If you have dozens of checks set up this way and they are all set up in crontab to run at */15, then you will get a storm of checks at each 15-minute interval. I normally make sure I stagger the checks in cron so that they are reasonably evenly spaced. If you have thousands, it might also be worth introducing a small random sleep to spread them out even more.

I've not had any problems with it myself, but if you have a very busy system, you might need to check that the command buffers aren't filling (run /usr/local/nagios/bin/nagiostats to list the current Nagios statistics).

Check the logs from nsca too. If I recall correctly, you may need to set debug=1 in nsca.cfg for a while to get enough information.

One problem I sometimes see occurs when the clock on the sending server is way out of sync with the clock on the Nagios server: nsca will complain and not process the check. See this section in the nsca.cfg file:

    # MAX PACKET AGE OPTION
    # This option is used by the nsca daemon to determine when client
    # data is too old to be valid. Keeping this value as small as
    # possible is recommended, as it helps prevent the possibility of
    # replay attacks. This value needs to be at least as long as
    # the time it takes your clients to send their data to the server.
    # Values are in seconds. The max packet age cannot exceed 15
    # minutes (900 seconds). If this variable is set to zero (0), no
    # packets will be rejected based on their age.

    max_packet_age=30

If I recall, I increased this from some smaller value to make it more forgiving of systems which are a bit out of sync.

I hope that's pointed you in the right direction.

Cheers,
Jim

> Nagios Version: 3.2.0
>
> I have a service template definition that looks like this.
>
> define service{
>     name                            passive-service
>     check_freshness                 1
>     active_checks_enabled           0
>     passive_checks_enabled          1
>     parallelize_check               1
>     obsess_over_service             0
>     notifications_enabled           0
>     event_handler_enabled           0
>     flap_detection_enabled          0
>     failure_prediction_enabled      0
>     process_perf_data               1
>     retain_status_information       1
>     retain_nonstatus_information    1
>     is_volatile                     0
>     check_period                    24x7
>     max_check_attempts              1
>     contact_groups                  admins
>     notification_options            w,c,r
>     notification_interval           60
>     notification_period             24x7
>     register                        0
> }
>
> And then I have services defined like so.
>
> # Free Memory Check
> define service{
>     use                     passive-service
>     service_description     Passive Memory Check
>     check_command           check_stale
>     hostgroups              passive
>     freshness_threshold     3600
> }
>
> My active checks are defined with:
>
> # alert on stale
> define command{
>     command_name    check_stale
>     command_line    $USER1$/check_dummy 2 "Check is stale, please run manually"
> }
>
> On my host, I use cron jobs to run things like this. I use nsca_wrapper to send my check results to the central Nagios server.
>
> # Check Free Memory
> */15 * * * * root /usr/local/nagios/libexec/nsca_wrapper.sh -H server.name -S 'Passive Memory Check' -C '/usr/local/nagios/libexec/check_memory -w 10 -c 5' > /dev/null
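The staggering advice above can be sketched as a small helper that gives each host/service pair its own delay inside the 15-minute window. Note the swap: this uses a stable hash rather than the random sleep Jim mentions, so a given check lands in the same slot on every run. The function name and the idea of sleeping for the returned offset before calling the wrapper are assumptions for illustration, not part of the posted setup.

```python
import hashlib


def stagger_offset(host: str, service: str, interval_s: int = 900) -> int:
    """Return a stable delay in [0, interval_s) for one host/service pair."""
    key = f"{host}/{service}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % interval_s


offset = stagger_offset("server.name", "Passive Memory Check")
print(f"sleep {offset}s before submitting the passive result")
```

With a `freshness_threshold` of 3600 seconds and a 15-minute cron interval, a delay of up to 900 seconds still leaves room for results to arrive well inside the threshold.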
Re: [Nagios-users] Calculations of RRD data
On 8 August 2010 19:47, Stephen H. Dawson serv...@shdawson.com wrote:

> Hi,
>
> Not sure if this is the correct place to ask this question, but starting here.
>
> We use Nagios for lots of monitoring, and store that data in the RRD database. We graph that data. Life is good.
>
> We have some odd thoughts about what-if scenarios, where we need to further review the data in the RRD database: simple arithmetic calculations of minus, division, and then some averaging of some of those minus and division outputs.
>
> We really do not want to put the monitored data into a SQL database.
>
> How (hopefully) does one do arithmetic calculations of data in an RRD database, please?

I find DRRAW really useful for that sort of thing. As Marc said, you can use rrdgraph to do these things - DRRAW just makes it easier.

For a version which was developed to add functionality specific to PNP4Nagios, see:
http://www.semintelligent.com/blog/articles/39/pnp-aware-version-of-drraw-released

I'm not sure if this was ever rolled in to the main DRRAW release, which is at http://web.taranis.org/drraw/

You can also use the rrdtool xport utility to export information from an rrd to a .xml file, doing some calculation on it in the process. For example, here's one where I get data from three different rrd files:

    #!/bin/sh
    rrdtool xport \
        --start "end -14 day" \
        --end "07/12/2010 00:00" \
        --step 3600 \
        --enumds \
        DEF:a=/usr/local/nagios/share/perfdata//chp-p15/Load.rrd:1:AVERAGE \
        DEF:e=/usr/local/nagios/share/perfdata//chp-p16/Load.rrd:1:AVERAGE \
        DEF:b=/usr/local/nagios/share/perfdata//chp-p15/BIS_count_tBPInstances.rrd:1:MIN \
        DEF:c=/usr/local/nagios/share/perfdata//chp-p15/BIS_count_tBPInstances.rrd:1:MAX \
        CDEF:d=c,b,- \
        XPORT:a:"Load Average App" \
        XPORT:e:"Load Average DB" \
        XPORT:d:"tBPInstances Delta" \
        XPORT:b:"tBPInstances Min" \
        XPORT:c:"tBPInstances Max" > 14days-to-20100623.xml

The xml file can then be easily imported in to Microsoft Excel so you can do further maths on it if you wish. See:
http://oss.oetiker.ch/rrdtool/doc/rrdxport.en.html

Another neat thing you can do is use rrdcgi to publish graphs on the web. See:
http://oss.oetiker.ch/rrdtool/doc/rrdcgi.en.html

I found the learning curve for all this lot fairly steep, but using drraw helps (as it can show you what rrd commands it is building) and the rewards are great.

hth,
Jim
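If Excel is not to hand, the exported file can also be read with a few lines of Python. The sketch below is hedged in two ways: the XML layout (a `legend` of `entry` elements, then `row` elements with a `t` timestamp and `v0`, `v1`, ... value columns) is my understanding of what `rrdtool xport --enumds` writes and should be checked against a real export, and the sample rows are invented purely to make the example runnable.

```python
import math
import xml.etree.ElementTree as ET

# Invented sample in the assumed xport layout; not real monitoring data.
SAMPLE = """<xport>
  <meta>
    <legend>
      <entry>tBPInstances Min</entry>
      <entry>tBPInstances Max</entry>
    </legend>
  </meta>
  <data>
    <row><t>1278892800</t><v0>4.0</v0><v1>9.0</v1></row>
    <row><t>1278896400</t><v0>NaN</v0><v1>7.0</v1></row>
    <row><t>1278900000</t><v0>5.0</v0><v1>6.5</v1></row>
  </data>
</xport>"""


def read_rows(xml_text):
    """Return a list of (timestamp, {legend: value}) tuples from xport XML."""
    root = ET.fromstring(xml_text)
    legend = [entry.text for entry in root.findall("./meta/legend/entry")]
    rows = []
    for row in root.findall("./data/row"):
        timestamp = int(row.findtext("t"))
        # Value columns are the child elements whose tag starts with "v".
        values = [float(child.text) for child in row if child.tag.startswith("v")]
        rows.append((timestamp, dict(zip(legend, values))))
    return rows


def mean_delta(rows, low, high):
    """Average of (high - low) per row, skipping rows where either value is NaN."""
    deltas = [values[high] - values[low] for _, values in rows]
    known = [d for d in deltas if not math.isnan(d)]
    return sum(known) / len(known) if known else math.nan


rows = read_rows(SAMPLE)
average = mean_delta(rows, "tBPInstances Min", "tBPInstances Max")
print(average)
```

This covers the "minus, then average" case from the question: the per-row subtraction mirrors `CDEF:d=c,b,-`, and the averaging is done outside rrdtool.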
Re: [Nagios-users] Calculations of RRD data
Yes, we use DRRAW as well. However, running those calcs and then graphing within Nagios, DRRAW, or pnp4nagios would be nice. I guess it is an "export from RRD, do the calcs, and review outside of Nagios/DRRAW/pnp4nagios" kind of thing?

Thanks,
SHD

-----Original Message-----
From: avery...@gmail.com [mailto:avery...@gmail.com] On Behalf Of Jim Avery
Sent: Monday, August 09, 2010 11:35 E/T
To: serv...@shdawson.com; Nagios Users List
Subject: Re: [Nagios-users] Calculations of RRD data

> I find DRRAW really useful for that sort of thing. As Marc said, you can use rrdgraph to do these things - DRRAW just makes it easier.
> [...]
> You can also use the rrdtool xport utility to export information from an rrd to a .xml, doing some calc on it in the process [...]
Re: [Nagios-users] Passive freshness checks - active checks
> One problem I sometimes see occurs when the clock on the sending server is way out of sync with the clock on the Nagios server, nsca will complain and not process the check. See this section in the nsca.cfg file:
> [...]
> max_packet_age=30
>
> If I recall, I increased this from some smaller value to make it more forgiving of systems which are a bit out of sync.
>
> I hope that's pointed you in the right direction.
>
> Cheers,
> Jim

Hey Jim,

Thanks for the info. I have increased the time offset to a minute or two. But all our systems should be close, as we use NTP to keep them in sync, and Nagios currently does active checks on this one to make sure things are happy.

I'll check out the stats and turn on debugging next to see if there is anything there. In the meantime, what version of Nagios are you running?

Thanks,
Charlie
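The rule in the quoted nsca.cfg comment can be written down as a small predicate, which makes it easier to see what raising `max_packet_age` to "a minute or two" changes. This is a sketch of the rule as the comment describes it (values in seconds, 900 maximum, 0 disables the check). How nsca measures a packet's age internally is not covered here, and the function name is invented.

```python
def packet_accepted(age_seconds: float, max_packet_age: int = 30) -> bool:
    """Apply the max_packet_age rule as described in the nsca.cfg comment."""
    if not 0 <= max_packet_age <= 900:
        raise ValueError("max_packet_age must be between 0 and 900 seconds")
    if max_packet_age == 0:
        # Zero means no packets are rejected based on their age.
        return True
    return age_seconds <= max_packet_age


# A result that looks 90 seconds old is dropped at 30 but accepted at 120.
print(packet_accepted(90, 30), packet_accepted(90, 120))
```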
Re: [Nagios-users] Passive freshness checks - active checks
On 9 August 2010 16:38, Charlie Reddington charlie.redding...@gmail.com wrote:

> I have increased the time offset to be a minute or two. But all our systems should be close as we use NTP to keep them in sync, and nagios currently does active checks on this one to make sure things are happy.

That's the ideal thing to do, yes.

> I'll check out the stats and turn on debugging next to see if there is anything there. In the meantime, what version of nagios are you running?

Nagios Core 3.2.1

Cheers,
Jim
Re: [Nagios-users] Calculations of RRD data
On 9 August 2010 16:39, Stephen H. Dawson serv...@shdawson.com wrote:

> Yes, we use DRRAW as well. However, running those calcs, and then graphing within Nagios or DRRAW or pnp4nagios would be nice. I guess it is an export from RRD, do the calcs, and review outside of Nagios/DRRAW/pnp4nagios kind of thing?

It depends what you want to do. I do a lot of simple maths using DRRAW in the CDEF field for each data source or by adding CDEF lines. Don't forget you can hide data sources by setting -Nothing- for the line/area type, so you can display just the results and not the original data.

For some things where drraw can't quite cut it, I use rrdgraph outside of DRRAW (typically I use rrdcgi so I can easily publish to the web). See:
http://oss.oetiker.ch/rrdtool/doc/rrdcgi.en.html

I only bother exporting to .xml and importing to Excel when I want to do really fancy scatter graphs and regression analysis.

If you're going to want the performance data in a database all the time, you might consider changing your Nagios perfdata processing config to output the data to MySQL or whatever, instead of or as well as to PNP.

Cheers,
Jim
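CDEF expressions are written in reverse Polish notation, which is why subtraction appears as `c,b,-` earlier in this thread. The toy evaluator below handles only the four arithmetic operators, enough to try the "minus, division, averaging" cases from the original question on sample numbers. It is a sketch for reading CDEF strings, not a reimplementation of rrdtool's RPN, which has many more operators and its own handling of unknown values.

```python
def eval_rpn(expression: str, variables: dict) -> float:
    """Evaluate a comma-separated RPN expression using +, -, * and / only."""
    operators = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for token in expression.split(","):
        if token in operators:
            right = stack.pop()   # the most recently pushed operand
            left = stack.pop()
            stack.append(operators[token](left, right))
        elif token in variables:
            stack.append(float(variables[token]))
        else:
            stack.append(float(token))  # a numeric literal
    if len(stack) != 1:
        raise ValueError(f"malformed expression: {expression!r}")
    return stack[0]


sample = {"b": 4.0, "c": 9.0}           # made-up values for one time step
delta = eval_rpn("c,b,-", sample)       # c - b
ratio = eval_rpn("c,b,/", sample)       # c / b
mean = eval_rpn("b,c,+,2,/", sample)    # (b + c) / 2
print(delta, ratio, mean)
```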
Re: [Nagios-users] Calculations of RRD data
Thanks,
SHD

-----Original Message-----
From: avery...@gmail.com [mailto:avery...@gmail.com] On Behalf Of Jim Avery
Sent: Monday, August 09, 2010 12:01 E/T
To: serv...@shdawson.com
Cc: Nagios Users List
Subject: Re: [Nagios-users] Calculations of RRD data

> It depends what you want to do. I do a lot of simple maths using DRRAW in the CDEF field for each data source or by adding CDEF lines. [...]
[Nagios-users] Nagios Deployment
Hi all,

I'm planning a Nagios deployment at the moment, but I am new to Nagios and would like to confirm that it has the features I require. I've used Nagios in the past but have never set it up from scratch. I've also been playing with Lilac over the last couple of days, but I think a Nagios-only install is the way to go.

I will be deploying on Ubuntu Server and I need to monitor about 100 Windows servers and 25 Ubuntu/CentOS boxes.

Services that I need to monitor:

Exchange 2007
SQL Server
MDaemon
OpenX
and the usual.

What is the easiest way to go about doing this? I would also like to monitor email by sending a mail to an echo address and then checking for a response.

I've read a reasonable amount of documentation and learned a little from trial and error, but if someone could point me in the right direction I'd really appreciate it.

Thanks
Shane
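For the echo-address idea, the part that Nagios cares about is the exit status a check returns: 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN. The sketch below shows only that decision step, given the time a test mail was sent and the time its echo came back. Sending the mail and fetching the reply are deliberately left out, and the function name, thresholds and message wording are all hypothetical.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard plugin return codes


def echo_check(sent_at, reply_at, warn_s=120.0, crit_s=300.0):
    """Map an echo round trip to a (return code, status line) pair.

    reply_at is None when no echo reply has been seen.
    """
    if reply_at is None:
        return CRITICAL, "MAIL CRITICAL - no echo reply received"
    elapsed = reply_at - sent_at
    if elapsed < 0:
        return UNKNOWN, "MAIL UNKNOWN - reply is timestamped before the send"
    if elapsed >= crit_s:
        return CRITICAL, f"MAIL CRITICAL - echo reply after {elapsed:.0f}s"
    if elapsed >= warn_s:
        return WARNING, f"MAIL WARNING - echo reply after {elapsed:.0f}s"
    return OK, f"MAIL OK - echo reply after {elapsed:.0f}s"


code, message = echo_check(sent_at=0, reply_at=30)
print(code, message)
```

A real plugin would print the status line and exit with the code; an existing, maintained plugin is usually a better starting point than writing one from scratch.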
Re: [Nagios-users] Nagios Deployment
On 9 August 2010 17:49, Shane Killian shane.kill...@irishjobs.ie wrote:

> I've read a reasonable amount of documentation and learned a little from trial and error but if someone could point me in the right direction I'd really appreciate it.

I think that's the thing - the documentation is great for sorting out the nitty-gritty but not always so good for understanding how it's all supposed to work together.

What I did, and what I recommend anyone new to Nagios to do, is to get hold of one of the books about Nagios - my favourite (and I won't pretend I've read them all) is the one by Wolfgang Barth published by No Starch Press.

http://nostarch.com/nagios.htm
Re: [Nagios-users] ? on monitoring the distribution servers
I don't use localhost in any host or service configs. There is no use for localhost, since it is such an ambiguous hostname. In a distributed setup where there are multiple sources feeding data into a single server, *every* host name must be unique, otherwise they clobber each other's data (as you have noticed).

I use the actual FQDN hostnames of the collectors (even the central one) in the configs. When there is an issue with a service on one of the collectors, it shows up in the interface under that particular collector's hostname, so I know exactly which one is broken.

On 08/09/2010 03:23 PM, steve f wrote:

> I finally got a distributed server up and running in Core 3.x and have a stupid question on monitoring the distributed server. I have the central server currently configured to not do active checks.
>
> On the distributed server, I had all of the localhost.cfg checks running, and in the nagios.log on the central server I see the following:
>
> EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;localhost;Root Partition;0;DISK OK - free space: / 827 MB (86% inode=88%):
> [1281213501] PASSIVE SERVICE CHECK: localhost;Root Partition;0;DISK OK - free space: / 827 MB (86% inode=88%):
>
> This is coming in as a passive check from the distributed server with the hostname localhost, and as such it appears that the central server is using this check result to populate the checks for the central server itself.
>
> If I wanted to monitor the distributed server, would I not use the localhost.cfg on the distributed server? Should I rename everything localhost in localhost.cfg to the name of the distributed server? What's the most rational way to monitor the distributed server?
>
> Thanks,
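One way to catch this early is to refuse ambiguous host names at the point where a collector formats its passive results. The sketch below builds a `PROCESS_SERVICE_CHECK_RESULT` line in the same semicolon-separated form as the log excerpt above and rejects `localhost`. The helper name, the list of rejected names and the example FQDN `collector1.example.com` are assumptions for illustration.

```python
# Names that would collide on a central server fed by several collectors.
AMBIGUOUS_HOSTS = {"localhost", "localhost.localdomain", "127.0.0.1"}


def passive_result(timestamp: int, host: str, service: str, code: int, output: str) -> str:
    """Format one passive service check result as an external command line."""
    if host.lower() in AMBIGUOUS_HOSTS:
        raise ValueError(f"host name {host!r} is not unique across collectors")
    if code not in (0, 1, 2, 3):
        raise ValueError(f"invalid return code: {code}")
    return f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}"


line = passive_result(
    1281213501,
    "collector1.example.com",  # hypothetical FQDN of the distributed server
    "Root Partition",
    0,
    "DISK OK - free space: / 827 MB (86% inode=88%):",
)
print(line)
```

The host name in the result has to match a host defined on the central server, so renaming in localhost.cfg on the collector and defining the same FQDN centrally go together.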
Re: [Nagios-users] Nagios Deployment
I use Nagios on Ubuntu, both 10.04 and 8.04. I recommend installing from source, as it is much more stable and works better with the plugins etc. than the default Ubuntu package.

Other than that, since you look to be monitoring lots of Windows hosts, I would suggest checking out WMI, which was not too difficult to install on Ubuntu 10.04 LTS.

Greg Pangrazio

On Mon, Aug 9, 2010 at 1:00 PM, Jim Avery j...@jimavery.me.uk wrote:

> What I did, and what I recommend anyone new to Nagios to do is to get hold of one of the books about Nagios - my favourite (and I won't pretend I've read them all) is the one by Wolfgang Barth published by No Starch Press.
>
> http://nostarch.com/nagios.htm
Re: [Nagios-users] question about notifications
On Aug 9, 2010, at 1:24 PM, gregborbo...@gmail.com wrote:

> Are the command arguments passed in a scope? Such as check_groovy!WARN!CRIT
>
> If so, you could do arg1 and arg2

These would only be available to the check_command, not the notification command.

--
Marc
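To make the scoping point concrete, here is a rough model of how the `!`-separated values in a `check_command` end up as `$ARG1$`, `$ARG2$` and so on in that command's `command_line`. It is a simplified sketch: the `command_line` for `check_groovy` is invented, escaped `\!` characters are ignored, and other macros such as `$USER1$` are left untouched.

```python
def expand_check_command(check_command: str, command_lines: dict) -> str:
    """Substitute $ARGn$ macros for a check_command like 'name!arg1!arg2'."""
    name, *args = check_command.split("!")
    line = command_lines[name]
    for index, value in enumerate(args, start=1):
        line = line.replace(f"$ARG{index}$", value)
    return line


# Hypothetical command definition; only the command name comes from the thread.
COMMANDS = {"check_groovy": "$USER1$/check_groovy -w $ARG1$ -c $ARG2$"}

expanded = expand_check_command("check_groovy!WARN!CRIT", COMMANDS)
print(expanded)
```

The arguments exist only while this one command line is being built, which is why, as Marc says, a notification command cannot see them.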