[Nagios-users] Problem with avail.cgi

2010-08-09 Thread Assaf Flatto


Hello,

I have an issue with avail.cgi and wanted to know if anyone else has
encountered this behaviour.

I ask Nagios to produce a report for a service covering the last 7 days,
but the output includes a table with entries more than a week old.

Example:

01-08-2010 00:00:00 to 08-08-2010 00:00:00
Duration: 7d 0h 0m 0s

[ Availability report completed in 0 min 13 sec ]

but in the summary table:

Service Log Entries:
[ View full log entries ]
http://10.0.6.149/nagios/cgi-bin/avail.cgi?host=gbc1-ms-06&service=dispacth.lovefilm.com+check&t1=1280617200&t2=1281222000&backtrack=4&assumestateretention=yes&assumeinitialstates=yes&assumestatesduringnotrunning=yes&initialassumedhoststate=0&initialassumedservicestate=0&show_log_entries&full_log_entries&showscheduleddowntime=yes
Event Start Time	Event End Time	Event Duration	Event/State Type	Event/State Information
28-07-2010 00:00:00	28-07-2010 09:45:29	0d 9h 45m 29s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.020 second response time
29-07-2010 00:00:00	29-07-2010 10:20:56	0d 10h 20m 56s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.022 second response time
30-07-2010 00:00:00	31-07-2010 00:00:00	1d 0h 0m 0s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.016 second response time
31-07-2010 00:00:00	01-08-2010 00:00:00	1d 0h 0m 0s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.030 second response time
01-08-2010 00:00:00	02-08-2010 00:00:00	1d 0h 0m 0s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.021 second response time
02-08-2010 00:00:00	02-08-2010 14:38:47	0d 14h 38m 47s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.027 second response time
03-08-2010 00:00:00	03-08-2010 11:51:23	0d 11h 51m 23s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.021 second response time
04-08-2010 00:00:00	04-08-2010 08:41:24	0d 8h 41m 24s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.031 second response time




As you can see, the report does not stick to the period I asked for
(and that it claims to cover), and that skews my statistics.


Has anyone encountered this before? Can you recommend a way to fix it?

I am using Nagios 3.2.0, built from source.

Thanks

--
Never,Ever Cut A Deal With a Dragon 



Next year I will be doing the London to Paris bike ride to 
raise money for the DogTrust (www.dogtrust.co.uk) .

Please Sponsor me at http://www.justgiving.com/Assaf-Flatto

--
This SF.net email is sponsored by 

Make an app they can't live without
Enter the BlackBerry Developer Challenge
http://p.sf.net/sfu/RIM-dev2dev
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null

Re: [Nagios-users] Problem with avail.cgi

2010-08-09 Thread Marc Powell

On Aug 9, 2010, at 4:17 AM, Assaf Flatto wrote:

 but in the summary table:

 Service Log Entries:
 [ View full log entries ]
 Event Start Time	Event End Time	Event Duration	Event/State Type	Event/State Information
 28-07-2010 00:00:00	28-07-2010 09:45:29	0d 9h 45m 29s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.020 second response time
 29-07-2010 00:00:00	29-07-2010 10:20:56	0d 10h 20m 56s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.022 second response time
 30-07-2010 00:00:00	31-07-2010 00:00:00	1d 0h 0m 0s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.016 second response time
 31-07-2010 00:00:00	01-08-2010 00:00:00	1d 0h 0m 0s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.030 second response time
 01-08-2010 00:00:00	02-08-2010 00:00:00	1d 0h 0m 0s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.021 second response time
 02-08-2010 00:00:00	02-08-2010 14:38:47	0d 14h 38m 47s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.027 second response time
 03-08-2010 00:00:00	03-08-2010 11:51:23	0d 11h 51m 23s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.021 second response time
 04-08-2010 00:00:00	04-08-2010 08:41:24	0d 8h 41m 24s	SERVICE OK (HARD)	HTTP OK: HTTP/1.1 302 Found - 611 bytes in 0.031 second response time

 As you can see, the report does not stick to the period I asked for
 (and that it claims to cover), and that skews my statistics.

 Has anyone encountered this before? Can you recommend a way to fix it?

I suspect this is because you've specified 4 backtracked archives when running 
the report (4 days if you have daily log rotation set up). Try specifying 0.
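
A minimal sketch of what "specifying 0" could look like when the report
URL is built by hand. It only reuses the query parameters visible in the
URL from the original post; the helper name avail_url is hypothetical,
and nothing here is taken from the Nagios source.

```python
from urllib.parse import urlencode

WEEK = 7 * 24 * 60 * 60  # seconds in the 7-day report window


def avail_url(base, host, service, t2, backtrack=0):
    """Build an avail.cgi query for the 7 days ending at epoch second t2.

    Hypothetical helper: parameter names are copied from the URL in the
    original post, with backtrack set to 0 as suggested in the reply.
    """
    params = {
        "host": host,
        "service": service,
        "t1": t2 - WEEK,
        "t2": t2,
        "backtrack": backtrack,
        "assumestateretention": "yes",
        "assumeinitialstates": "yes",
        "assumestatesduringnotrunning": "yes",
        "initialassumedhoststate": 0,
        "initialassumedservicestate": 0,
        "showscheduleddowntime": "yes",
    }
    # show_log_entries and full_log_entries carry no value in the original URL.
    return base + "?" + urlencode(params) + "&show_log_entries&full_log_entries"


url = avail_url(
    "http://10.0.6.149/nagios/cgi-bin/avail.cgi",
    "gbc1-ms-06",
    "dispacth.lovefilm.com check",
    1281222000,
)
print(url)
```

The t1 value is derived from t2, so the window is always exactly
604800 seconds, matching the "7d 0h 0m 0s" duration shown in the report.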

--
Marc




Re: [Nagios-users] Problem with avail.cgi

2010-08-09 Thread Assaf Flatto




Marc Powell wrote:

I suspect this is because you've specified 4 backtracked archives when running the report (4 days if you have daily log rotation set up). Try specifying 0.

--
Marc

  

Marc,
I did as you suggested and indeed the "extra" days no longer
appear in the report. However, I am seeing another oddity:

The report summary shows one figure, and when I select that service in
the report I get a different result.


Servicegroup 'Weekly-report'
Service State Breakdowns:

Host	Service	% Time OK	% Time Warning	% Time Unknown	% Time Critical	% Time Undetermined
(blank)	dispacth.check	99.797% (99.797%)	0.000% (0.000%)	0.000% (0.000%)	0.203% (0.203%)	0.000%

Then, selecting it in the report:

Service 'dispacth. check' On Host ''
02-08-2010 13:19:12 to 09-08-2010 13:19:12
Duration: 7d 0h 0m 0s

[ Availability report completed in 0 min 9 sec ]

Service State Breakdowns:

State	Type / Reason	Time	% Total Time	% Known Time
OK	Unscheduled	6d 23h 54m 50s	99.949%	99.949%
OK	Scheduled	0d 0h 0m 0s	0.000%	0.000%
OK	Total	6d 23h 54m 50s	99.949%	99.949%
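
The two figures can be cross-checked with a little arithmetic. This is
only a sketch: it confirms that the per-service page is internally
consistent and shows how much non-OK time each view implies. Whether the
two views used the same time window and backtrack setting is an
assumption worth checking; the thread does not confirm it.

```python
import re


def duration_seconds(text):
    """Convert a Nagios-style duration such as '6d 23h 54m 50s' to seconds."""
    d, h, m, s = map(int, re.fullmatch(r"(\d+)d (\d+)h (\d+)m (\d+)s", text).groups())
    return ((d * 24 + h) * 60 + m) * 60 + s


period = duration_seconds("7d 0h 0m 0s")      # report duration
ok = duration_seconds("6d 23h 54m 50s")       # OK time on the service page

pct_ok = 100.0 * ok / period
detail_not_ok = period - ok                    # seconds not OK (service page)
summary_not_ok = round(0.203 / 100 * period)   # seconds implied by 0.203% critical

print(f"{pct_ok:.3f}%")       # 99.949%
print(detail_not_ok)          # 310
print(summary_not_ok)         # 1228
```

So the service page accounts for about 5 minutes of non-OK time, while
the servicegroup summary implies about 20 minutes over a 7-day period.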

  



Can you assist with this?

Thanks 

Re: [Nagios-users] Passive freshness checks - active checks

2010-08-09 Thread Jim Avery
On 6 August 2010 17:02, Charlie Reddington charlie.redding...@gmail.com wrote:
 Hi All,

 I'm having a bit of a problem with my nagios setup. I'm trying to move
 toward passive checks, with failover being an active check. For now, my
 failover check command is just a one-liner that returns critical with
 a message.

 It's looking like the active check is being run, even when I see
 the corresponding passive check coming in. I suspect it may be in my
 configs somewhere, but I'm not sure what is wrong yet.

 The big kicker is that it's not all of my checks, only some of
 them. They all have different freshness thresholds, but that doesn't
 seem to be the common factor. Their configs are the same, but in a
 different order, and that doesn't seem like the problem either, as it's
 affecting some of one and not of the other.

 Any thoughts of what I may be doing wrong?

 Charlie

 ---


I can't see any problem with the config below.  If you have dozens of
checks set up this way and they are all set up in crontab to run at
*/15 then you will get a storm of checks at each 15-minute interval.
I normally make sure I stagger the checks in cron so that they are
reasonably evenly spaced.  If you have thousands it might also be
worth introducing a small random sleep to spread them out even more.
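
One way to stagger the cron entries, sketched under assumptions: each
check gets a stable minute offset derived from its name, so the same
check always lands on the same minutes. The function names and the two
extra check names are hypothetical, and the "..." stands for whatever
command the real crontab line runs.

```python
import zlib


def stagger_minute(check_name, interval=15):
    """Stable offset in [0, interval) derived from the check name."""
    return zlib.crc32(check_name.encode("utf-8")) % interval


def cron_minutes(check_name, interval=15):
    """Minute field for a crontab line, e.g. '7,22,37,52' instead of '*/15'."""
    offset = stagger_minute(check_name, interval)
    return ",".join(str(m) for m in range(offset, 60, interval))


# "Passive Memory Check" is from the thread; the other names are made up.
for name in ["Passive Memory Check", "Passive Disk Check", "Passive Load Check"]:
    print(f"{cron_minutes(name)} * * * * root ...  # {name}")
```

A hash-based offset is used here rather than a random sleep so that the
schedule is reproducible; a random sleep inside the wrapper script would
spread the load just as well.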

I've not had any problems with it myself, but if you have a very busy
system, you might need to check that the command buffers aren't
filling (run /usr/local/nagios/bin/nagiostats to list the current
Nagios statistics).

Check the logs from nsca too.  If I recall correctly you may need to
set debug=1 in nsca.cfg for a while to get enough information.  One
problem I sometimes see: when the clock on the sending server is way
out of sync with the clock on the Nagios server, nsca will complain
and not process the check.  See this section in the nsca.cfg file:

  # MAX PACKET AGE OPTION
  # This option is used by the nsca daemon to determine when client
  # data is too old to be valid.  Keeping this value as small as
  # possible is recommended, as it helps prevent the possibility of
  # replay attacks.  This value needs to be at least as long as
  # the time it takes your clients to send their data to the server.
  # Values are in seconds.  The max packet age cannot exceed 15
  # minutes (900 seconds).  If this variable is set to zero (0), no
  # packets will be rejected based on their age.

  max_packet_age=30

If I recall, I increased this from some smaller value to make it more
forgiving of systems which are a bit out of sync.
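
The rule described in that config comment can be modelled roughly as
follows. This is a simplified sketch of the documented behaviour, not
NSCA's actual code; the function name and the example timestamps are
made up.

```python
def packet_too_old(sent_at, received_at, max_packet_age=30):
    """Return True if a packet should be rejected for its age.

    sent_at is the timestamp the client put in the packet (by its own
    clock); received_at is the server's clock on arrival. Both are epoch
    seconds. A max_packet_age of 0 disables the check.
    """
    if max_packet_age == 0:
        return False
    return received_at - sent_at > max_packet_age


# A client whose clock runs 45 seconds slow stamps its packets 45 s early,
# so even an instantly delivered packet looks 45 s old to the server.
server_now = 1281366000
client_stamp = server_now - 45

print(packet_too_old(client_stamp, server_now, max_packet_age=30))   # True
print(packet_too_old(client_stamp, server_now, max_packet_age=120))  # False
print(packet_too_old(client_stamp, server_now, max_packet_age=0))    # False
```

This is why a small clock skew can silently drop passive results even
when the network path is fine.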


I hope that's pointed you in the right direction.

Cheers,

Jim



 Nagios Version: 3.2.0

 I have a service template definition that looks like this.
 define service{
         name                            passive-service
         check_freshness                 1
         active_checks_enabled           0
         passive_checks_enabled          1
         parallelize_check               1
         obsess_over_service             0
         notifications_enabled           0
         event_handler_enabled           0
         flap_detection_enabled          0
         failure_prediction_enabled      0
         process_perf_data               1
         retain_status_information       1
         retain_nonstatus_information    1
         is_volatile                     0
         check_period                    24x7
         max_check_attempts              1
         contact_groups                  admins
         notification_options            w,c,r
         notification_interval           60
         notification_period             24x7
         register                        0
         }

 And then I have services defined like so.
 # Free Memory Check
 define service{
         use                     passive-service
         service_description     Passive Memory Check
         check_command           check_stale
         hostgroups              passive
         freshness_threshold     3600
         }

 My active check command is defined like so.
 # alert on stale
 define command{
         command_name            check_stale
         command_line            $USER1$/check_dummy 2 Check is stale, please run manually
         }

 On my host, I use cron jobs to run things like this. I use
 nsca_wrapper to send my check results to the central nagios server.
 # Check Free Memory
 */15 * * * * root /usr/local/nagios/libexec/nsca_wrapper.sh -H server.name -S 'Passive Memory Check' -C '/usr/local/nagios/libexec/check_memory -w 10 -c 5' > /dev/null
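
The behaviour this configuration is aiming for can be modelled as below.
It is a deliberately simplified sketch: real Nagios applies extra
allowances when deciding staleness, so treat the comparison as an
assumption. Only freshness_threshold and check_stale come from the
config above; the function names are hypothetical.

```python
def is_stale(last_result_at, now, freshness_threshold):
    """True when no passive result has arrived within the threshold (seconds)."""
    return now - last_result_at > freshness_threshold


def next_action(last_result_at, now, freshness_threshold=3600):
    """Describe what the simplified model expects to happen next."""
    if is_stale(last_result_at, now, freshness_threshold):
        return "run check_stale"
    return "wait for passive result"


now = 1281366000
print(next_action(now - 900, now))    # cron ran 15 minutes ago: wait for passive result
print(next_action(now - 4000, now))   # nothing for over an hour: run check_stale
```

Under this model, with results arriving every 900 seconds and a
threshold of 3600, check_stale should never fire unless passive results
are being lost or rejected somewhere along the way.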




Re: [Nagios-users] Calculations of RRD data

2010-08-09 Thread Jim Avery
On 8 August 2010 19:47, Stephen H. Dawson serv...@shdawson.com wrote:
 Hi,


 Not sure if this is the correct place to ask this question, but starting
 here.

 We use Nagios for lots of monitoring, and store that data in the RRD
 database.  We graph that data.  Life is good.  We have some odd thoughts
 about what-if scenarios, where we need to further review the data in the
 RRD database.  Simple arithmetic: subtraction, division, and then
 some averaging of those results.

 We really do not want to put the monitored data into a SQL database.  How
 (hopefully) does one do arithmetic calculations on data in an RRD database,
 please?


I find DRRAW really useful for that sort of thing.  As Marc said, you
can use rrdgraph to do these things - DRRAW just makes it easier.

For a version which was developed to add functionality specific to
PNP4Nagios, see:
http://www.semintelligent.com/blog/articles/39/pnp-aware-version-of-drraw-released

I'm not sure if this was ever rolled in to the main DRRAW release
which is at http://web.taranis.org/drraw/


You can also use the rrdtool xport utility to export information from
an rrd to an .xml file, doing some calculations on it in the process.
For example, here's one where I get data from three different rrd files:



#!/bin/sh

# Multi-word arguments are quoted so the shell passes each one to
# rrdtool as a single argument.
# CDEF d is RPN for "c minus b", i.e. max minus min of tBPInstances.
rrdtool xport \
  --start "end -14 day" \
  --end "07/12/2010 00:00" \
  --step 3600 \
  --enumds \
  DEF:a=/usr/local/nagios/share/perfdata//chp-p15/Load.rrd:1:AVERAGE \
  DEF:e=/usr/local/nagios/share/perfdata//chp-p16/Load.rrd:1:AVERAGE \
  DEF:b=/usr/local/nagios/share/perfdata//chp-p15/BIS_count_tBPInstances.rrd:1:MIN \
  DEF:c=/usr/local/nagios/share/perfdata//chp-p15/BIS_count_tBPInstances.rrd:1:MAX \
  CDEF:d=c,b,- \
  XPORT:a:"Load Average App" \
  XPORT:e:"Load Average DB" \
  XPORT:d:"tBPInstances Delta" \
  XPORT:b:"tBPInstances Min" \
  XPORT:c:"tBPInstances Max" > 14days-to-20100623.xml
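
The CDEF expression in the script is written in reverse Polish notation
(RPN): operands first, operator last. A tiny evaluator makes the reading
order concrete. It is a sketch covering only the four basic operators,
not rrdtool's full RPN language, and the sample values are invented.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expr, variables):
    """Evaluate a comma-separated RPN expression such as 'c,b,-'."""
    stack = []
    for token in expr.split(","):
        if token in OPS:
            right = stack.pop()   # last pushed value is the right operand
            left = stack.pop()
            stack.append(OPS[token](left, right))
        elif token in variables:
            stack.append(variables[token])
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError(f"malformed RPN expression: {expr!r}")
    return stack[0]


values = {"a": 1.0, "e": 3.0, "b": 4.0, "c": 10.0}
print(eval_rpn("c,b,-", values))      # c - b        -> 6.0
print(eval_rpn("a,e,+,2,/", values))  # (a + e) / 2  -> 2.0
```

The second expression shows how averaging two data sources could be
written in the same style.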


The xml file can then be easily imported into Microsoft Excel so you
can do further maths on it if you wish.
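
If a spreadsheet is not wanted, the exported file can also be processed
with a short script. The sketch below assumes the usual rrdtool xport
layout (legend entries under meta, then one row per timestamp with a t
element and one v element per exported series). The sample document and
its values are made up for illustration, and load_xport is a
hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Invented sample in the assumed xport layout; real files carry more metadata.
SAMPLE = """<xport>
  <meta>
    <legend>
      <entry>tBPInstances Min</entry>
      <entry>tBPInstances Max</entry>
    </legend>
  </meta>
  <data>
    <row><t>1278892800</t><v>4.0</v><v>10.0</v></row>
    <row><t>1278896400</t><v>5.0</v><v>9.0</v></row>
  </data>
</xport>"""


def load_xport(xml_text):
    """Return a list of (timestamp, {legend: value}) tuples."""
    root = ET.fromstring(xml_text)
    legend = [entry.text for entry in root.findall("./meta/legend/entry")]
    rows = []
    for row in root.findall("./data/row"):
        values = [float(v.text) for v in row.findall("v")]  # "NaN" parses as nan
        rows.append((int(row.find("t").text), dict(zip(legend, values))))
    return rows


rows = load_xport(SAMPLE)
deltas = [r["tBPInstances Max"] - r["tBPInstances Min"] for _, r in rows]
print(sum(deltas) / len(deltas))  # average of max minus min -> 5.0
```

For a real export, the text would come from reading the .xml file
written by the script rather than from SAMPLE.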

See:

http://oss.oetiker.ch/rrdtool/doc/rrdxport.en.html


Another neat thing you can do is use rrdcgi to publish graphs on the web.  See:

http://oss.oetiker.ch/rrdtool/doc/rrdcgi.en.html

I found the learning curve for all this lot fairly steep, but using
drraw helps (as it can show you what rrd commands it is building) and
the rewards are great.

hth,

Jim



Re: [Nagios-users] Calculations of RRD data

2010-08-09 Thread Stephen H. Dawson
Yes, we use DRRAW as well.  However, running those calcs and then graphing
the results within Nagios, DRRAW, or pnp4nagios would be nice.

I guess it is an "export from RRD, do the calcs, and review outside of
Nagios/DRRAW/pnp4nagios" kind of thing?


Thanks,
SHD 





Re: [Nagios-users] Passive freshness checks - active checks

2010-08-09 Thread Charlie Reddington


Hey Jim,

Thanks for the info.

I have increased the time offset to a minute or two. But all our
systems should be close, as we use NTP to keep them in sync, and Nagios
currently does active checks on this one to make sure things are happy.


I'll check out the stats and turn on debugging next to see if there is
anything there. In the meantime, what version of Nagios are you
running?


Thanks,

Charlie


Re: [Nagios-users] Passive freshness checks - active checks

2010-08-09 Thread Jim Avery
On 9 August 2010 16:38, Charlie Reddington charlie.redding...@gmail.com wrote:

 I have increased the time offset to be a minute or two. But all our systems
 should be close as we use NTP to keep them in sync, and nagios currently
 does active checks on this one to make sure things are happy.

That's the ideal thing to do, yes.

 I'll check out the stats and turn on debugging next to see if there is
 anything there. In the meantime, what version of Nagios are you running?

Nagios Core 3.2.1


Cheers,

Jim



Re: [Nagios-users] Calculations of RRD data

2010-08-09 Thread Jim Avery
On 9 August 2010 16:39, Stephen H. Dawson serv...@shdawson.com wrote:
 Yes, we use DRRAW as well.  However, running those calcs and then graphing
 the results within Nagios, DRRAW, or pnp4nagios would be nice.

 I guess it is an "export from RRD, do the calcs, and review outside of
 Nagios/DRRAW/pnp4nagios" kind of thing?


It depends on what you want to do.

I do a lot of simple maths using DRRAW in the CDEF field for each data
source or by adding CDEF lines.  Don't forget you can hide datasources
by setting -Nothing- for the line/area type so you can just display
the results not the original data.

For some things where drraw can't quite cut it, I use rrdgraph outside
of DRRAW (typically I use rrdcgi so I can easily publish to the web).
See: http://oss.oetiker.ch/rrdtool/doc/rrdcgi.en.html

I only bother exporting to .xml and import to Excel when I want to do
really fancy scatter graphs and regression analysis.


If you're going to want the performance data in a database all the
time, you might consider changing your Nagios perfdata processing
config to output the data to MySQL or whatever instead of or as well
as to PNP.


Cheers,

Jim



Re: [Nagios-users] Calculations of RRD data

2010-08-09 Thread Stephen H. Dawson
Thanks,
SHD 





[Nagios-users] Nagios Deployment

2010-08-09 Thread Shane Killian
Hi all,

I'm planning a Nagios deployment at the moment, but I am new to Nagios and
would like to confirm that it has the features I require.

I've used Nagios in the past but have never set it up from scratch. I've also
been playing with Lilac over the last couple of days, but I think a
Nagios-only install is the way to go.

I will be deploying on Ubuntu server and I need to monitor about 100 Windows
servers and 25 Ubuntu/CentOS boxes.

Services that I need to monitor:

Exchange 2007
SQL Server
MDaemon
OpenX and the usual.

What is the easiest way to go about doing this?

I would also like to monitor email by sending a mail to an echo address and
then checking for a response.

I've read a reasonable amount of documentation and learned a little from trial
and error, but if someone could point me in the right direction I'd really
appreciate it.


Thanks


Shane


Re: [Nagios-users] Nagios Deployment

2010-08-09 Thread Jim Avery
On 9 August 2010 17:49, Shane Killian shane.kill...@irishjobs.ie wrote:

 I've read a reasonable amount of documentation and learned a little from
 trial and error, but if someone could point me in the right direction I'd
 really appreciate it.

I think that's the thing - the documentation is great for sorting out
the nitty-gritty but not always so good for understanding how it's all
supposed to work together.

What I did, and what I recommend anyone new to Nagios do, is get
hold of one of the books about Nagios. My favourite (and I won't
pretend I've read them all) is the one by Wolfgang Barth published by
No Starch Press.

http://nostarch.com/nagios.htm



Re: [Nagios-users] ? on monitoring the distribution servers

2010-08-09 Thread Herb J.
I don't use localhost in any host or service configs. There is no use 
for localhost, since it is such an ambiguous hostname. In a 
distributed setup where there are multiple sources feeding data into a 
single server, *every* host name must be unique, otherwise they clobber 
each other's data (as you have noticed). I use the actual FQDN hostnames 
of the collectors (even the central one) in the configs. When there is 
an issue with a service on one of the collectors, it shows up in the 
interface under that particular collector's hostname, so I know exactly 
which one is broken.
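
As an illustration of the clobbering problem, the sketch below parses a
passive result in the PROCESS_SERVICE_CHECK_RESULT format quoted further
down and flags host names that cannot identify a single machine. The set
of "ambiguous" names and both function names are assumptions made for
this example; collector1.example.com is a made-up FQDN.

```python
# Names that every machine could claim for itself (an assumed list).
AMBIGUOUS = {"localhost", "localhost.localdomain", "127.0.0.1"}


def parse_passive_result(command):
    """Split 'PROCESS_SERVICE_CHECK_RESULT;host;service;code;output'."""
    parts = command.split(";", 4)
    if len(parts) != 5 or parts[0] != "PROCESS_SERVICE_CHECK_RESULT":
        raise ValueError(f"not a passive service result: {command!r}")
    _, host, service, code, output = parts
    return {"host": host, "service": service, "return_code": int(code), "output": output}


def has_ambiguous_host(command):
    """True if the result is filed under a name shared by every collector."""
    return parse_passive_result(command)["host"].lower() in AMBIGUOUS


from_log = ("PROCESS_SERVICE_CHECK_RESULT;localhost;Root Partition;0;"
            "DISK OK - free space: / 827 MB (86% inode=88%):")
renamed = from_log.replace(";localhost;", ";collector1.example.com;")

print(has_ambiguous_host(from_log))  # True
print(has_ambiguous_host(renamed))   # False
```

A check like this could be run over the central server's log to spot
collectors that are still submitting results as localhost.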



On 08/09/2010 03:23 PM, steve f wrote:
I finally got a distributed server up and running in Core 3.x and have a 
stupid question on monitoring the distributed server.  I have the central 
server currently configured to not do active checks.  On the 
distributed server, I had all of the localhost.cfg checks running and 
in the nagios.log on the central server, I see the following:


EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;localhost;Root Partition;0;DISK OK - free space: / 827 MB (86% inode=88%):
[1281213501] PASSIVE SERVICE CHECK: localhost;Root Partition;0;DISK OK - free space: / 827 MB (86% inode=88%):


This is coming as a passive check from the distributed server with the 
hostname localhost, and as such it appears that the central server is 
using this check result and populating its own checks with it.


If I wanted to monitor the distributed server, would I not use the 
localhost.cfg on the distributed server?  Should I rename everything 
localhost in localhost.cfg to the name of the distributed server?


What's the most rational way to monitor the distributed server?

Thanks,



Re: [Nagios-users] Nagios Deployment

2010-08-09 Thread Greg Pangrazio
I use Nagios on Ubuntu, both 10.04 and 8.04.  I recommend installing
from source as it is much more stable and works better with the
plugins etc. than the default Ubuntu package.

Other than that, since you look to be monitoring lots of Windows hosts
I would suggest checking out WMI, which was not too difficult to
install on Ubuntu 10.04 LTS.

Greg Pangrazio





On Mon, Aug 9, 2010 at 1:00 PM, Jim Avery j...@jimavery.me.uk wrote:
 On 9 August 2010 17:49, Shane Killian shane.kill...@irishjobs.ie wrote:

 I've read a reasonable amount of documentation and learned a little from 
 trial and error, but if someone could point me in the right direction I'd 
 really appreciate it.

 I think that's the thing - the documentation is great for sorting out
 the nitty-gritty but not always so good for understanding how it's all
 supposed to work together.

 What I did, and what I recommend anyone new to Nagios do, is get
 hold of one of the books about Nagios - my favourite (and I won't
 pretend I've read them all) is the one by Wolfgang Barth published by
 No Starch Press.

 http://nostarch.com/nagios.htm



Re: [Nagios-users] question about notifications

2010-08-09 Thread Marc Powell

On Aug 9, 2010, at 1:24 PM, gregborbo...@gmail.com wrote:

 Are the command arguments passed in a scope?
 
 Such as check_groovy!WARN!CRIT
 
 If so, you could do arg1 and arg2

These would only be available to the check_command, not the notification 
command.
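
To illustrate, here is a minimal sketch of how those arguments flow. 
check_groovy is the name from the question; the plugin path, its -w/-c 
options, the host name, the template and the threshold values are all 
assumptions made up for the example:

```
define command {
    command_name  check_groovy
    # -w/-c are assumed plugin options, not taken from the original post
    command_line  $USER1$/check_groovy -w $ARG1$ -c $ARG2$
}

define service {
    use                  generic-service     ; sample template, assumed present
    host_name            app1.example.com    ; placeholder host
    service_description  Groovy
    check_command        check_groovy!80!90  ; $ARG1$=80, $ARG2$=90 while the check runs
}
```

When the notification command runs, $ARG1$ and $ARG2$ no longer carry 
the 80 and 90 given to check_command, so the thresholds cannot be picked 
up there this way.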

--
Marc

