On 8/6/14, 10:14 AM, Mark Selby wrote:
My company has just started using Ganglia for production metrics
gathering, and as I like to really understand "what is happening" in my
environment I have a few questions that I cannot seem to truly figure
out on my own. Any and all help is greatly appreciated.
(1) default rrd settings
RRA:AVERAGE:0.5:1:5856   # 1 day @ 15 second res
RRA:AVERAGE:0.5:4:20160  # 2 weeks @ 1 minute res
RRA:AVERAGE:0.5:40:52704 # 1 year @ 10 minute res
The above is the default rrd creation setting as stated in
gmetad.conf. This makes sense to me except the math does not work out,
and I was just wondering if there is a reason for that. 15 second
resolution for a day comes out to 5760 data points (4 * 60 * 24). The
defaults seem to add an extra 96 data points, or 24 additional minutes
at 15 seconds. Is there an actual reason for this?
Here is the commit where this was changed:
https://github.com/ganglia/monitor-core/commit/8fe2b054d070d7fcdd8b424073c107eb3a98c9e0
It doesn't indicate a reason for the extra points.
Also, I just want to confirm that with these default settings I only
get a single year of data retention for all my metrics.
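For anyone who wants to check the arithmetic, here is a minimal sketch that works out what each default RRA retains. It assumes the rrd step is 15 seconds (matching the polling interval mentioned in this thread), and the helper name `retention` is made up for illustration.

```python
STEP_SECONDS = 15  # assumed rrd step, matching the 15 second polling interval


def retention(rra, step=STEP_SECONDS):
    """Return (seconds per row, total seconds retained) for an RRA spec.

    The spec has the form RRA:CF:xff:steps_per_row:rows.
    """
    _, _cf, _xff, steps_per_row, rows = (part.strip() for part in rra.split(":"))
    resolution = int(steps_per_row) * step
    return resolution, resolution * int(rows)


for spec in ("RRA:AVERAGE:0.5:1:5856",
             "RRA:AVERAGE:0.5:4:20160",
             "RRA:AVERAGE:0.5:40:52704"):
    res, total = retention(spec)
    print(f"{spec}: {res}s per row, {total / 86400:.4f} days retained")
```

Under that assumption the three archives hold 1 day plus 24 minutes, exactly 14 days, and exactly 366 days, so the longest history available from these defaults is about one year at 10 minute resolution.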
(2) gmond vs gmetad collection rates
I use the default collection rate of 15 seconds in my gmetad.conf for
all of my different clusters (here is an example line):
data_source "elasticsearch" localhost:18705
To me this means connect to the gmond running on port 18705 every 15
seconds, get all the values and write them to the various rrd files.
If I take a look at my gmond.conf and, say, take the collection group
for memory, I get a bit confused. To me the below says: every 40 seconds
take a look at mem_free, and send the value upstream every 180 seconds
if the current value does not differ from the previous one by 1024,
else send the new value right away.
collection_group {
  collect_every = 40
  time_threshold = 180
  metric {
    name = "mem_free"
    value_threshold = "1024.0"
    title = "Free Memory"
  }
}
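The reading of those three settings given above can be written down as a small sketch. This is only a model of the behaviour as described in this message, not gmond's actual source; the function names `should_send` and `send_times` are hypothetical, and the `>=` comparisons are an assumption.

```python
def should_send(now, value, last_sent_time, last_sent_value,
                time_threshold=180, value_threshold=1024.0):
    """Decide whether a freshly collected sample goes upstream."""
    if last_sent_time is None:
        return True  # nothing sent yet
    if now - last_sent_time >= time_threshold:
        return True  # too long since the last send
    # Otherwise send only when the value moved by at least the threshold.
    return abs(value - last_sent_value) >= value_threshold


def send_times(samples, time_threshold=180, value_threshold=1024.0):
    """Return the timestamps at which (time, value) samples would be sent."""
    sent = []
    last_t = last_v = None
    for t, v in samples:
        if should_send(t, v, last_t, last_v, time_threshold, value_threshold):
            sent.append(t)
            last_t, last_v = t, v
    return sent


# One sample per collect_every = 40 seconds.
samples = [(0, 5000), (40, 5100), (80, 7000), (120, 7010),
           (160, 7020), (200, 7030), (240, 7040), (280, 7050)]
print(send_times(samples))
```

In this model a sample is sent on the first collection, again at 80 seconds when the value jumps by more than 1024, and again at 280 seconds once 180 seconds have passed without a send.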
Since gmetad is writing data every 15 seconds (for a day at least),
when I look at the mem_free.rrd file I expect to see values in groups
of two always being the same. Since I am collecting mem_free every 40
seconds at best, two consecutive 15 second values in the rrd file must
be the same, because the data cannot get updated more quickly.
<!-- 2014-08-05 17:00:00 UTC / 1407258000 -->
<row><v>1.0716692000e+07</v></row>
<!-- 2014-08-05 17:00:15 UTC / 1407258015 -->
<row><v>1.0717433333e+07</v></row>
<!-- 2014-08-05 17:00:30 UTC / 1407258030 -->
<row><v>1.0717804000e+07</v></row>
<!-- 2014-08-05 17:00:45 UTC / 1407258045 -->
<row><v>1.0717804000e+07</v></row>
<!-- 2014-08-05 17:01:00 UTC / 1407258060 -->
<row><v>1.0713761333e+07</v></row>
<!-- 2014-08-05 17:01:15 UTC / 1407258075 -->
<row><v>1.0711740000e+07</v></row>
<!-- 2014-08-05 17:01:30 UTC / 1407258090 -->
<row><v>1.0711740000e+07</v></row>
<!-- 2014-08-05 17:01:45 UTC / 1407258105 -->
<row><v>1.0715161333e+07</v></row>
<!-- 2014-08-05 17:02:00 UTC / 1407258120 -->
<row><v>1.0716872000e+07</v></row>
<!-- 2014-08-05 17:02:15 UTC / 1407258135 -->
<row><v>1.0717945600e+07</v></row>
<!-- 2014-08-05 17:02:30 UTC / 1407258150 -->
<row><v>1.0718336000e+07</v></row>
<!-- 2014-08-05 17:02:45 UTC / 1407258165 -->
<row><v>1.0718336000e+07</v></row>
<!-- 2014-08-05 17:03:00 UTC / 1407258180 -->
<row><v>1.0717522667e+07</v></row>
<!-- 2014-08-05 17:03:15 UTC / 1407258195 -->
<row><v>1.0717116000e+07</v></row>
<!-- 2014-08-05 17:03:30 UTC / 1407258210 -->
<row><v>1.0717116000e+07</v></row>
<!-- 2014-08-05 17:03:45 UTC / 1407258225 -->
<row><v>1.0716510133e+07</v></row>
<!-- 2014-08-05 17:04:00 UTC / 1407258240 -->
<row><v>1.0715980000e+07</v></row>
<!-- 2014-08-05 17:04:15 UTC / 1407258255 -->
<row><v>1.0713079200e+07</v></row>
<!-- 2014-08-05 17:04:30 UTC / 1407258270 -->
<row><v>1.0709764000e+07</v></row>
<!-- 2014-08-05 17:04:45 UTC / 1407258285 -->
<row><v>1.0709764000e+07</v></row>
The problem is that when I dump the rrd file I do see some changes
between two consecutive 15 second rows. I expect to see, at a minimum,
two rows with the same value before any change. Can someone explain
where my logic fails?
A 40-second collection period will not fit evenly into 15-second
buckets, so per [1] rrdtool will interpolate the value and store what it
thinks the value would have been, had you reported it at the timestamp
of the bucket it's being placed into.
- Adam
[1]: http://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html#IData_Resampling
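To illustrate the resampling described in [1], here is a simplified sketch of how a GAUGE update that does not land on a step boundary gets spread across buckets. It assumes each reported value holds for the interval since the previous update and that a bucket stores the time-weighted average; heartbeat and unknown-data handling are left out, and the function name `resample_gauge` is made up for this example.

```python
def resample_gauge(updates, step):
    """Time-weighted resampling of (timestamp, value) updates onto step boundaries.

    updates must be sorted by timestamp. Returns {bucket_end: average} for
    every bucket that is fully covered by the updates.
    """
    buckets = {}
    prev_t = updates[0][0]
    acc = 0.0      # sum of value * seconds inside the current bucket
    covered = 0    # seconds of the current bucket seen so far
    for t, v in updates[1:]:
        while prev_t < t:
            bucket_end = (prev_t // step + 1) * step
            seg_end = min(t, bucket_end)
            acc += v * (seg_end - prev_t)
            covered += seg_end - prev_t
            prev_t = seg_end
            if prev_t == bucket_end:
                if covered == step:
                    buckets[bucket_end] = acc / step
                acc, covered = 0.0, 0
    return buckets


# Value changes from 100 to 200 in an update that straddles the t=15 boundary.
updates = [(0, 100), (10, 100), (25, 200), (40, 200)]
print(resample_gauge(updates, step=15))
```

With these made-up numbers the bucket ending at 15 stores a blend of the old and new value (about 133.3) and the bucket ending at 30 stores 200, which is the same shape as the dump above: one in-between row at each change, followed by repeated rows.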
_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general