On 8/6/14, 10:14 AM, Mark Selby wrote:
My company has just started using Ganglia for production metrics gathering, and as I like to really understand "what is happening" in my environment, I have a few questions that I cannot seem to figure out on my own. Any and all help is greatly appreciated.

(1) default rrd settings

RRA:AVERAGE:0.5:1 :5856    #  1 day @ 15 second res
RRA:AVERAGE:0.5:4 :20160   #  2 weeks @ 1 minute res
RRA:AVERAGE:0.5:40:52704   #  1 year @ 10 minute res

The above is the default rrd creation setting as stated in gmetad.conf. This makes sense to me, except that the math does not work out, and I was wondering if there is a reason for that. 15 second resolution for a day comes out to 5760 data points (4 * 60 * 24). The defaults seem to add an extra 96 data points, or 24 additional minutes at 15 seconds. Is there an actual reason for this?
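The arithmetic in question can be checked with a short sketch. It assumes a base RRD step of 15 seconds (matching the poll interval mentioned in this thread); the names `BASE_STEP`, `RRAS`, and `retention` are made up for illustration and are not part of Ganglia or rrdtool.

```python
# Assumption: the RRD base step is 15 seconds, as discussed in this thread.
BASE_STEP = 15

# (steps per row, number of rows) taken from the three RRA lines above.
RRAS = [(1, 5856), (4, 20160), (40, 52704)]


def retention(steps, rows, base_step=BASE_STEP):
    """Return (resolution in seconds, total time span in seconds) for one RRA."""
    resolution = steps * base_step
    return resolution, resolution * rows


if __name__ == "__main__":
    for steps, rows in RRAS:
        res, span = retention(steps, rows)
        print(f"{res:>4d} s resolution, {rows:>5d} rows -> {span / 86400:.2f} days")
```

Under that assumption the first RRA spans 87840 seconds, which is one day plus 1440 seconds (the 24 extra minutes noted above), while the second and third come out to 14 and 366 days respectively.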

Here is the commit where this was changed: https://github.com/ganglia/monitor-core/commit/8fe2b054d070d7fcdd8b424073c107eb3a98c9e0

It doesn't indicate a reason for the extra points.

Also, I just want to confirm that with these default settings I only get a single year of data retention for all my metrics.

(2)  gmond vs gmetad collection rates

I use the default collection rate of 15 seconds in my gmetad.conf for all of my different clusters (here is one line):

data_source "elasticsearch"  localhost:18705

To me this means connect to the gmond running on port 18705 every 15 seconds, get all the values and write them to the various rrd files.

If I take a look at my gmond.conf and, say, take the collection group for memory, I get a bit confused. To me the below says: every 40 seconds take a look at mem_free, and send the value upstream every 180 seconds if the current value does not differ from the previous one by 1024; otherwise send the new value right away.

collection_group {
  collect_every = 40
  time_threshold = 180
  metric {
    name = "mem_free"
    value_threshold = "1024.0"
    title = "Free Memory"
  }
}
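The reading of the two thresholds described above can be written down as a small decision function. This is only a model of that reading, not gmond's actual implementation; the function name `should_send` and its parameters are hypothetical.

```python
def should_send(now, last_sent_at, current, last_sent_value,
                time_threshold=180, value_threshold=1024.0):
    """Model of the send rule as described in this post (not gmond's real code).

    Send when time_threshold seconds have passed since the last send, or
    earlier if the value moved by more than value_threshold.
    """
    if now - last_sent_at >= time_threshold:
        return True  # periodic refresh, even if the value did not change
    return abs(current - last_sent_value) > value_threshold


if __name__ == "__main__":
    # 40 s after the last send, value moved by 500: stays quiet in this model.
    print(should_send(40, 0, 10_000_500.0, 10_000_000.0))
    # 40 s after the last send, value moved by 2000: sent right away.
    print(should_send(40, 0, 10_002_000.0, 10_000_000.0))
```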

Since gmetad is writing data every 15 seconds (for a day at least), when I look at the mem_free.rrd file I expect to see values in groups of two that are always the same. Since I am collecting mem_free every 40 seconds at best, at least two consecutive 15 second values in the rrd file must be the same, because the data cannot be updated more quickly.

<!-- 2014-08-05 17:00:00 UTC / 1407258000 --> <row><v>1.0716692000e+07</v></row>
<!-- 2014-08-05 17:00:15 UTC / 1407258015 --> <row><v>1.0717433333e+07</v></row>
<!-- 2014-08-05 17:00:30 UTC / 1407258030 --> <row><v>1.0717804000e+07</v></row>
<!-- 2014-08-05 17:00:45 UTC / 1407258045 --> <row><v>1.0717804000e+07</v></row>
<!-- 2014-08-05 17:01:00 UTC / 1407258060 --> <row><v>1.0713761333e+07</v></row>
<!-- 2014-08-05 17:01:15 UTC / 1407258075 --> <row><v>1.0711740000e+07</v></row>
<!-- 2014-08-05 17:01:30 UTC / 1407258090 --> <row><v>1.0711740000e+07</v></row>
<!-- 2014-08-05 17:01:45 UTC / 1407258105 --> <row><v>1.0715161333e+07</v></row>
<!-- 2014-08-05 17:02:00 UTC / 1407258120 --> <row><v>1.0716872000e+07</v></row>
<!-- 2014-08-05 17:02:15 UTC / 1407258135 --> <row><v>1.0717945600e+07</v></row>
<!-- 2014-08-05 17:02:30 UTC / 1407258150 --> <row><v>1.0718336000e+07</v></row>
<!-- 2014-08-05 17:02:45 UTC / 1407258165 --> <row><v>1.0718336000e+07</v></row>
<!-- 2014-08-05 17:03:00 UTC / 1407258180 --> <row><v>1.0717522667e+07</v></row>
<!-- 2014-08-05 17:03:15 UTC / 1407258195 --> <row><v>1.0717116000e+07</v></row>
<!-- 2014-08-05 17:03:30 UTC / 1407258210 --> <row><v>1.0717116000e+07</v></row>
<!-- 2014-08-05 17:03:45 UTC / 1407258225 --> <row><v>1.0716510133e+07</v></row>
<!-- 2014-08-05 17:04:00 UTC / 1407258240 --> <row><v>1.0715980000e+07</v></row>
<!-- 2014-08-05 17:04:15 UTC / 1407258255 --> <row><v>1.0713079200e+07</v></row>
<!-- 2014-08-05 17:04:30 UTC / 1407258270 --> <row><v>1.0709764000e+07</v></row>
<!-- 2014-08-05 17:04:45 UTC / 1407258285 --> <row><v>1.0709764000e+07</v></row>

The problem is that when I dump the rrd file, I do see the value change from one 15 second slot to the next. I expect, at a minimum, to see two consecutive rows with the same value before any change. Can someone explain where my logic fails?


A 40-second collection period will not fit evenly into 15-second buckets, so per [1] rrdtool will interpolate the value and store what it thinks the value would have been, had you reported it at the timestamp of the bucket it's being placed into.
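To make the resampling concrete, here is a rough sketch of time-weighted averaging into fixed buckets. It is a simplified model, not rrdtool's code: it assumes each reported value holds for the whole interval since the previous update, and the function name `resample` and the sample numbers are invented for illustration.

```python
def resample(updates, step):
    """Average irregular updates into fixed-width buckets (simplified model).

    updates: list of (timestamp, value) sorted by time. Each value is assumed
    to hold over the interval since the previous update, so the first entry
    only marks where the data starts.
    Returns a list of (bucket_end_timestamp, time_weighted_average).
    """
    first_t, last_t = updates[0][0], updates[-1][0]
    # First bucket boundary at or after the first update.
    bucket_start = -(-first_t // step) * step
    buckets = []
    while bucket_start + step <= last_t:
        bucket_end = bucket_start + step
        total = 0.0
        for (t0, _), (t1, value) in zip(updates, updates[1:]):
            overlap = min(t1, bucket_end) - max(t0, bucket_start)
            if overlap > 0:
                total += overlap * value
        buckets.append((bucket_end, total / step))
        bucket_start = bucket_end
    return buckets


if __name__ == "__main__":
    # Hypothetical readings every 40 seconds, stored in 15-second buckets.
    updates = [(0, 100.0), (40, 100.0), (80, 130.0), (120, 100.0)]
    for end, value in resample(updates, 15):
        print(end, value)
```

In this toy run the buckets ending at 45 and 90 seconds straddle a change in the input, so they hold a blend (110) that was never actually reported, while the buckets on either side repeat the reported values. That resembles the pattern in the dump above, where repeated values are separated by in-between ones.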

- Adam


[1]: http://oss.oetiker.ch/rrdtool/tut/rrdtutorial.en.html#IData_Resampling




_______________________________________________
Ganglia-general mailing list
Ganglia-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-general
