Re: [Ganglia-developers] Historical Data

Jesse Becker Wed, 16 Apr 2008 15:48:17 -0700

On Wed, Apr 16, 2008 at 5:43 PM, Witham, Timothy D
<[EMAIL PROTECTED]> wrote:
>  I don't have space for it since my grids are too huge, but it would be
>  easier to just keep more detail in the RRDs which is decided at create
>  time.  I haven't yet tried it, but gmetad/conf.c implies that the data
>  retention policy could be changed in the config file (I don't see this
>  option in the man page though; is that a bug?):


This is correct.  You can change the *initial* settings for an RRD
file *when it is created*.  If you already have RRD files, changing
the settings in the config file will have no effect on them.  If you
want to change your current files, then you need to export the existing
data ('rrdtool dump'), create new files with new settings, then import
the old data ('rrdtool restore').
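
As a rough sketch of the export/import round trip, the snippet below only
builds the command lines and does not run them.  It uses 'rrdtool dump'
for the export step, since that is the XML format 'rrdtool restore'
reads.  The file names are hypothetical, and the step where the RRA
layout actually changes is not shown.

```python
import shlex


def migration_commands(old_rrd, xml_path, new_rrd):
    """Return argv lists for an rrdtool dump/restore round trip.

    Nothing is executed here; the caller decides whether to run them.
    """
    return [
        # Export the existing RRD to XML.
        ["rrdtool", "dump", old_rrd, xml_path],
        # Rebuild an RRD file from that XML.
        ["rrdtool", "restore", xml_path, new_rrd],
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for argv in migration_commands("cpu_user.rrd", "cpu_user.xml",
                                   "cpu_user.new.rrd"):
        print(shlex.join(argv))
```

You would want to stop gmetad (or at least stop updates to the file)
while doing this, so no samples land in the old file mid-migration.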


>  But maybe your parenthetical comment means you did that already but had
>  too much waitIO?  And that's why you went to a cron job?  If so, you are
>  storing the RRDs in tmpfs, right?

And read this paper:
  http://www.usenix.org/events/lisa07/tech/plonka.html

Not related to Ganglia, but it does discuss optimizing performance
with RRD files.  In short:  turn off read-ahead, and upgrade to a
recent version of rrdtool.

>  >>  Also, I am curious as to how the performance of rrdtool would be
>  >>  affected if we were to store related metrics in a single rrd file:
>  >>  e.g., we could group cpu_(user,system,idle,wio,nice) in a single
>  >>  file, which I think would reduce the resource usage of gmetad
>  >>  significantly.
>
>  I have wondered that too.  Since RRD is random access, it seems like it
>  should be at least as efficient and probably more efficient since there
>  would be less files open.  But it would be difficult to change.  Now

The cost of calling open() is fairly low, and even on huge clusters,
I'd be surprised if it were significant.  RRD files have a
short header section that stores information about the RRAs, the DSs,
and the offset of the "current" pointer for each RRA within the file.
With multiple metrics in a single file, you reduce the number of
open() calls, but increase the number of calls to seek() within a
single file.

One of the main points of the paper I mentioned above is that
read-ahead is almost entirely wasted on RRD files.  In order to
read/write a single value in an RRA, the OS will open the file, read
the header (which is short) plus many other blocks on disk because of
read-ahead settings.  Next, rrdtool must seek to the proper location
in the RRD file, read a bunch of blocks (which we don't care about),
then write the new data.   Repeat this seek/readahead/write pattern
for each RRA that needs to be updated.
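
To make the read-ahead point concrete, here is a minimal sketch of one
way a program on a POSIX system can hint that its access to a file is
random, so the kernel need not read ahead.  This is an illustration, not
what rrdtool itself does; the function name and arguments are made up
for the example.

```python
import os


def read_without_readahead(path, offset, length):
    """Read `length` bytes at `offset`, hinting random access first."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # posix_fadvise is not available on every platform, so guard it.
        if hasattr(os, "posix_fadvise"):
            # Offset 0 and length 0 apply the hint to the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
        # pread reads at an absolute offset without a separate seek.
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)
```

The hint is advisory only; the kernel is free to ignore it.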


>  each RRD is simple with DS:sum and DS:num for summaries; the metric is
>  in the filename itself.  To change, you would need to put the metric
>  names in the RRDs: DS:cpu_user_sum, etc. and I think you would have to
>  update all metrics with one rrd_update call.  Of course this would work
>  only for the standard metrics and extra metrics would still need to be
>  in their own files.  Or, perhaps with the new metric groupings, each
>  group could be an RRD file of related metrics.  And then you'd have to
>  change the PHP to understand all this...

Yep.  You could only consolidate a few sets of metrics, since not all
systems support all of them.
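
For illustration, a consolidated file along the lines described above
might be created and updated roughly like this.  The sketch only builds
'rrdtool create' and 'rrdtool update' command lines; the step,
heartbeat, and row count are placeholder values, not Ganglia's actual
defaults, and the helper names are hypothetical.

```python
def consolidated_create_args(rrd_path, metrics, step=15, heartbeat=120,
                             rows=244):
    """Build an `rrdtool create` argv with one DS per metric."""
    args = ["rrdtool", "create", rrd_path, "--step", str(step)]
    for name in metrics:
        ds_name = f"{name}_sum"
        # rrdtool limits DS names to 19 characters.
        if len(ds_name) > 19:
            raise ValueError(f"DS name too long: {ds_name}")
        args.append(f"DS:{ds_name}:GAUGE:{heartbeat}:U:U")
    # A single placeholder RRA; a real setup would likely define several.
    args.append(f"RRA:AVERAGE:0.5:1:{rows}")
    return args


def consolidated_update_args(rrd_path, values):
    """Build one `rrdtool update` argv covering every DS at once.

    `values` must be in the same order the DSs were created in.
    """
    sample = "N:" + ":".join(str(v) for v in values)
    return ["rrdtool", "update", rrd_path, sample]


if __name__ == "__main__":
    cpu = ["cpu_user", "cpu_system", "cpu_idle", "cpu_wio", "cpu_nice"]
    print(" ".join(consolidated_create_args("cpu.rrd", cpu)))
    print(" ".join(consolidated_update_args("cpu.rrd", [3, 1, 95, 1, 0])))
```

Note that the single update call has to carry a value for every DS in
the file, which is why a host that lacks one of the metrics complicates
things.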

However, ideas for improving the FE are welcome.  Once we get 3.1 (or
3.2) out the door, I'd like to work on a new FE, perhaps with things
like consolidated RRD files in mind.


-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
