Actually, I just updated gmetad to allow custom RRAs to be defined. I just dropped the code into CVS, so if you use the CVS code (which will be released as 3.0.1 very soon) you can specify

RRAs "RRA:AVERAGE:0.5:1:240" \
     "RRA:AVERAGE:0.5:24:240" \
     "RRA:AVERAGE:0.5:168:240" \
     "RRA:AVERAGE:0.5:672:240" \
     "RRA:AVERAGE:0.5:5760:370"

in gmetad.conf to alter the round-robin archive format. This was a simple feature to add, and I know it's in big demand ... no sense waiting until later to add it.
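(For anyone unfamiliar with the syntax: each archive is RRA:<consolidation function>:<xff>:<steps per consolidated point>:<rows kept>. Assuming the default 15-second polling step, the example above keeps about an hour of raw samples and then progressively coarser averages out to roughly a year, but you can plug in whatever granularity and retention you want.)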

Forget everything that I wrote below ... just use CVS for now or wait for 3.0.1. :)

-matt


Matt Massie wrote:
Here is an idea you might try.

All the RRD code is in ./gmetad/rrd_helpers.c.

The function for creating RRDs is RRD_create(). You can alter the format of the round-robin archives there without breaking compatibility (an upcoming version of gmetad will let you specify the archives in the configuration file).

It's important that you do not change any line starting with "DS" (data source), since that _will_ break compatibility.

You could change your round-robin granularity there.
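In case it helps, here is a rough sketch of the idea (not the actual gmetad code, which does more error handling and also handles summary rrds): RRD_create() essentially builds an argv of DS and RRA strings and hands it to librrd's rrd_create(), so the archive layout is just a matter of which RRA strings you put in.

/* sketch only -- assumes librrd's rrd_create(int argc, char **argv) */
#include <stdio.h>
#include <unistd.h>   /* optind, opterr */
#include <rrd.h>

static int create_rrd_sketch(const char *file, const char *ds, unsigned int step)
{
   char step_str[16];
   const char *argv[16];
   int argc = 0;

   snprintf(step_str, sizeof(step_str), "%u", step);

   argv[argc++] = "create";
   argv[argc++] = "--step";
   argv[argc++] = step_str;
   argv[argc++] = file;
   argv[argc++] = ds;   /* the "DS:..." line -- do NOT touch this one */

   /* edit these (and only these) to change granularity/retention */
   argv[argc++] = "RRA:AVERAGE:0.5:1:240";
   argv[argc++] = "RRA:AVERAGE:0.5:24:240";
   argv[argc++] = "RRA:AVERAGE:0.5:168:240";
   argv[argc++] = "RRA:AVERAGE:0.5:672:240";
   argv[argc++] = "RRA:AVERAGE:0.5:5760:370";

   optind = 0;   /* librrd parses argv with getopt, so reset it */
   opterr = 0;
   return rrd_create(argc, (char **)argv);
}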

You could have rrdtool save all the raw data points for a week (for example). Then you just need a cron job that copies the database to another location once a week. For example...

cp -r /my/gmetad/data/root /my/gmetad_archive/`date`/
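As a cron job that could look something like this (crontab entry; the paths are placeholders, and % has to be escaped because cron treats it specially):

# archive the rrd tree every Sunday at 03:00
0 3 * * 0  cp -r /my/gmetad/data/root /my/gmetad_archive/`date +\%Y\%m\%d`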

You would then need to write a few simple scripts using rrdtool to query the data for a particular time period.
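Plain rrdtool fetch gets you most of the way, e.g. to pull the averaged values for one metric over a given window (the path and dates are just placeholders; gmetad lays the files out as <rrd root>/<cluster>/<host>/<metric>.rrd):

rrdtool fetch /my/gmetad_archive/20050109/MyCluster/node01/load_one.rrd \
        AVERAGE --start 20050101 --end 20050108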

-matt




Ramon Bastiaans wrote:

Jason A. Smith wrote:

If you really want long-term storage of the raw or nearly raw data, then
rrdtool is probably not the right tool to use.  You would be better off
writing your own ganglia frontend client that would collect the XML data
from gmetad at the interval you need, parse it, and store it in some
other database or archive.  This could also be done from another
computer, so it would have a negligible impact on the gmetad host.
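(For what it's worth, gmetad publishes its full XML dump on a plain TCP port, 8651 by default, so the collection step can be as simple as something like

  nc my-gmetad-host 8651 > snapshot.xml

run from cron at whatever interval you need, with a small parser feeding the database of your choice; the host name and file name here are just placeholders.)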

~Jason
I have thought about this too.

The problem with this is that if I go to something SQL-ish or similar, I will have to store about 25 billion rows, because I want to keep roughly a year's worth of metrics at the detailed view, i.e. a new value every 15 seconds per host per metric: 43 metrics * 275 hosts * (one year / 15-second samples, about 2.1 million) comes to roughly 25 billion rows.

I am already having nightmares about working with an SQL database with 25+ billion rows; I doubt it will ever work on the hardware I have available for the project.

It would almost be more usable (performance- and storage-wise) to just write additional .rrd files in the same manner gmetad does, and perhaps use a ramdisk for this.

I agree an SQL database would be much more desirable; however, I am very tempted to just write a tool that grabs the XML and stores it in additional RRDs. It does run counter to the whole concept of round-robin databases to use them for archiving data, though.

If you have a good idea or suggestion on how to store this amount of data efficiently, without needing an extra cluster just to store and use the values, I would love to hear it.

- Ramon.





--
PGP fingerprint 'A7C2 3C2F 8445 AD3C 135E F40B 242A 5984 ACBC 91D3'

   They that can give up essential liberty to obtain a little
      temporary safety deserve neither liberty nor safety.
  --Benjamin Franklin, Historical Review of Pennsylvania, 1759
