Witham, Timothy D wrote:
>> Please have a look at this patch, perhaps it'll help with your
>> endeavor:
>> http://bugzilla.ganglia.info/cgi-bin/bugzilla/show_bug.cgi?id=176
> It does look interesting.

I really like that patch too. :-)

>> On Wed, Apr 16, 2008 at 2:01 PM, Rich Paul <[EMAIL PROTECTED]> wrote:
>>> I've been hacking on ganglia, to add the ability to access highly
>>> granular historical data.  This consists of a new php page, which shows
>>> a graph for 1 attribute on 1 host at the time of your choice, a script
>>> which runs as a cronjob in order to copy data from
>>> /var/lib/ganglia/rrds/**/*.rrd to /var/lib/ganglia/hist/**/*.rrd (I
>>> found that when saving a month of data at 4 samples per minute, my
>>> system was spending much time waiting for IO), and some hacks to
>>> graph.php.  Has anybody else played with the ability to look at
>>> arbitrary hours rather than just the most recent hour?
>
> I don't have space for it since my grids are too huge, but it would be
> easier to just keep more detail in the RRDs, which is decided at create
> time.  I haven't yet tried it, but gmetad/conf.c implies that the data
> retention policy could be changed in the config file (I don't see this
> option in the man page though; is that a bug?):
>
>     config->RRAs[0] = "RRA:AVERAGE:0.5:1:244";
>     config->RRAs[1] = "RRA:AVERAGE:0.5:24:244";
>     config->RRAs[2] = "RRA:AVERAGE:0.5:168:244";
>     config->RRAs[3] = "RRA:AVERAGE:0.5:672:244";
>     config->RRAs[4] = "RRA:AVERAGE:0.5:5760:374";
>
> Basically, you would just want to crank up that 244 number for the first
> line or two.  See rrdcreate(1) for details.  This would then store the
> detail you want at the cost of increased RRD file size.  I have been
> thinking of doing the opposite: adding another line for less detailed
> averages beyond a year.
>
> But maybe your parenthetical comment means you did that already but had
> too much waitIO?  And that's why you went to a cron job?
> If so, you are storing the RRDs in tmpfs, right?

I was having a problem with too much waitIO, which is why I switched to
batch processing to move half-hour chunks from rrds/**/metric.rrd to
hist/**/metric.rrd.
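To put numbers on those RRA lines: assuming the usual 15-second step (the 4 samples per minute Rich mentions), the retention window of each RRA is just steps-per-row times step times row count. A quick sketch of the arithmetic, including how many rows the first RRA would need to keep a full month at full resolution:

```python
# Rough retention math for the RRA definitions quoted above.
# STEP is an assumption: the usual 15-second gmetad step (4 samples/min).
STEP = 15  # seconds per primary data point

rras = [
    (1,    244),   # RRA:AVERAGE:0.5:1:244
    (24,   244),   # RRA:AVERAGE:0.5:24:244
    (168,  244),   # RRA:AVERAGE:0.5:168:244
    (672,  244),   # RRA:AVERAGE:0.5:672:244
    (5760, 374),   # RRA:AVERAGE:0.5:5760:374
]

for steps_per_row, rows in rras:
    row_seconds = steps_per_row * STEP
    kept = row_seconds * rows
    print(f"1 row = {row_seconds:>6}s, kept for ~{kept / 86400:.1f} days")

# "Cranking up that 244": to keep a 31-day month at 15-second resolution,
# the first RRA would need this many rows instead:
month_rows = 31 * 24 * 60 * 60 // STEP
print(month_rows)  # i.e. RRA:AVERAGE:0.5:1:178560
```

So the first RRA as shipped covers only about an hour of full-detail data, which is why keeping a month of it inflates the RRD files (and the write load) so much.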
I don't store the stuff in tmpfs, for a couple of reasons.  One is that
I want to lose as little data as possible in the event of a system
failure or reboot.  I also don't want to hog the memory on a server
which serves our development environment in several other capacities.

>>> Also, I am curious as to how the performance of rrdtool would be
>>> affected if we were to store related metrics in a single rrd file:
>>> e.g., we could group cpu_(user,system,idle,wio,nice) in a single file,
>>> which I think would reduce the resource usage of gmetad significantly.
>
> I have wondered that too.  Since RRD is random access, it seems like it
> should be at least as efficient, and probably more efficient since there
> would be fewer files open.  But it would be difficult to change.  Now
> each RRD is simple, with DS:sum and DS:num for summaries; the metric is
> in the filename itself.  To change, you would need to put the metric
> names in the RRDs: DS:cpu_user_sum, etc., and I think you would have to
> update all metrics with one rrd_update call.  Of course this would work
> only for the standard metrics, and extra metrics would still need to be
> in their own files.  Or, perhaps with the new metric groupings, each
> group could be an RRD file of related metrics.  And then you'd have to
> change the PHP to understand all this...

I think you're right on these points.  Probably the only metrics for
which I would be interested in doing this would be the 3 load metrics,
the 5-ish cpu metrics, the 5-ish memory metrics, and the 4-ish network
metrics.  As a matter of taste, I probably would only group metrics
which were in common units (except for the network metrics).

I'm not sure how it would affect the speed, because I don't know whether
rrdtool stores multiple data sources like parallel arrays or like an
array of structs.  In the former case, updating a row of 5 values would
probably dirty 5 sectors in the buffer cache, so there would probably be
little gain.
In the latter case, I suspect one could usually update 5 values while
only dirtying 1 sector in the buffer cache, and you would have a pretty
good win.

> I would guess it was designed with 1 metric in each RRD file to make it
> more flexible in adding/removing metrics, and to make the code simpler.
>
> -twitham

I suspect so as well.  Changing it would be (AFAICT) pretty simple on
the PHP side (mostly only changing graph.php); I'm not sure how simple
it would be in gmetad and gmond.  I think the hardest part would be
convincing gmond to batch grouped metrics into a single message, so that
gmetad is not left receiving metrics one at a time and caching them
until it collects the whole set.  Then again, I played with the .x file
to write a proxy for gmond, and it seemed like a pretty easy config to
hack upon.

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers
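For concreteness, the grouped-CPU idea discussed in the thread might look like the following. This is speculation only, nothing gmetad or gmond does today: the DS names match the standard ganglia CPU metrics, but the step, heartbeat, and single-file layout are my guesses. The sketch just builds the rrdtool command strings (so it runs without rrdtool installed); note that one `rrdtool update` with a `--template`/`-t` DS list can carry all five values in a single call, which is the batching Witham describes.

```python
# Hypothetical: one RRD holding the five standard ganglia CPU metrics.
# Step 15s matches the thread's 4-samples-per-minute; GAUGE type and the
# 120s heartbeat are assumptions for illustration.
metrics = ["cpu_user", "cpu_system", "cpu_idle", "cpu_wio", "cpu_nice"]

create_cmd = (
    "rrdtool create cpu.rrd --step 15 "
    + " ".join(f"DS:{m}:GAUGE:120:U:U" for m in metrics)
    + " RRA:AVERAGE:0.5:1:244"
)

# One update call then carries a whole row of related values at once,
# using -t (--template) to name the data sources in order:
values = {"cpu_user": 12.0, "cpu_system": 3.5, "cpu_idle": 80.0,
          "cpu_wio": 4.0, "cpu_nice": 0.5}
update_cmd = (
    "rrdtool update cpu.rrd -t "
    + ":".join(metrics)
    + " N:" + ":".join(str(values[m]) for m in metrics)
)

print(create_cmd)
print(update_cmd)
```

Five doubles per row is only 40 bytes, so if rrdtool lays a row's data sources out contiguously (the array-of-structs case above), the whole update would indeed tend to touch a single sector.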