On Wed, Apr 16, 2008 at 5:43 PM, Witham, Timothy D <[EMAIL PROTECTED]> wrote:
> I don't have space for it since my grids are too huge, but it would be
> easier to just keep more detail in the RRDs which is decided at create
> time.  I haven't yet tried it, but gmetad/conf.c implies that the data
> retention policy could be changed in the config file (I don't see this
> option in the man page though; is that a bug?):

This is correct.  You can change the *initial* settings for an RRD
file *when it is created*.  If you already have RRD files, changing
the settings in the config file will have no effect.  If you want to
change your current files, then you need to export the existing data
('rrdtool dump'), create new files with new settings, then import the
old data ('rrdtool restore').

> But maybe your parenthetical comment means you did that already but had
> too much waitIO?  And that's why you went to a cron job?  If so, you are
> storing the RRDs in tmpfs, right?

And read this paper:
http://www.usenix.org/events/lisa07/tech/plonka.html

Not related to Ganglia, but it does discuss optimizing performance
with RRD files.  In short: turn off read-ahead, and upgrade to a
recent version of rrdtool.

> >> Also, I am curious as to how the performance of rrdtool would be
> >> affected if we were to store related metrics in a single rrd file:
> >> e.g., we could group cpu_(user,system,idle,wio,nice) in a single file,
> >> which I think would reduce the resource usage of gmetad significantly.
>
> I have wondered that too.  Since RRD is random access, it seems like it
> should be at least as efficient and probably more efficient since there
> would be less files open.  But it would be difficult to change.  Now

The cost of calling open() is fairly low, and even on huge clusters,
I'd be surprised if it were a significant factor.  RRD files have a
short header section that stores information about the RRAs, the DSs,
and the offsets of the "current" pointer for each RRA within the file.
With multiple metrics in a single file, you reduce the number of
open() calls, but increase the number of calls to seek() within a
single file.  One of the main points of the paper I mentioned above is
that read-ahead is almost entirely wasted on RRD files.
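
As a rough illustration of what "turn off read-ahead" can look like
for this kind of access, here is a minimal Python sketch.  The file
layout (a 64-byte header followed by 8-byte slots) and the helper names
are made up for the example and are not the real RRD format or
rrdtool's code.  It assumes a POSIX system; posix_fadvise() with
POSIX_FADV_RANDOM is the standard hint that a file will be accessed
randomly, and the sketch skips it where it is unavailable.

```python
import os
import struct
import tempfile

HEADER_SIZE = 64   # hypothetical header size, NOT the real RRD layout
SLOT_SIZE = 8      # one little-endian double per slot (assumption)


def make_file(slots):
    """Create a zero-filled file with a fake header and `slots` slots."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * (HEADER_SIZE + slots * SLOT_SIZE))
    return path


def update_slot(path, slot, value):
    """Read the header, then write one value at its offset."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Hint that access is random so the kernel need not read ahead.
        # Not every platform or filesystem supports this hint.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_RANDOM)
            except OSError:
                pass
        os.pread(fd, HEADER_SIZE, 0)                 # short header read
        offset = HEADER_SIZE + slot * SLOT_SIZE      # "seek" target
        os.pwrite(fd, struct.pack("<d", value), offset)
    finally:
        os.close(fd)


def read_slot(path, slot):
    """Read back the double stored in `slot`."""
    with open(path, "rb") as f:
        f.seek(HEADER_SIZE + slot * SLOT_SIZE)
        return struct.unpack("<d", f.read(SLOT_SIZE))[0]
```

Each update touches only the header and one small region far into the
file, which is why blocks pulled in by read-ahead go unused.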

In order to read/write a single value in an RRA, the OS will open the
file, read the header (which is short) plus many other blocks on disk
because of read-ahead settings.  Next, rrdtool must seek to the proper
location in the RRD file, read a bunch of blocks (which we don't care
about), then write the new data.  Repeat this seek/readahead/write
pattern for each RRA that needs to be updated.

> each RRD is simple with DS:sum and DS:num for summaries; the metric is
> in the filename itself.  To change, you would need to put the metric
> names in the RRDs: DS:cpu_user_sum, etc. and I think you would have to
> update all metrics with one rrd_update call.  Of course this would work
> only for the standard metrics and extra metrics would still need to be
> in their own files.  Or, perhaps with the new metric groupings, each
> group could be an RRD file of related metrics.  And then you'd have to
> change the PHP to understand all this...

Yep.  You could only consolidate a few sets of metrics, since not all
systems support all of them.  However, ideas for improving the FE are
welcome.  Once we get 3.1 (or 3.2) out the door, I'd like to work on a
new FE, perhaps with things like consolidated RRD files in mind.

-- 
Jesse Becker
GPG Fingerprint -- BD00 7AA4 4483 AFCC 82D0 2720 0083 0931 9A2B 06A2

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers