It should be possible to integrate both approaches in the library. Rate calculations are a layer on top of the raw counters. One approach would be to define a raw counter function for each metric, e.g. pkts_in_count_func(void). You could then re-implement pkts_in_func(void) to make use of pkts_in_count_func(void). You could even generalize a rate function along the lines of rate_func(*counterfunction). There are also good reasons to avoid rate calculations at the sender end altogether and leave them to the receiver, which you might want to consider if you were re-architecting the basic metrics. If you took this approach, you could dramatically simplify the library.
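To make the layering concrete, here is a minimal sketch in C of a generic rate function built on top of a raw counter function. It is only an illustration, not libmetrics code: the names counter_func, rate_state, rate_init and demo_pkts_in are hypothetical, pkts_in_count_func is backed by a stand-in variable rather than a real OS read, and the caller is assumed to supply the sample time in seconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A raw counter function returns a monotonically increasing count. */
typedef uint64_t (*counter_func)(void);

/* State remembered between two polls of one counter (hypothetical type). */
struct rate_state {
    counter_func counter;  /* raw counter to sample */
    uint64_t last_value;   /* counter value at the previous poll */
    double last_time;      /* time of the previous poll, in seconds */
    int primed;            /* 0 until the first sample has been taken */
};

/* Stand-in raw counter for illustration; a real one would query the OS. */
static uint64_t demo_pkts_in = 0;

uint64_t pkts_in_count_func(void)
{
    return demo_pkts_in;
}

/* Bind a rate calculation to one raw counter. */
void rate_init(struct rate_state *st, counter_func counter)
{
    assert(st != NULL && counter != NULL);
    st->counter = counter;
    st->last_value = 0;
    st->last_time = 0.0;
    st->primed = 0;
}

/*
 * Sample the counter at time `now` and return the per-second rate since
 * the previous sample. Returns 0.0 on the first call, or if no time has
 * passed since the previous sample.
 */
double rate_func(struct rate_state *st, double now)
{
    uint64_t value;
    double rate = 0.0;

    assert(st != NULL);
    value = st->counter();
    if (st->primed && now > st->last_time) {
        /* Unsigned subtraction still yields the correct delta if the
         * 64-bit counter wraps once between samples. */
        uint64_t delta = value - st->last_value;
        rate = (double)delta / (now - st->last_time);
    }
    st->last_value = value;
    st->last_time = now;
    st->primed = 1;
    return rate;
}
```

With something like this, a rate-style pkts_in_func could be a thin wrapper that calls rate_func with the current time, while an sFlow agent would call pkts_in_count_func directly and leave the rate calculation to the analyzer.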
Another difference is that sFlow doesn't deal in individual values, but in standard structures containing blocks of metrics. For example, for the network counters, you might have a structure containing pkts_in, pkts_out, bytes_in, bytes_out, discards_in, discards_out, errors_in, errors_out. We would have a function get_network_counters(*counter_struct) that populates the block in a single operation. You could implement this as a set of calls to get the individual counters, but you risk skewing the counters in the block, since each would be read at a slightly different time (and in practice each family of counters can often be retrieved in a single operation, reading a single /proc file for example).

It looks like Ganglia also exports its standard metrics in blocks: the standard RRD charts contain well-defined sets of counters. Changing the library to retrieve counter blocks (cpu, memory, network) would reduce the number of functions in the library and simplify the functions in the case where the counter block can be retrieved from the OS atomically.

If this approach makes sense, perhaps we could work together to agree on a standard set of structures to export as XDR-encoded counter blocks, since both Ganglia and sFlow share this simple mechanism. I started a discussion on the sFlow.org mailing list to try to settle a set of standard system performance counter blocks. If anyone from the Ganglia project is interested in participating, their input would be welcome.

http://www.sflow.org/discussion/index.php
http://www.sflow.org/sflow-discussion/

Shared structures would make it easy to use sFlow as a feed for gmond, or for an sFlow agent running in a Ganglia cluster to bridge counters from all the hosts in the cluster to sFlow.

On Mar 12, 2010, at 2:35 AM, Daniel Pocock wrote:

> Peter Phaal wrote:
>> We have had time to take a closer look at libmetrics and while it is a
>> useful reference, there are significant architectural differences between
>> sFlow's approach to counter polling and Ganglia's.
>> The libmetrics library exports rates rather than counters. An sFlow agent
>> exports raw counters and leaves it to the sFlow analyzer to calculate rates.
>> While we would aim to be able to generate the same metrics on the analyzer,
>> an sFlow agent won't be able to use many of the libmetrics calls.
>>
> Two issues come to mind:
>
> a) do you believe that a generalised libmetrics supporting both counters
> and rates would be useful, and potentially easier to achieve than
> building your own solution from the ground up?
>
> b) there is no hard rule to say that the current approach will remain
> consistent over major releases of Ganglia

_______________________________________________
Ganglia-developers mailing list
Ganglia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/ganglia-developers