It should be possible to integrate both approaches in the library. Rate 
calculations are a layer on top of the raw counters. One approach would be to 
define raw counter functions for each metric, e.g. pkts_in_count_func(void). 
You could then re-implement pkts_in_func(void) to make use of the 
pkts_in_count_func(void) function. You could even generalize a rate function 
along the lines of rate_funct(*counterfunction). There are also good reasons 
to avoid rate calculations at the sender and leave them to the receiver, which 
you might want to consider if you were re-architecting the basic metrics. 
Taking that approach could dramatically simplify the library.
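
To make the layering concrete, here is a rough sketch of a generic rate 
function built on a raw counter function. The counter_func typedef, the 
rate_state struct, the explicit "now" argument and the fake_pkts_in stand-in 
are all assumptions for illustration; none of this is existing libmetrics code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical raw-counter accessor: returns a monotonically
 * increasing count, in the style of pkts_in_count_func(void). */
typedef uint64_t (*counter_func)(void);

/* State kept between polls so a rate can be derived from two samples. */
struct rate_state {
    uint64_t last_count;
    double   last_time;   /* seconds */
    int      primed;      /* 0 until the first sample has been taken */
};

/* Generic rate layer: delta(counter) / delta(time).
 * Returns 0.0 on the first call, when there is no previous sample. */
double rate_funct(counter_func counter, struct rate_state *st, double now)
{
    assert(counter != NULL && st != NULL);

    uint64_t count = counter();
    double rate = 0.0;

    if (st->primed && now > st->last_time) {
        /* Unsigned subtraction stays correct across a single wrap. */
        rate = (double)(count - st->last_count) / (now - st->last_time);
    }
    st->last_count = count;
    st->last_time = now;
    st->primed = 1;
    return rate;
}

/* Stand-in counter for illustration; a real one would read the OS. */
static uint64_t fake_pkts_in = 0;

uint64_t pkts_in_count_func(void)
{
    return fake_pkts_in;
}
```

With this shape, an agent that only needs raw counters calls 
pkts_in_count_func() directly, and a rate metric is just rate_funct() applied 
to that same function with its own rate_state.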

Another difference is that sFlow doesn't deal in individual values, but in 
standard structures containing blocks of metrics. For example, for the network 
counters, you might have a structure containing pkts_in, pkts_out, bytes_in, 
bytes_out, discards_in, discards_out, errors_in, errors_out. We would have a 
function get_network_counters(*counter_struct) that populates the block in a 
single operation. You could implement this as a set of calls to get the 
individual counters, but you risk skewing the counters in the block (and in 
practice each family of counters can often be retrieved in a single operation, 
reading a single /proc file for example).
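
As a sketch of the block approach, the code below fills a network counter 
structure from one pass over Linux's /proc/net/dev. The struct layout, the 
parse_net_dev_line() helper and the choice to sum over every interface 
(loopback included) are assumptions for illustration, not an agreed sFlow or 
Ganglia definition.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical counter block; field names follow the list above. */
struct network_counters {
    uint64_t bytes_in, pkts_in, errors_in, discards_in;
    uint64_t bytes_out, pkts_out, errors_out, discards_out;
};

/* Parse one interface line in Linux /proc/net/dev layout:
 * "iface: 8 receive fields, then 8 transmit fields".
 * Returns 0 on success, -1 for header or malformed lines. */
int parse_net_dev_line(const char *line, struct network_counters *c)
{
    assert(line != NULL && c != NULL);

    const char *p = strchr(line, ':');
    if (p == NULL)
        return -1;               /* header lines carry no "iface:" prefix */
    p++;

    uint64_t v[16];
    for (int i = 0; i < 16; i++) {
        char *end;
        v[i] = strtoull(p, &end, 10);
        if (end == p)
            return -1;           /* fewer than 16 numeric fields */
        p = end;
    }
    c->bytes_in  = v[0];  c->pkts_in  = v[1];
    c->errors_in = v[2];  c->discards_in = v[3];
    c->bytes_out  = v[8];  c->pkts_out  = v[9];
    c->errors_out = v[10]; c->discards_out = v[11];
    return 0;
}

/* Populate the whole block from a single read of /proc/net/dev,
 * summing over all interfaces. Returns 0 on success, -1 on error. */
int get_network_counters(struct network_counters *total)
{
    assert(total != NULL);

    FILE *f = fopen("/proc/net/dev", "r");
    if (f == NULL)
        return -1;

    memset(total, 0, sizeof *total);
    char line[512];
    struct network_counters c;
    while (fgets(line, sizeof line, f) != NULL) {
        if (parse_net_dev_line(line, &c) != 0)
            continue;
        total->bytes_in  += c.bytes_in;   total->pkts_in  += c.pkts_in;
        total->errors_in += c.errors_in;  total->discards_in += c.discards_in;
        total->bytes_out  += c.bytes_out;  total->pkts_out  += c.pkts_out;
        total->errors_out += c.errors_out; total->discards_out += c.discards_out;
    }
    fclose(f);
    return 0;
}
```

Because every field comes from the same read of the file, the counters in the 
block are taken at effectively the same instant, which is the skew concern 
raised above.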

It looks like Ganglia also exports its standard metrics in blocks. The standard 
RRD charts contain well-defined sets of counters. Changing the library to 
retrieve counter blocks (cpu, memory, network) would reduce the number of 
functions in the library and simplify the functions in the case where the 
counter block can be retrieved from the OS atomically.

If this approach makes sense, perhaps we could work together to agree on a 
standard set of structures to export as XDR-encoded counter blocks, since both 
Ganglia and sFlow share this simple mechanism. I started a discussion on the 
sFlow.org mailing list to try to settle on a set of standard system 
performance counter blocks. If anyone from the Ganglia project is interested 
in participating, their input would be welcome.
http://www.sflow.org/discussion/index.php
http://www.sflow.org/sflow-discussion/
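
To give a feel for what an XDR-encoded counter block might look like on the 
wire, here is a minimal sketch. XDR encodes an unsigned hyper as 8 bytes, most 
significant byte first; the struct, its field order and the choice of 64-bit 
fields throughout are placeholders for illustration, not a proposed standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter block; the layout is illustrative only. */
struct network_counters {
    uint64_t bytes_in, pkts_in, errors_in, discards_in;
    uint64_t bytes_out, pkts_out, errors_out, discards_out;
};

#define NETWORK_COUNTERS_XDR_LEN (8 * 8)   /* eight unsigned hypers */

/* Write one XDR unsigned hyper (8 bytes, big-endian). Returns bytes written. */
static size_t xdr_put_u64(uint8_t *buf, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        buf[i] = (uint8_t)(v >> (56 - 8 * i));
    return 8;
}

/* Encode the block into buf. Returns the encoded length,
 * or 0 if the buffer is too small. */
size_t encode_network_counters(const struct network_counters *c,
                               uint8_t *buf, size_t len)
{
    assert(c != NULL && buf != NULL);
    if (len < NETWORK_COUNTERS_XDR_LEN)
        return 0;

    size_t off = 0;
    off += xdr_put_u64(buf + off, c->bytes_in);
    off += xdr_put_u64(buf + off, c->pkts_in);
    off += xdr_put_u64(buf + off, c->errors_in);
    off += xdr_put_u64(buf + off, c->discards_in);
    off += xdr_put_u64(buf + off, c->bytes_out);
    off += xdr_put_u64(buf + off, c->pkts_out);
    off += xdr_put_u64(buf + off, c->errors_out);
    off += xdr_put_u64(buf + off, c->discards_out);
    return off;
}
```

Both sides only need to agree on the field order and widths; the encoding 
itself is independent of host byte order.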

Shared structures would make it easy to use sFlow as a feed for gmond, or for 
an sFlow agent running in a Ganglia cluster to bridge counters from all the 
hosts in the cluster to sFlow.

On Mar 12, 2010, at 2:35 AM, Daniel Pocock wrote:

> Peter Phaal wrote:
>> We have had time to take a closer look at libmetrics and while it is a 
>> useful reference, there are significant architectural differences between 
>> sFlow's approach to counter polling and Ganglia's. The libmetrics library 
>> exports rates rather than counters. An sFlow agent exports raw counters and 
>> leaves it to the sFlow analyzer to calculate rates. While we would aim to be 
>> able to generate the same metrics on the analyzer, an sFlow agent won't be 
>> able to use many of the libmetrics calls.
>> 
> Two issues come to mind:
> 
> a) do you believe that a generalised libmetrics supporting both counters
> and rates would be useful, and potentially easier to achieve than
> building your own solution from the ground up?
> 
> b) there is no hard rule to say that the current approach will remain
> consistent over major releases of Ganglia

