On Thu, Nov 3, 2011 at 2:12 PM, Grant Ingersoll <gsing...@apache.org> wrote:
> ...
> So, to deploy a Combiner, we need to understand what type of distribution
> we are dealing with (which we do know, but may need a marker interface or
> something). Then, if it is a "Combinable" distribution, it can do above
> (which I admittedly need to work through a bit more)? Do we have Welford
> implemented in our math package?
>

Yes. A marker would be required. The key method would be add(Model).

We do have a Welford implementation for one-dimensional estimates in
org.apache.mahout.math.stats.OnlineSummarizer

> Rough pseudo code would be really helpful as to what the Combiner might
> look like. I'll worry about the math later.
>

Here is a sketch:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.mapreduce.Reducer;

    // Sketch only: assumes Model is a Writable that exposes add(Model).
    public class ModelCombiner extends Reducer<IntWritable, Model, IntWritable, Model> {
      @Override
      protected void reduce(IntWritable key, Iterable<Model> values, Context ctx)
          throws IOException, InterruptedException {
        Model m = null;
        for (Model x : values) {
          if (m == null) {
            // Hadoop may reuse the value object between iterations, so a
            // real implementation should copy x here instead of keeping it.
            m = x;
          } else {
            m.add(x);
          }
        }
        ctx.write(key, m);
      }
    }
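
To make the add(Model) idea a bit more concrete, here is a rough, self-contained sketch of what a combinable one-dimensional summary might look like. It is not Mahout's API: the class name RunningStats and all of its methods are hypothetical, and it is not how OnlineSummarizer is written. It uses Welford's update for single observations and the standard pairwise merge for combining two partial summaries, which is the operation a Combiner would need.

```java
// Hypothetical illustration only; names are assumptions, not Mahout classes.
final class RunningStats {
    private long n;       // number of observations seen so far
    private double mean;  // running mean
    private double m2;    // running sum of squared deviations from the mean

    // Welford's update for a single observation.
    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    // Merge another partial summary into this one (pairwise merge).
    void add(RunningStats other) {
        if (other.n == 0) {
            return;
        }
        if (n == 0) {
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return;
        }
        long total = n + other.n;
        double delta = other.mean - mean;
        mean += delta * other.n / total;
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        n = total;
    }

    long count() { return n; }

    double mean() { return mean; }

    // Sample variance; defined as 0 when fewer than two observations exist.
    double variance() { return n > 1 ? m2 / (n - 1) : 0.0; }
}
```

Because the merge depends only on the count, mean, and sum of squared deviations of each side, partial summaries built on different mappers can be folded together in any order, which is the property the marker interface would be advertising.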