On Thu, Nov 3, 2011 at 2:12 PM, Grant Ingersoll <[email protected]> wrote:
> ...
> So, to deploy a Combiner, we need to understand what type of distribution
> we are dealing with (which we do know, but may need a marker interface or
> something). Then, if it is a "Combinable" distribution, it can do the above
> (which I admittedly need to work through a bit more)? Do we have Welford
> implemented in our math package?
>
Yes. A marker would be required.
The key method would be add(Model).
We do have a Welford implementation for one-dimensional estimates
in org.apache.mahout.math.stats.OnlineSummarizer.
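
To make the add(Model) idea concrete, here is a minimal, self-contained sketch of a one-dimensional model whose partial results can be merged. The names Combinable, RunningStats and observe are made up for illustration, and this is not the code in OnlineSummarizer. The single-sample update is Welford's; the merge step uses the pairwise formula usually attributed to Chan et al., which is my assumption about what a combinable version would need.

```java
/** Hypothetical marker for models whose partial results can be merged. */
interface Combinable<T> {
  void add(T other);
}

/** Hypothetical one-dimensional model tracking count, mean and variance. */
final class RunningStats implements Combinable<RunningStats> {
  private long n;
  private double mean;
  private double m2;  // sum of squared deviations from the current mean

  /** Welford's update for a single observation. */
  void observe(double x) {
    n++;
    double delta = x - mean;
    mean += delta / n;
    m2 += delta * (x - mean);
  }

  /** Merges another partial result into this one (pairwise update). */
  @Override
  public void add(RunningStats other) {
    if (other.n == 0) {
      return;  // nothing to merge
    }
    if (n == 0) {
      // This side is empty, so just take over the other side's state.
      n = other.n;
      mean = other.mean;
      m2 = other.m2;
      return;
    }
    long total = n + other.n;
    double delta = other.mean - mean;
    // Both updates must use the old n, so n is assigned last.
    m2 += other.m2 + delta * delta * ((double) n * other.n / total);
    mean += delta * other.n / total;
    n = total;
  }

  long count() { return n; }
  double mean() { return mean; }
  /** Sample variance; defined as 0 for fewer than two observations. */
  double variance() { return n > 1 ? m2 / (n - 1) : 0.0; }
}
```

With something like this, each mapper-side partial can be built with observe() and the combiner only ever needs to call add(), so the result should not depend on how the input was split, up to floating-point rounding.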
> Rough pseudo code would be really helpful as to what the Combiner might
> look like. I'll worry about the math later.
>
Here is a sketch:
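import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;

// Model is assumed to be a Writable with an add(Model) method.
public class ModelCombiner extends Reducer<IntWritable, Model, IntWritable, Model> {
  @Override
  protected void reduce(IntWritable key, Iterable<Model> values, Context ctx)
      throws IOException, InterruptedException {
    Model m = null;
    for (Model x : values) {
      if (m == null) {
        // Hadoop may reuse the value object between iterations, so real
        // code should copy x here rather than keep a reference to it.
        m = x;
      } else {
        m.add(x);
      }
    }
    ctx.write(key, m);
  }
}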