On Tue, Mar 8, 2011 at 3:26 PM, Sean Owen <[email protected]> wrote:
> Looks interesting -- it looks like a specialization for iterative

> Hadoop is, in the end, a tool that was never conceived for general
> distributed computation. But among frameworks it's (relatively) well
> understood and available. It seems like Mahout has taken on the
> mission of delivering something that works on the framework that's out
> there now, which is a practical rather than theoretically-motivated
> goal. (I think it's a good goal too.) I see that as a difference from
> many research-oriented projects.
>

At the last HUG they presented plans (preliminary alpha ETA: summer) to
separate the task management substrate from the application substrate. I.e.,
once task allocation and data/rack affinity are refactored into a
standalone concern,
you can run MR, or even MPI, or whatever distributed data flow your
heart desires.

That's IMO good news for stuff like mahout-math. A lot of the time
matrix jobs require something
that is currently emulated by map-only passes, or that has to resort to
a reduce phase when all it really needs
is a sequential merge without the sort component.
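
To make the "merge without sort" point concrete, here is a rough sketch in plain Python (not Hadoop or Mahout code). It assumes each upstream task has already written its partial matrix rows ordered by row index, so combining them only needs a streaming k-way merge rather than a full shuffle/sort. The name merge_partial_rows and the (row_index, vector) record layout are made up for illustration.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_partial_rows(*streams):
    """Merge streams of (row_index, vector) pairs and sum vectors per row.

    Assumption: every input stream is already ordered by row_index, so a
    sequential k-way merge is enough and no global sort is performed.
    """
    # heapq.merge lazily interleaves the pre-sorted streams by row index.
    merged = heapq.merge(*streams, key=itemgetter(0))
    # Equal row indices are now adjacent, so groupby can combine them.
    for row, group in groupby(merged, key=itemgetter(0)):
        total = None
        for _, vec in group:
            if total is None:
                total = list(vec)
            else:
                total = [a + b for a, b in zip(total, vec)]
        yield row, total


# Hypothetical partial outputs from two map-only passes.
part_a = [(0, [1.0, 0.0]), (2, [0.5, 0.5])]
part_b = [(0, [0.0, 2.0]), (1, [3.0, 1.0])]
result = list(merge_partial_rows(part_a, part_b))
```

The merge keeps one pending record per input stream, which is the kind of step that today has to be squeezed into a reducer even though nothing needs re-sorting.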

So I think brighter days are ahead (for Mahout in particular).
