Re: Mahout 1.0 goals

Ted Dunning Thu, 27 Feb 2014 17:16:33 -0800

Well, Mahout has had (kinda sorta awful) classifiers and clustering from
day one.  It isn't like the only goal is recommendations.


The non-MR, non-Hadoop comments are really more user-centric requirements
than implementations.  It is important that users be able to start without
a cluster and move relatively transparently into a fully scaled solution.

Moreover, the Hadoop-tied map-reduce implementations that we have had up to
now have been disastrously complex.  We really need something better.




On Thu, Feb 27, 2014 at 5:11 PM, Sean Owen <[email protected]> wrote:

> This sounds good, but it sounds like a whole different project or projects.
> For example, where does R come from, which non-MR implementations are
> meant, and what is the no-Hadoop implementation?
> On Feb 28, 2014 12:38 AM, "Ted Dunning" <[email protected]> wrote:
>
> > I would like to start a conversation about where we want Mahout to be for
> > 1.0.  Let's suspend for the moment the question of how to achieve the
> > goals.  Instead, let's converge on what we really would like to have
> > happen and, after that, let's talk about means that will get us there.
> >
> > Here are some goals that I think would be good in the area of numerics,
> > classifiers and clustering:
> >
> > - runs with or without Hadoop
> >
> > - runs with or without map-reduce
> >
> > - includes (at least) regularized generalized linear models, k-means,
> > random forest, distributed random forest, distributed neural networks
> >
> > - reasonably competitive speed against other implementations including
> > GraphLab, MLlib and R.
> >
> > - interactive model building
> >
> > - models can be exported as code or data
> >
> > - simple programming model
> >
> > - programmable via Java or R
> >
> > - runs clustered or not
> >
> >
> > What does everybody think?
> >
>
