Well, Mahout has had (kinda sorta awful) classifiers and clustering from day one. It isn't like the only goal is recommendations.
The non-MR, non-Hadoop comments are really more user-centric requirements than implementations. It is important that users be able to start without a cluster and move relatively transparently into a fully scaled solution. Moreover, the Hadoop-tied map-reduce implementations that we have had up to now have been disastrously complex. We really need something better.

On Thu, Feb 27, 2014 at 5:11 PM, Sean Owen <sro...@gmail.com> wrote:

> This sounds good, but sounds like a whole different project or projects.
> For example, where does R appear from, what non-MR implementations, etc.,
> what is the no-Hadoop implementation?
> On Feb 28, 2014 12:38 AM, "Ted Dunning" <ted.dunn...@gmail.com> wrote:
>
> > I would like to start a conversation about where we want Mahout to be for
> > 1.0. Let's suspend for the moment the question of how to achieve the
> > goals. Instead, let's converge on what we really would like to have happen
> > and after that, let's talk about means that will get us there.
> >
> > Here are some goals that I think would be good in the area of numerics,
> > classifiers and clustering:
> >
> > - runs with or without Hadoop
> >
> > - runs with or without map-reduce
> >
> > - includes (at least) regularized generalized linear models, k-means,
> > random forest, distributed random forest, distributed neural networks
> >
> > - reasonably competitive speed against other implementations including
> > GraphLab, MLlib and R
> >
> > - interactive model building
> >
> > - models can be exported as code or data
> >
> > - simple programming model
> >
> > - programmable via Java or R
> >
> > - runs clustered or not
> >
> >
> > What does everybody think?
> >