Practically speaking, to guide short-term goals, we do need to start
with a narrower, coherent remit and expand later. Starting as a
Java-based, Hadoop-based library for developers, focusing on
collaborative filtering, clustering, categorization, and a few other
things sounds just right.

It would be bad to think, well, we're about anything
machine-learning-related at all, and take a couple of steps in 10
different directions, rather than start by thoroughly exploring a
couple. But nobody is saying that, it seems. Let's start by being a
great library as described above.

To that end I do want to push on...
1) Unifying our Hadoop integration -- well, once Hadoop sorts itself
out again. 0.20.0 doesn't really 'work', it seems.
2) Unifying the code base -- see the message about the common and utils
package, for instance.

If we do stuff like this, we really are going to arrive at a useful,
polished, coherent product soon.

Sean

On Sat, Sep 5, 2009 at 4:30 PM, Grant Ingersoll<gsing...@apache.org> wrote:
> I don't think we necessarily need to be distributed or Hadoop based, but
> those are what we led with so far and it's a good start.  The nice thing is
> the stuff works just fine in standalone mode, too.   First and foremost, we
> are a machine learning project with a commercial friendly license and a
> solid community aiming to build fast, production ready libraries.  Java,
> Hadoop and distributed are important, but secondary in my mind.  There will
> certainly be some algorithms that we can't implement in Hadoop.  See
