To wrap this up for now:

I hear some consensus that Mahout is about distributed, Hadoop-based
solutions for developers. So let's make sure we present a clean,
coherent API to developers wanting to run the project's Hadoop jobs.

I think we're somewhat stuck for now, as Hadoop 0.20.0 is a little bit
busted. But as it moves forward, perhaps I can volunteer to suggest
changes to unify the various jobs, mappers, reducers, etc. across the
project.

Sean

On Fri, Sep 4, 2009 at 11:21 PM, Grant Ingersoll <gsing...@apache.org> wrote:
>
> On Sep 4, 2009, at 1:07 PM, Ted Dunning wrote:
>
>> These are good questions to ask.  I don't know that we are ready to answer
>> them, but I do think that we have pieces of the answers.
>>
>> So far, there are three or four general themes that seem to be of real
>> interest/value
>>
>> a) taste/collaborative filtering/cooccurrence analysis
>>
>> b) facilitation of conventional machine learning by large scale
>> aggregation using Hadoop (so far, this is largely cooccurrence counting)
>>
>> c) standard and basic machine learning tasks like clustering, simple
>> classifiers running on large scale data
>>
>> d) stuff
>
> I'd add a few non-technical things I find useful:
>
> e)  Non-viral License
>
> f) Community supporting it (i.e. not abandoned) and a place to get answers
> about practical problems.
>
> I've been frustrated more than once by the lack of (e) and (f) on some other
> projects.  Not that I'm saying we solve (f) yet completely (could use a bit
> more diversity in people answering, but that is starting to take hold, too),
> but I do firmly believe Apache is one of the best places to build a
> community.
>
> -Grant
>
