Hi

  I just subscribed to this mailing list and plan to use the collaborative
filtering part of Mahout. I feel that Mahout would be better off focusing on a
few algorithms first and doing them very well in a scalable way. Simple
algorithms such as naive Bayes and {item|user}-based collaborative filtering
could be the initial focus. Complex algorithms such as LDA can be delayed.
By applying simple algorithms to a large data set, we can achieve very good
quality. This is also where scalability, one of Mahout's important
characteristics, really matters.

 Best,

Albert.

On Fri, Sep 4, 2009 at 3:21 PM, Grant Ingersoll <gsing...@apache.org> wrote:

>
> On Sep 4, 2009, at 1:07 PM, Ted Dunning wrote:
>
>> These are good questions to ask.  I don't know that we are ready to answer
>> them, but I do think that we have pieces of the answers.
>>
>> So far, there are three or four general themes that seem to be of real
>> interest/value
>>
>> a) taste/collaborative filtering/cooccurrence analysis
>>
>> b) facilitation of conventional machine learning by large scale
>> aggregation using hadoop (so far, this is largely cooccurrence counting)
>>
>> c) standard and basic machine learning tasks like clustering, simple
>> classifiers running on large scale data
>>
>> d) stuff
>>
>
> I'd add a few non-technical things I find useful:
>
> e)  Non-viral License
>
> f) Community supporting it (i.e. not abandoned) and a place to get answers
> about practical problems.
>
> I've been frustrated more than once by the lack of (e) and (f) on some
> other projects.  Not that I'm saying we solve (f) yet completely (could use
> a bit more diversity in people answering, but that is starting to take hold,
> too), but I do firmly believe Apache is one of the best places to build a
> community.
>
> -Grant
>
