Re: Statistical clustering MapReduce example?

Stefan Groschupf Mon, 30 Oct 2006 03:10:31 -0800

Hi Albert, Hi All,

Yes, it works pretty well. Back then we used weta as the map reduce implementation, a system we wrote ourselves because Hadoop was part of Nutch and difficult to use standalone. With Hadoop the performance should be even better, and the implementation should be much easier.

We do a lot of text mining with map reduce and it works and scales great. It would be nice to have a big table implementation for some tasks that need to share a fast "memory", but I'm too busy with other stuff these days to help work on that as well. We worked around this problem by having a set of boxes with a lot of RAM that each hold a simple HashMap in memory. We use a Hadoop partitioner to decide which key/value tuple we push into which hashmap server.
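
The workaround above can be pictured with a minimal sketch. It is not the actual implementation: the names `partition`, `put`, and `get` are hypothetical, plain Python dicts stand in for the in-memory hashmap servers, and a CRC32 hash modulo the server count is assumed as the partitioning rule (the mail does not say which rule was used).

```python
import zlib

# Assumption: three "servers", each modeled as an in-process dict.
# In the setup described above these would be separate big-RAM boxes.
servers = [dict() for _ in range(3)]

def partition(key: str, num_servers: int) -> int:
    """Map a key to a server index with a stable hash.

    zlib.crc32 is used instead of the built-in hash() because the
    latter is randomized per process for strings.
    """
    return zlib.crc32(key.encode("utf-8")) % num_servers

def put(key: str, value) -> None:
    """Push a key/value tuple to the server chosen by the partitioner."""
    servers[partition(key, len(servers))][key] = value

def get(key: str):
    """Look the key up on the same server the partitioner picked for it."""
    return servers[partition(key, len(servers))].get(key)

put("hadoop", 42)
put("nutch", 7)
```

Because writers and readers apply the same partition function, a lookup always goes to the one server that holds the key.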


Stefan






On 27.10.2006 at 21:31, Albert Chern wrote:

Thank you for the paper Stefan.  I was surprised at how well the
K-Means algorithm fits into MapReduce.  I actually never thought of
writing the output back to the FS to test for convergence.
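
A rough sketch of how K-Means can be phrased as repeated map and reduce passes with a convergence test between passes. This is an illustration under assumptions, not the method from the paper: the function names are hypothetical, everything runs in one process, and comparing the previous centroids with the new ones in memory stands in for writing each pass's output back to the FS and reading it again.

```python
import math
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def map_phase(points, centroids):
    """Map: emit (centroid index, point) for every input point."""
    for p in points:
        yield nearest(p, centroids), p

def reduce_phase(pairs, centroids):
    """Reduce: average the points assigned to each centroid.

    A centroid that received no points keeps its old position.
    """
    groups = defaultdict(list)
    for cid, p in pairs:
        groups[cid].append(p)
    new = list(centroids)
    for cid, members in groups.items():
        new[cid] = tuple(sum(c) / len(members) for c in zip(*members))
    return new

def kmeans(points, centroids, tol=1e-9, max_iter=100):
    """Run map/reduce passes until no centroid moves more than tol."""
    for _ in range(max_iter):
        new = reduce_phase(map_phase(points, centroids), centroids)
        # Stand-in for the convergence test on the output of the pass.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new))
        centroids = new
        if shift <= tol:
            break
    return centroids

points = [(0, 0), (0, 2), (10, 0), (10, 2)]
result = kmeans(points, [(1, 1), (9, 1)])
```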

On 10/27/06, David Pollak <[EMAIL PROTECTED]> wrote:

Stefan,

This is most excellent stuff!  Thanks for sending me the paper!

David

On Oct 26, 2006, at 4:13 PM, Stefan Groschupf wrote:

> Hi David,
>
> a student once wrote a paper about map reduce and clustering during
> an internship at my company.
> I will send it to you off list since the list does not support
> attachments.
> However if someone wants to have a copy as well, let me know.
>
> Cheers,
> Stefan
>
> On 23.10.2006 at 22:41, David Pollak wrote:
>
>> Howdy,
>>
>> I'm looking to cluster documents by word frequency (and maybe
>> position).  Does anyone know of MapReduce examples that
>> demonstrate statistical clustering?
>>
>> Thanks,
>>
>> David
>>
>>
>>
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 101tec Inc.
> search tech for web 2.1
> Menlo Park, California
> http://www.101tec.com
>
>
>


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101tec Inc.
search tech for web 2.1
Menlo Park, California
http://www.101tec.com
