Hi Albert, Hi All,

Yes, it works pretty well. Back then we used weta as our MapReduce implementation, a system we wrote ourselves because Hadoop was still part of Nutch and difficult to use standalone. With Hadoop the performance should be even better, and the implementation should be much easier.

We do a lot of text mining with MapReduce and it works and scales great. It would be nice to have a Bigtable implementation for some tasks that need to share a fast "memory", but I'm too busy with other stuff these days to help work on that as well. We worked around this problem with a set of boxes that have a lot of RAM and each hold a simple HashMap in memory. We use a Hadoop partitioner to decide which key-value tuple we push to which HashMap server.
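
To make the routing idea concrete, here is a minimal sketch in plain Java. It is not our actual code: the class name HashMapServerRouter and its methods are made up for illustration, and the "servers" are just in-process HashMaps standing in for the remote big-RAM boxes. The partition formula (non-negative hash code modulo the number of servers) is, as far as I know, the same one Hadoop's default hash partitioner uses, but treat that as an assumption too.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: routes each key to one of N "HashMap servers".
public class HashMapServerRouter {

    // Each in-process map stands in for one remote box holding a HashMap in RAM.
    private final List<Map<String, String>> servers = new ArrayList<>();

    public HashMapServerRouter(int numServers) {
        if (numServers <= 0) {
            throw new IllegalArgumentException("numServers must be positive");
        }
        for (int i = 0; i < numServers; i++) {
            servers.add(new HashMap<>());
        }
    }

    // Partition function: clear the sign bit so the result is never negative,
    // then take the remainder to pick a server index in [0, numServers).
    public static int serverFor(String key, int numServers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numServers;
    }

    // Push a key-value tuple to the server chosen by the partition function.
    public void put(String key, String value) {
        servers.get(serverFor(key, servers.size())).put(key, value);
    }

    // Look the key up on the same server the partition function chose for it.
    public String get(String key) {
        return servers.get(serverFor(key, servers.size())).get(key);
    }

    // Number of entries currently held by one server.
    public int sizeOf(int serverIndex) {
        return servers.get(serverIndex).size();
    }
}
```

Because reads and writes use the same partition function, any task can find a key without asking a central directory. The obvious trade-off is that changing the number of servers remaps most keys, so the server count has to stay fixed for the lifetime of a job.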

Stefan






On 27.10.2006 at 21:31, Albert Chern wrote:

Thank you for the paper Stefan.  I was surprised at how well the
K-Means algorithm fits into MapReduce.  I actually never thought of
writing the output back to the FS to test for convergence.

On 10/27/06, David Pollak <[EMAIL PROTECTED]> wrote:
Stefan,

This is most excellent stuff!  Thanks for sending me the paper!

David

On Oct 26, 2006, at 4:13 PM, Stefan Groschupf wrote:

> Hi David,
>
> a student once wrote a paper about MapReduce and clustering during an
> internship at my company.
> I will send it to you off list since the list does not support
> attachments.
> However if someone wants to have a copy as well, let me know.
>
> Cheers,
> Stefan
>
> On 23.10.2006 at 22:41, David Pollak wrote:
>
>> Howdy,
>>
>> I'm looking to cluster documents by word frequency (and maybe
>> position).  Does anyone know of MapReduce examples that
>> demonstrate statistical clustering?
>>
>> Thanks,
>>
>> David
>>
>>
>>
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 101tec Inc.
> search tech for web 2.1
> Menlo Park, California
> http://www.101tec.com
>
>
>





~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
101tec Inc.
search tech for web 2.1
Menlo Park, California
http://www.101tec.com


