Hi Albert, Hi All,
Yes, it works pretty well. Back then we used weta as our map reduce
implementation, a system we wrote ourselves because hadoop was still
part of nutch and difficult to use standalone.
With hadoop the performance should be even better, and the
implementation should be much easier.
We do a lot of text mining with map reduce and it works and scales great.
It would be nice to have a big table implementation for some tasks
that need to share a fast "memory", but I'm too busy with other stuff
these days to help work on that as well.
We worked around this problem with a set of boxes that have a lot of
RAM and each hold a simple HashMap in memory.
We use a hadoop partitioner to decide which key-value tuple we push
to which hashmap server.
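
The routing idea can be sketched roughly as follows. This is only an illustration, not the code we run: the names server_for_key and ShardedMap are made up for the example, plain Python dicts stand in for the remote hashmap servers, and CRC32 is an assumed choice of hash. The point is just that a deterministic hash of the key, taken modulo the number of servers, sends every reader and writer of a key to the same box.

```python
import zlib


def server_for_key(key: str, num_servers: int) -> int:
    """Map a key to a server index using a hash that is stable across processes."""
    if num_servers <= 0:
        raise ValueError("num_servers must be positive")
    # CRC32 is an assumption for this sketch; any deterministic hash would do.
    return zlib.crc32(key.encode("utf-8")) % num_servers


class ShardedMap:
    """Hypothetical stand-in for a set of in-memory hashmap servers."""

    def __init__(self, num_servers: int):
        # One dict per "server"; a real setup would talk to remote machines.
        self.shards = [dict() for _ in range(num_servers)]

    def put(self, key: str, value) -> None:
        self.shards[server_for_key(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[server_for_key(key, len(self.shards))].get(key, default)
```

A caller would do something like store = ShardedMap(4), then store.put("hadoop", 3) and later store.get("hadoop"). One known limitation of plain modulo routing is that changing the number of servers remaps most keys.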
Stefan
On 27.10.2006 at 21:31, Albert Chern wrote:
Thank you for the paper Stefan. I was surprised at how well the
K-Means algorithm fits into MapReduce. I actually never thought of
writing the output back to the FS to test for convergence.
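
A rough sketch of that pattern, for anyone reading along without the paper. It is not taken from the paper: the function names are invented, everything runs in one process, and a local JSON file stands in for the distributed file system. Each iteration is one map pass (assign every point to its nearest centroid) and one reduce pass (average the points per centroid); the driver writes the new centroids out and compares them with the previous ones it read back to decide whether to stop.

```python
import json
import math
import os
import tempfile
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def map_phase(points, centroids):
    """Map: emit (centroid index, point) for every input point."""
    for p in points:
        yield nearest(p, centroids), p


def reduce_phase(pairs, centroids):
    """Reduce: average the points grouped under each centroid index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centroids = []
    for i, old in enumerate(centroids):
        members = groups.get(i)
        if not members:
            new_centroids.append(list(old))  # empty cluster keeps its centroid
        else:
            new_centroids.append([sum(c) / len(members) for c in zip(*members)])
    return new_centroids


def kmeans(points, centroids, workdir=None, tol=1e-6, max_iter=50):
    """Driver: one map/reduce pass per iteration, centroids persisted to a file."""
    workdir = workdir or tempfile.mkdtemp()
    path = os.path.join(workdir, "centroids.json")
    with open(path, "w") as f:
        json.dump(centroids, f)
    current = centroids
    for _ in range(max_iter):
        with open(path) as f:
            previous = json.load(f)  # read the last iteration's output back
        current = reduce_phase(map_phase(points, previous), previous)
        with open(path, "w") as f:
            json.dump(current, f)
        # Converged when no centroid moved by more than tol.
        if max(math.dist(a, b) for a, b in zip(previous, current)) < tol:
            break
    return current
```

For example, kmeans([[0, 0], [0, 2], [10, 10], [10, 12]], [[0, 0], [10, 10]]) settles on the centroids [0, 1] and [10, 11] after the second pass, because the second pass leaves both centroids unchanged.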
On 10/27/06, David Pollak <[EMAIL PROTECTED]> wrote:
Stefan,
This is most excellent stuff! Thanks for sending me the paper!
David
On Oct 26, 2006, at 4:13 PM, Stefan Groschupf wrote:
> Hi David,
>
> a student once wrote a paper about map reduce and clustering at my
> company during an internship.
> I will send it to you off list since the list does not support
> attachments.
> However if someone wants to have a copy as well, let me know.
>
> Cheers,
> Stefan
>
> On 23.10.2006 at 22:41, David Pollak wrote:
>
>> Howdy,
>>
>> I'm looking to cluster documents by word frequency (and maybe
>> position). Does anyone know of MapReduce examples that
>> demonstrate statistical clustering?
>>
>> Thanks,
>>
>> David
>>
>>
>>
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 101tec Inc.
> search tech for web 2.1
> Menlo Park, California
> http://www.101tec.com
>
>
>