RE: Reduce Sort

2008-04-09 Thread Natarajan, Senthil

ailto:[EMAIL PROTECTED] Sent: Tuesday, April 08, 2008 11:53 AM To: core-user@hadoop.apache.org; '[EMAIL PROTECTED]' Subject: Re: Reduce Sort There are two ways to do this. Both of them assume that you have counted the addresses using map-reduce and the results are in HDFS. First, sinc

Re: Reduce Sort

2008-04-08 Thread Ted Dunning

On 4/8/08 10:43 AM, "Natarajan, Senthil" <[EMAIL PROTECTED]> wrote:
> I would like to try using Hadoop.
That is good for education, probably bad for run time. It could take SECONDS longer to run (oh my).
> Do you mean to write another MapReduce program which takes the output of the
> first M

RE: Reduce Sort

2008-04-08 Thread Natarajan, Senthil

time to reduce all the maps and hence affect the performance right? Thanks, Senthil -Original Message- From: Ted Dunning [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 08, 2008 11:53 AM To: core-user@hadoop.apache.org; '[EMAIL PROTECTED]' Subject: Re: Reduce Sort There are t

Re: Reduce Sort

2008-04-08 Thread Ted Dunning

There are two ways to do this. Both of them assume that you have counted the addresses using map-reduce and the results are in HDFS. First, since the number of unique IP addresses is likely to be relatively small, simply sorting the results using a conventional sort is probably as good as it gets.
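As a rough illustration of the first approach, here is a minimal sketch of a conventional sort over the counted results. It is not code from this thread. It assumes the counting job's output has already been pulled out of HDFS as text lines of the form `address<TAB>count`, and the helper names `parse_counts` and `sort_by_count` are made up for this example.

```python
import sys


def parse_counts(lines):
    """Parse 'address<TAB>count' lines into (address, count) pairs.

    The tab-separated layout is an assumption about the counting job's output.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        address, count = line.split("\t")
        pairs.append((address, int(count)))
    return pairs


def sort_by_count(pairs):
    """Return pairs ordered by descending count, ties broken by address."""
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Read the counted results from standard input and print them sorted.
    for address, count in sort_by_count(parse_counts(sys.stdin)):
        print(f"{address}\t{count}")
```

Everything is held in memory, which is only reasonable because the number of unique addresses is expected to be small.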