Sean,

Yes, that's the one for the shuffle that happens on the reduce side (pull
model). You can drill down from that class to see how the fetchers
operate, etc.
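
For orientation before reading the real classes, here is a rough sketch of the
idea, not Hadoop's actual code. The class and method names (ShuffleSketch,
partitionFor, mapSide, reduceSide) are made up for illustration, and in-memory
TreeMaps stand in for the sorted spill files and the merge that the real
implementation does. The partition formula is, as far as I know, the same one
the default HashPartitioner uses. It also shows how hashing and sorting
coexist: the hash only decides which reduce gets a key, and the sort is what
brings equal keys together within that reduce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShuffleSketch {

    // Decide which reduce task receives a key (assumed to mirror the
    // default HashPartitioner formula).
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Map side: split one map task's output into one partition per reduce,
    // each kept sorted by key (TreeMap stands in for the sorted spill).
    static List<TreeMap<String, List<Integer>>> mapSide(
            String[] keys, int[] values, int numReducers) {
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int r = 0; r < numReducers; r++) {
            partitions.add(new TreeMap<>());
        }
        for (int i = 0; i < keys.length; i++) {
            partitions.get(partitionFor(keys[i], numReducers))
                      .computeIfAbsent(keys[i], k -> new ArrayList<>())
                      .add(values[i]);
        }
        return partitions;
    }

    // Reduce side: pull this reduce's partition from every map output and
    // merge them so that all values for a key end up together, in key order.
    static TreeMap<String, List<Integer>> reduceSide(
            List<List<TreeMap<String, List<Integer>>>> mapOutputs, int reducer) {
        TreeMap<String, List<Integer>> merged = new TreeMap<>();
        for (List<TreeMap<String, List<Integer>>> oneMap : mapOutputs) {
            for (Map.Entry<String, List<Integer>> e : oneMap.get(reducer).entrySet()) {
                merged.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                      .addAll(e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        int numReducers = 2;
        List<List<TreeMap<String, List<Integer>>>> mapOutputs = new ArrayList<>();
        mapOutputs.add(mapSide(new String[] {"b", "a"}, new int[] {1, 2}, numReducers));
        mapOutputs.add(mapSide(new String[] {"a"}, new int[] {3}, numReducers));
        for (int r = 0; r < numReducers; r++) {
            System.out.println("reduce " + r + " input: " + reduceSide(mapOutputs, r));
        }
    }
}
```

In the real code the "pull" is done by fetcher threads copying map output
segments over HTTP, which is what you will find when you drill down from
Shuffle.java.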

On Wed, Jun 6, 2012 at 9:54 PM, Barry, Sean F <sean.f.ba...@intel.com> wrote:
> Thanks Harsh!
> And is this the right source code for the shuffling that is done in the 
> reduce task?
>
> http://search-hadoop.com/c/Hadoop:/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/Shuffle.java%7C%7Cshuffle+sort
>
> -sb
>
> -----Original Message-----
> From: Harsh J [mailto:ha...@cloudera.com]
> Sent: Tuesday, June 05, 2012 7:43 PM
> To: common-user@hadoop.apache.org
> Subject: Re: Shuffle/sort
>
> Hey Sean,
>
> Check out http://www.slideshare.net/jhammerb/hadoop-map-reduce-arch-106883,
> a slightly dated, MR1-oriented presentation from Owen O'Malley that goes into 
> enough depth to give an overview of how things work (including how reduces 
> pull data).
>
> After that, check out Chris Douglas'
> http://www.slideshare.net/hadoopusergroup/ordered-record-collection
> that goes in-depth into the evolution of the implementations of that layer. 
> This is pretty much the state of 0.20/1.0 today too, and in 2.0 we have had 
> Netty replacing Jetty among other improvements but I haven't a public 
> document link to share on this yet. Others may share the changes docs on 2.0 
> if they have a link to one (or I'll respond back as soon as I have one).
>
> I hope this helps!
>
> On Wed, Jun 6, 2012 at 4:16 AM, Barry, Sean F <sean.f.ba...@intel.com> wrote:
>> "I was always wondering after mapping, how each reduce task get its
>> input. It is said in google's paper and hadoop's documentation that a
>> sort is done to aggregate the same key of the map output. But there is
>> no detailed explanation of how it is implemented and my intuition is
>> that perhaps a global hashing will work better than sorting. So I
>> really want to know the details and see whether my intuition is right. If I 
>> can find out that in the source code, where should I start with?"
>>
>> I saw this question online and no one replied to it. Does anyone know where 
>> I can go to study the source code for the shuffle and sort?
>>
>> -sean
>
>
>
> --
> Harsh J



-- 
Harsh J
