What I explained is the shuffle phase.
After the reducer pulls the data, it sorts it on the key part only and then calls the corresponding reduce method.

On 12/22/14, bit1...@163.com <bit1...@163.com> wrote:
> Then what exactly happens after the reducer pulls all the mapper output
> key/value pairs from all the mapper nodes, before the reducer sees the
> <key,value1,value2..>?
>
> bit1...@163.com
>
> From: Susheel Kumar Gadalay
> Date: 2014-12-22 13:20
> To: user
> Subject: Re: Question about shuffle/merge/sort phrase
> Sorry, typo
>
> It is the reducer which will pull the mapper o/p as soon as it completes.
>
> On 12/22/14, Susheel Kumar Gadalay <skgada...@gmail.com> wrote:
>> It is the mapper which will push the o/p to the respective reducer as
>> soon as it completes.
>>
>> The number of reducers is known at the beginning itself.
>> The mapper, as it processes the input split, generates the o/p for each
>> reducer (if the mapper o/p key is eligible for the reducer).
>> The reducer will wait till the completion of all map tasks to start its
>> processing.
>>
>> On 12/22/14, bit1...@163.com <bit1...@163.com> wrote:
>>> Could someone help me with this question? Thanks.
>>>
>>> bit1...@163.com
>>>
>>> From: Todd
>>> Sent: 2014-12-21 21:59
>>> To: user@hadoop.apache.org
>>> Subject: Question about shuffle/merge/sort phrase
>>> Hi, Hadoopers,
>>> I have a question about the shuffle/sort/merge phases.
>>> My understanding is that shuffle is used to transfer the mapper
>>> output (key/value pairs) from the mapper node to the reducer node, the
>>> merge phase is used to merge all the mapper output from all mapper
>>> nodes, and the sort phase is used to sort the key/value pairs by key.
>>> My question, then: whose responsibility is it to bring each key together
>>> with all its values (the reducer's input is a key and an iterable of
>>> values)?
>>>
>>> Thanks.
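
To make the answer to the original question concrete (who brings each key together with all of its values), here is a minimal Python sketch of the reduce side as described in the reply at the top of this message: sort the pulled pairs on the key only, then call reduce once per key with that key's values. It is only an illustration, not Hadoop code. The names `reduce_side` and `count_reduce` are hypothetical, and a plain in-memory sort stands in for whatever the framework actually does internally.

```python
from itertools import groupby
from operator import itemgetter


def reduce_side(pulled_pairs, reduce_fn):
    """Toy model of the reduce side: sort on the key only, group, reduce.

    pulled_pairs: (key, value) tuples pulled from all mappers, in any order.
    reduce_fn:    hypothetical user function taking (key, iterable_of_values).
    """
    # Sort on the key part only; the values play no role in the ordering.
    by_key = sorted(pulled_pairs, key=itemgetter(0))

    results = []
    # Adjacent pairs with the same key form one group, so each key is
    # handed to reduce_fn exactly once, together with all of its values.
    for key, group in groupby(by_key, key=itemgetter(0)):
        values = (value for _, value in group)
        results.append(reduce_fn(key, values))
    return results


def count_reduce(key, values):
    # Word-count style reduce: sum the values seen for this key.
    return (key, sum(values))


# Pairs as they might arrive from two different mappers (assumed data).
pulled = [("cat", 1), ("apple", 1), ("bird", 1), ("apple", 1), ("cat", 1)]
print(reduce_side(pulled, count_reduce))
# prints [('apple', 2), ('bird', 1), ('cat', 2)]
```

In this sketch the user's reduce function never does any grouping itself; the grouping falls out of the sort, which is why the framework side, not the user code, is responsible for presenting <key, value1, value2, ...>.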