I read the docs and found that with 8 reducers, each map task outputs 8 partitions, and each partition is sent to a different reducer. So if I increase the number of reducers, the number of partitions increases, but the volume of network traffic stays the same. Why, then, does increasing the number of reducers sometimes not decrease the job completion time?
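To make the reasoning above concrete, here is a rough sketch of how the shuffle changes shape when the reducer count doubles. It assumes the map output is spread evenly across partitions and is about as large as the input (roughly 1000 MB for the terasort run quoted below); `shuffle_profile` is a hypothetical helper for illustration, not a Hadoop API.

```python
def shuffle_profile(total_map_output_mb, num_map_tasks, num_reducers):
    """Return (segments, mb_per_segment, total_mb) for one job.

    Assumption: every map task writes one partition per reducer and the
    data is spread evenly, so all segments have the same size.
    """
    # One map-output segment is fetched per (map task, reducer) pair.
    segments = num_map_tasks * num_reducers
    mb_per_segment = total_map_output_mb / segments
    # The total volume moved over the network does not depend on the reducer count.
    total_mb = mb_per_segment * segments
    return segments, mb_per_segment, total_mb


if __name__ == "__main__":
    for reducers in (8, 16):
        segments, per_segment, total = shuffle_profile(1000.0, 20, reducers)
        print(f"{reducers} reducers: {segments} segments, "
              f"{per_segment} MB each, {total} MB in total")
```

Under these assumptions the total stays constant while the number of fetches doubles and each fetch gets smaller, so adding reducers only changes how the same data is divided up.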

On Wed, Dec 11, 2013 at 1:48 PM, Vinayakumar B <vinayakuma...@huawei.com> wrote:

> It looks simple :)
>
> Shuffled Maps = Number of Map Tasks * Number of Reducers
>
> Thanks and Regards,
> Vinayakumar B
>
> *From:* ch huang [mailto:justlo...@gmail.com]
> *Sent:* 11 December 2013 10:56
> *To:* user@hadoop.apache.org
> *Subject:* issue about Shuffled Maps in MR job summary
>
> hi, maillist:
>
> I ran terasort with 16 reducers and with 8 reducers. When I double the
> number of reducers, the Shuffled Maps counter also doubles. The job only
> runs 20 map tasks (the input is 10 files of 100 MB each and my block size
> is 64 MB, so there are 20 splits). Why do I need to shuffle 160 maps in
> the 8-reducer run and 320 maps in the 16-reducer run? How is the Shuffled
> Maps number calculated?
>
> 16 reducer summary output:
>
> Shuffled Maps =320
>
> 8 reducer summary output:
>
> Shuffled Maps =160
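
The formula quoted above can be checked against the numbers in the original report with a short sketch. The split estimate is a simplification (one split per started block of each file, ignoring any split-size tuning), and both function names are made up for illustration rather than taken from Hadoop.

```python
import math


def estimate_map_tasks(num_files, file_size_mb, block_size_mb):
    # Simplifying assumption: each file is split independently and every
    # started block becomes one input split, hence one map task.
    return num_files * math.ceil(file_size_mb / block_size_mb)


def shuffled_maps(num_map_tasks, num_reducers):
    # Formula from the quoted reply: every reducer fetches output from every map.
    return num_map_tasks * num_reducers


if __name__ == "__main__":
    maps = estimate_map_tasks(num_files=10, file_size_mb=100, block_size_mb=64)
    # Counter values reported in the job summaries quoted above.
    reported = {8: 160, 16: 320}
    for reducers, counter in reported.items():
        predicted = shuffled_maps(maps, reducers)
        print(f"maps={maps} reducers={reducers} "
              f"predicted={predicted} reported={counter}")
```

With 10 files of 100 MB and a 64 MB block size, the estimate gives 20 map tasks, which matches the split count stated in the original question.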