The reducer's primary work begins by pulling in data files from all
the other tasktrackers. Because of this, assigning multiple reduce
tasks in one go would tax the node (in terms of the number of network
connections), since they would all begin connecting and pulling
independently at about the same time.
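
To make the policy discussed in this thread concrete, here is a rough sketch of a per-heartbeat assignment: fill the free map slots, but hand out at most one reduce task. It is an illustration only, not the actual JobTracker scheduler code; the class name, method name, and parameters are all made up for the example.

```java
// Hypothetical sketch; names and signature are assumptions, not Hadoop's API.
final class HeartbeatSketch {

    /**
     * Decide how many tasks to hand to one tasktracker in one heartbeat.
     * Returns a two-element array: {mapsAssigned, reducesAssigned}.
     */
    static int[] assign(int freeMapSlots, int freeReduceSlots,
                        int pendingMaps, int pendingReduces) {
        // Maps: fill as many free slots as there are pending map tasks.
        int maps = Math.min(freeMapSlots, pendingMaps);

        // Reduces: at most one per heartbeat, so that newly started
        // reducers do not all open their fetch connections at once.
        int reduces = (freeReduceSlots > 0 && pendingReduces > 0) ? 1 : 0;

        return new int[] {maps, reduces};
    }

    public static void main(String[] args) {
        int[] assigned = assign(4, 2, 10, 5);
        System.out.println("maps=" + assigned[0] + " reduces=" + assigned[1]);
    }
}
```

With 4 free map slots and 2 free reduce slots, the tracker in this sketch gets 4 maps but only 1 reduce; the second reduce slot would be filled on a later heartbeat, which staggers the start of the copy phase.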
Hi,
I see in the code that while we assign a number of map tasks, we assign only
one reduce task per tasktracker during the heartbeat.
Is there a brief somewhere on why this design decision was made?
Thanks
Sudhan S