Jiwei,

I think you could use that knowledge to launch reducers closer to the map
output, but I am not sure that it would make much difference.  It may even
slow things down.  It comes down to a few questions:

1) Can we get enough map tasks close to one another that it will make a
difference?
2) Does the reduced shuffle time offset the cost of waiting for the map
location data before launching reducers, rather than launching them early
and fetching map output as it becomes available?
3) Do the time savings also offset the overhead of getting the map tasks
scheduled close to one another in the first place?

For #2 you might be able to deal with this by using speculative execution
and launching some reduce tasks later if you see a clustering of map
output.  #1 will require changes to how we schedule tasks, which, depending
on how well it is implemented, will impact #3 as well.  Additionally, for
#1, any job whose size approaches the same order of magnitude as the
cluster will almost require the map tasks to be evenly distributed around
the cluster.  If you can come up with a patch I would love to see some
performance numbers.

Personally, I think spending time reducing the size of the data sent to the
reducers is a much bigger win.  Can you use a combiner?  Do you really need
all of the data, or can you sample it to get a statistically significant
picture of what is in it?  Have you enabled compression between the maps
and the reducers?
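
For the last two points, here is a minimal sketch of what a combiner plus
map-output compression looks like with the MR2-style
org.apache.hadoop.mapreduce API.  The class name SmallShuffleJob is just
for illustration; on Hadoop 1.x the properties are
mapred.compress.map.output and mapred.map.output.compression.codec, and you
would construct the job with new Job(conf, ...) instead of Job.getInstance.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.io.compress.CompressionCodec;
  import org.apache.hadoop.io.compress.SnappyCodec;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
  import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
  import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

  public class SmallShuffleJob {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();

      // Compress intermediate map output before it crosses the network.
      // Any installed codec works; Snappy is a common choice.
      conf.setBoolean("mapreduce.map.output.compress", true);
      conf.setClass("mapreduce.map.output.compress.codec",
          SnappyCodec.class, CompressionCodec.class);

      Job job = Job.getInstance(conf, "wordcount-small-shuffle");
      job.setJarByClass(SmallShuffleJob.class);
      job.setMapperClass(TokenCounterMapper.class);

      // The combiner pre-aggregates on the map side, so each map task
      // ships one partial count per word instead of one record per
      // occurrence.  Only valid because the reduce function here is
      // commutative and associative.
      job.setCombinerClass(IntSumReducer.class);
      job.setReducerClass(IntSumReducer.class);

      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(IntWritable.class);

      FileInputFormat.addInputPath(job, new Path(args[0]));
      FileOutputFormat.setOutputPath(job, new Path(args[1]));
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }

The same two knobs apply to a Sort-style job: the combiner only helps when
the reduce function can be applied to partial groups, but the compression
settings help any shuffle-heavy job.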

--Bobby

On 11/7/12 8:05 AM, "Harsh J" <ha...@cloudera.com> wrote:

>Hi Jiwei,
>
>In trunk (i.e. MR2), the completion events selection + scheduling
>logic lies under class EventFetcher's getMapCompletionEvents() method,
>as viewable at 
>http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/task/reduce/EventFetcher.java?view=markup
>
>This EventFetcher thread is used by the Shuffle class (in the reduce
>package) to continually do the shuffling. The Shuffle class is in turn
>used by the ReduceTask class (look in the mapred package of the same
>maven module).
>
>I guess you can start there, to see if a better selection+scheduling
>logic would yield better results.
>
>On Wed, Nov 7, 2012 at 12:26 PM, Jiwei Li <cxm...@gmail.com> wrote:
>> Dear all,
>>
>> For jobs like Sort, massive amounts of network traffic happen during the
>> shuffle phase. The simple mechanism in Hadoop 1.0.4 for choosing reduce
>> nodes does not help reduce network traffic. If the JobTracker is fully
>> aware of the location of every map output, why not take advantage of
>> this topology knowledge?
>>
>> So, does anyone know where in the code base to start developing this?
>> Many thanks.
>>
>> Regards.
>> --
>> Jiwei
>
>
>
>-- 
>Harsh J
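
As a footnote on the EventFetcher pointer above: the sketch below is purely
illustrative and is not Hadoop code (LocalityAwareFetchPlanner and
MapOutputLocation are made-up names).  It only shows the kind of
topology-aware ordering a reduce task could apply once it knows which host
and rack each completed map output lives on; a real change would have to be
wired into EventFetcher and the shuffle scheduler in
hadoop-mapreduce-client-core.

  import java.util.ArrayList;
  import java.util.Comparator;
  import java.util.List;

  // Illustrative only, not part of Hadoop.
  public class LocalityAwareFetchPlanner {

    // Minimal stand-in for a completed map output's location.
    public static final class MapOutputLocation {
      final String mapTaskId;
      final String host;
      final String rack;
      MapOutputLocation(String mapTaskId, String host, String rack) {
        this.mapTaskId = mapTaskId;
        this.host = host;
        this.rack = rack;
      }
    }

    private final String localHost;
    private final String localRack;

    public LocalityAwareFetchPlanner(String localHost, String localRack) {
      this.localHost = localHost;
      this.localRack = localRack;
    }

    // Rank a location: 0 = same host, 1 = same rack, 2 = off-rack.
    private int distance(MapOutputLocation loc) {
      if (loc.host.equals(localHost)) return 0;
      if (loc.rack.equals(localRack)) return 1;
      return 2;
    }

    // Order the known map outputs so the cheapest fetches start first and
    // off-rack fetches come last, overlapping with the local copies that
    // are already in flight.
    public List<MapOutputLocation> planFetches(List<MapOutputLocation> known) {
      List<MapOutputLocation> ordered = new ArrayList<>(known);
      ordered.sort(Comparator.comparingInt(this::distance));
      return ordered;
    }
  }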
