[ 
https://issues.apache.org/jira/browse/MAPREDUCE-279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002543#comment-13002543
 ] 

MengWang commented on MAPREDUCE-279:
------------------------------------

@All

How does shuffle work in MapReduce 2.0?

Our study shows that shuffle is a performance bottleneck in MapReduce
computing. Shuffle has several problems:
(1) Shuffle and reduce are tightly coupled. The shuffle phase usually
consumes little memory and CPU, so in theory a reduce task's slot could be
used for other computing tasks while data is being copied from the maps,
which would improve cluster utilization. Furthermore, should shuffle be
separated from reduce? Shuffle would then not occupy a reduce slot, and we
would not need to distinguish between map slots and reduce slots at all.
(2) For large jobs, shuffle uses too many network connections, and each
connection transmits very little data, which is inefficient. Since 0.21.0
one connection can transfer several map outputs, but I think this is not
enough. Maybe we can use a per-node shuffle client process (like the
tasktracker) to shuffle data for all reduce tasks on that node, so that more
data is shuffled through each connection.
(3) Too many concurrent connections cause the shuffle server to do massive
random IO, which is inefficient. Maybe we can aggregate HTTP requests (like
the delay scheduler), so that random IO becomes sequential.
(4) How can the memory used by shuffle be managed efficiently? We use buddy
memory allocation, which wastes a considerable amount of memory.
(5) If shuffle is separated from reduce, we must figure out how to preserve
reduce locality.
(6) Can we store map outputs in a storage system (like HDFS)?
(7) Can shuffle be a general data transfer service, not only for the
map/reduce paradigm?
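
To illustrate the memory waste mentioned in point (4): a buddy allocator
rounds every request up to a power-of-two block, so a request just above a
power of two wastes almost half of its block. The sketch below is only an
illustration, not the allocator we actually use. The 4 KiB minimum block
and the segment sizes are assumed values, and the function names are made
up for this example.

```python
def buddy_block_size(request_bytes, min_block=4096):
    """Return the smallest power-of-two block (>= min_block) that fits the request.

    min_block=4096 is an assumed value chosen for illustration.
    """
    size = min_block
    while size < request_bytes:
        size *= 2
    return size


def wasted_bytes(request_bytes, min_block=4096):
    """Internal fragmentation: bytes allocated but not used by the request."""
    return buddy_block_size(request_bytes, min_block) - request_bytes


# Hypothetical map-output segment sizes in bytes (not measured data).
segments = [5_000, 70_000, 1_100_000]
for seg in segments:
    block = buddy_block_size(seg)
    print(f"request={seg} block={block} wasted={block - seg}")
```

Under these assumptions a 70,000-byte segment occupies a 131,072-byte
block, so roughly 47% of that block is unused.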

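The request aggregation suggested in point (3) could look roughly like the
sketch below: instead of serving each fetch as it arrives, the shuffle
server holds requests for a short window, groups them by map output file,
and serves each group in offset order so the disk is read mostly
sequentially. FetchRequest and batch_requests are hypothetical names for
this example and do not correspond to any existing Hadoop class or method.

```python
from collections import defaultdict
from typing import NamedTuple


class FetchRequest(NamedTuple):
    """One reducer asking for one partition of one map output (hypothetical)."""
    reducer_id: int
    map_output_file: str
    offset: int
    length: int


def batch_requests(pending):
    """Group pending requests by file and order each group by file offset.

    Serving the groups in this order turns scattered reads on a file into
    one forward pass over it.
    """
    by_file = defaultdict(list)
    for req in pending:
        by_file[req.map_output_file].append(req)
    return {
        path: sorted(reqs, key=lambda r: r.offset)
        for path, reqs in by_file.items()
    }


# Requests in arrival order, interleaved across two map output files.
pending = [
    FetchRequest(2, "map_0001.out", 8192, 4096),
    FetchRequest(0, "map_0002.out", 0, 4096),
    FetchRequest(0, "map_0001.out", 0, 4096),
    FetchRequest(1, "map_0001.out", 4096, 4096),
]
for path, reqs in batch_requests(pending).items():
    print(path, [r.offset for r in reqs])
```

How long to hold requests before serving a batch is the same trade-off the
delay scheduler makes: a longer window gives more sequential IO but adds
latency to each fetch.
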
> Map-Reduce 2.0
> --------------
>
>                 Key: MAPREDUCE-279
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-279
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker, tasktracker
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>             Fix For: 0.23.0
>
>
> Re-factor MapReduce into a generic resource scheduler and a per-job, 
> user-defined component that manages the application execution. 

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
