[ https://issues.apache.org/jira/browse/MAPREDUCE-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750951#action_12750951 ]
Iyappan Srinivasan commented on MAPREDUCE-318:
----------------------------------------------

+1 for testing.

Cluster conf: mapred.child.java.opts 512M and io.sort.factor 100. Namenode heap size is 3GB and jobtracker heap size is 1GB (see the configuration sketch at the end of this message).

Some benchmarking and functionality test results:

1) Default sort on a 94 node cluster:
   trunk, two attempts: 1) 2376 seconds 2) 2589 seconds
   with patch, last two attempts: 1) 1408 seconds 2) 1381 seconds
2) loadgen on a 94 node cluster:
   trunk: 57 minutes
   with patch, two attempts: 1) 56 minutes 9 seconds 2) 56 minutes 23 seconds
3) gridmix2 on a 491 node cluster:
   trunk: 1 hour 7 minutes
   with patch, two attempts: 1) 57 minutes 2) 47 minutes
4) Sort (with memory-to-memory enabled): passed
5) After starting the job with mapred.reduce.slowstart.completed.maps=1, remove some intermediate map output and corrupt some map output. Verify that only those tasks are rerun.

> Refactor reduce shuffle code
> ----------------------------
>
>                 Key: MAPREDUCE-318
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-318
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>         Attachments: HADOOP-5233_api.patch, HADOOP-5233_part0.patch, mapred-318-14Aug.patch, mapred-318-20Aug.patch, mapred-318-24Aug.patch, mapred-318-3Sep-v1.patch, mapred-318-3Sep.patch, mapred-318-common.patch
>
>
> The reduce shuffle code has become very complex and entangled. I think we should move it out of ReduceTask and into a separate package (org.apache.hadoop.mapred.task.reduce). Details to follow.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
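For reference, a minimal sketch of how the test configuration described above could be set programmatically with the old-style JobConf API. The property names and values (512M child heap, io.sort.factor 100, slowstart at 1.0 for test 5) come from the comment; the class name and the use of -Xmx for the 512M setting are assumptions for illustration, not the exact original test setup.

{code:java}
import org.apache.hadoop.mapred.JobConf;

public class ShuffleTestConf {

  // Builds a JobConf roughly matching the benchmark/test settings above.
  public static JobConf createConf() {
    JobConf conf = new JobConf();

    // 512M for the child task JVMs; assuming this was passed as a max-heap flag.
    conf.set("mapred.child.java.opts", "-Xmx512m");

    // Merge up to 100 spill segments/streams at once during sort/shuffle merges.
    conf.setInt("io.sort.factor", 100);

    // For test 5: reducers start fetching only after all maps have completed.
    conf.setFloat("mapred.reduce.slowstart.completed.maps", 1.0f);

    return conf;
  }
}
{code}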