[ 
https://issues.apache.org/jira/browse/MAPREDUCE-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12753149#action_12753149
 ] 

Jothi Padmanabhan commented on MAPREDUCE-956:
---------------------------------------------

True, we do have a final merge before feeding the reducer. However, assigning 
33% of the progress to this one final merge does not seem correct. In cases 
where the number of files at that time is < io.sort.factor, this final merge 
does not even occur; we start feeding the reducer straight away. Also, since 
merges happen during the shuffle phase as well, I was just proposing that 
we delineate the phases as:
Shuffle (50%)
Final Merge + Reduce (50%)
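
To make the proposed weighting concrete, here is a minimal sketch in Java. It is 
only an illustration: the class and method names (ReduceProgressSketch, 
threePhase, twoPhase, finalMergeOccurs) are hypothetical and are not existing 
Hadoop APIs, and each phase's completion is assumed to be a fraction in [0, 1].

```java
// Hypothetical sketch of the two weighting schemes; not Hadoop code.
final class ReduceProgressSketch {

    /** Current scheme: copy, sort and reduce each contribute one third. */
    static double threePhase(double copy, double sort, double reduce) {
        return (copy + sort + reduce) / 3.0;
    }

    /** Proposed scheme: shuffle is 50%, final merge + reduce is 50%. */
    static double twoPhase(double shuffle, double mergeAndReduce) {
        return 0.5 * shuffle + 0.5 * mergeAndReduce;
    }

    /**
     * As described in the comment: the final merge is skipped when the
     * number of files is below io.sort.factor.
     */
    static boolean finalMergeOccurs(int numFiles, int ioSortFactor) {
        return numFiles >= ioSortFactor;
    }

    public static void main(String[] args) {
        int numFiles = 8;       // assumed example value
        int ioSortFactor = 10;  // assumed example value

        // Shuffle has finished and the reducer has not consumed anything yet.
        System.out.printf("final merge occurs: %b%n",
                finalMergeOccurs(numFiles, ioSortFactor));

        // With three phases, a skipped merge still accounts for a third of
        // the reported progress, so the value jumps without any work done.
        System.out.printf("three-phase before sort: %.3f%n", threePhase(1.0, 0.0, 0.0));
        System.out.printf("three-phase after sort:  %.3f%n", threePhase(1.0, 1.0, 0.0));

        // With two phases, progress at this point reflects only the shuffle.
        System.out.printf("two-phase: %.3f%n", twoPhase(1.0, 0.0));
    }
}
```

With this split, a skipped final merge no longer contributes a block of progress 
on its own; whatever merging does happen is counted together with the reduce.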



> Shuffle should be broken down to only two phases (copy/reduce) instead of 
> three (copy/sort/reduce)
> --------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-956
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-956
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.21.0
>            Reporter: Jothi Padmanabhan
>
> For the progress calculations and display on the UI, shuffle, in its 
> current form, is decomposed into three phases (copy/sort/reduce). Actually, 
> the sort phase is no longer applicable. I think we should just reduce the 
> number of phases to two and assign 50% weightage to each of the copy and 
> reduce phases. Thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
