[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated MAPREDUCE-5059:
-----------------------------------------------

    Status: Open  (was: Patch Available)

bq. Personally I think it'd be more useful to calculate merge time as the time 
delta between the end of the shuffle and the start of the reduce phase (i.e.: 
what the task attempts page is already showing for each reduce task attempt). I 
understand that the shuffle and merge phases overlap in practice, but really 
what we're looking for here is excessive time spent merging even after the data 
has been shuffled to the reducer.
+1 for doing this. I agree that it is more useful this way.

The patch looks mostly good to me.
 - Can you remove the old code completely, instead of commenting it out?
 - In the test, I couldn't figure out why the expected time was 50. Can you add 
a short comment saying it's an average of 45 and 55, or something of that sort?

Also, the time shown is sort+merge time measured from shuffleEnd; I'm not sure 
how we can make that clear.

                
> Job overview shows average merge time larger than for any reduce attempt
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5059
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5059
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobhistoryserver, webapps
>            Reporter: Jason Lowe
>            Assignee: Omkar Vinit Joshi
>         Attachments: MAPREDUCE-5059-20130325.patch
>
>
> When looking at a job overview page on the history server, the Average Merge 
> Time is often reported with a value that is far larger than the Elapsed Merge 
> Time shown for any reduce task attempt.  The job overview page calculates the 
> merge time as the time delta between the sort finishing and the job launching 
> while the attempts page calculates it as the time delta between the sort 
> finishing and the shuffle finishing.
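
To make the difference between the two calculations concrete, here is a minimal 
sketch. It is not the history server code: the class, field, and method names 
are hypothetical, and the timestamps are invented so that the per-attempt merge 
times come out as 45 and 55, the values guessed at in the review comment above.

```java
import java.util.Arrays;
import java.util.List;

public class MergeTimeSketch {

    /** Hypothetical holder for one reduce attempt's timestamps, in milliseconds. */
    static final class ReduceAttempt {
        final long shuffleFinishTime;
        final long sortFinishTime;

        ReduceAttempt(long shuffleFinishTime, long sortFinishTime) {
            this.shuffleFinishTime = shuffleFinishTime;
            this.sortFinishTime = sortFinishTime;
        }
    }

    /** Job overview calculation as described in the report: sort finish minus job launch. */
    static long avgMergeFromJobLaunch(List<ReduceAttempt> attempts, long jobLaunchTime) {
        if (attempts.isEmpty()) {
            return 0L; // avoid dividing by zero when there are no reduce attempts
        }
        long total = 0L;
        for (ReduceAttempt a : attempts) {
            total += a.sortFinishTime - jobLaunchTime;
        }
        return total / attempts.size();
    }

    /** Attempts page calculation: sort finish minus shuffle finish. */
    static long avgMergeFromShuffleEnd(List<ReduceAttempt> attempts) {
        if (attempts.isEmpty()) {
            return 0L; // avoid dividing by zero when there are no reduce attempts
        }
        long total = 0L;
        for (ReduceAttempt a : attempts) {
            total += a.sortFinishTime - a.shuffleFinishTime;
        }
        return total / attempts.size();
    }

    public static void main(String[] args) {
        long jobLaunchTime = 1000L; // invented value
        List<ReduceAttempt> attempts = Arrays.asList(
            new ReduceAttempt(1100L, 1145L),  // sort finished 45 ms after shuffle
            new ReduceAttempt(1200L, 1255L)); // sort finished 55 ms after shuffle
        System.out.println("from job launch:  " + avgMergeFromJobLaunch(attempts, jobLaunchTime));
        System.out.println("from shuffle end: " + avgMergeFromShuffleEnd(attempts));
    }
}
```

With these invented numbers the job-launch-based average is 200, larger than 
either attempt's own merge time, while the shuffle-end-based average is 50.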
