[
https://issues.apache.org/jira/browse/HADOOP-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674480#action_12674480
]
Devaraj Das commented on HADOOP-5210:
-------------------------------------
This could be because of the way we compute mergeProgress during merges in the
reduce. mergeProgress is a function of totalBytesProcessed, and
totalBytesProcessed is incremented for every segment considered during a merge.
So with multi-level merges, we report more progress per byte than we should:
many bytes are written to disk by an intermediate merge and are then considered
again at the next merge level, and so on. Those bytes are counted more than
once, which can push the reported progress past 100%.
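
To illustrate the suspected double counting, here is a minimal sketch. It is a
simplified model, not the actual Hadoop merge code: the class
MergeProgressSketch, the method naiveProgress, and the mergeFactor parameter
are hypothetical names made up for this example. It assumes that every
intermediate merge adds the bytes it reads to totalBytesProcessed and writes
one merged segment back, and that the final merge reads all remaining segments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the progress computation; not the real Hadoop classes.
class MergeProgressSketch {

    // Returns totalBytesProcessed / totalInputBytes after a multi-level merge.
    // A value above 1.0 corresponds to a reported progress above 100%.
    static double naiveProgress(long[] segmentSizes, int mergeFactor) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<Long> segments = new ArrayList<>();
        long totalInputBytes = 0;
        for (long size : segmentSizes) {
            segments.add(size);
            totalInputBytes += size;
        }
        if (totalInputBytes == 0) {
            return 0.0;
        }

        long totalBytesProcessed = 0;

        // Intermediate merges: merge mergeFactor segments into one segment on
        // disk, counting every byte read as processed.
        while (segments.size() > mergeFactor) {
            long merged = 0;
            for (int i = 0; i < mergeFactor; i++) {
                merged += segments.remove(0);
            }
            totalBytesProcessed += merged;
            segments.add(merged); // these bytes will be read again later
        }

        // Final merge: every remaining byte is read once more.
        for (long size : segments) {
            totalBytesProcessed += size;
        }

        return (double) totalBytesProcessed / totalInputBytes;
    }
}
```

In this model, four 100-byte segments with mergeFactor 4 need a single merge
and give a ratio of 1.0, while the same segments with mergeFactor 2 need two
intermediate merges before the final one and give a ratio of 2.0, i.e. 200%.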
> Reduce Task Progress shows > 100% when the total size of map outputs (for a
> single reducer) is high
> ----------------------------------------------------------------------------------------------------
>
> Key: HADOOP-5210
> URL: https://issues.apache.org/jira/browse/HADOOP-5210
> Project: Hadoop Core
> Issue Type: Bug
> Reporter: Jothi Padmanabhan
> Priority: Minor
> Attachments: Picture 3.png
>
>
> When the total size of the map outputs (the reduce input size) is high, the
> reported progress is greater than 100%.