[ https://issues.apache.org/jira/browse/HADOOP-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653768#action_12653768 ]
he yongqiang commented on HADOOP-4749:
--------------------------------------
While the input data size of a reducer can be collected on the reducer side
once the copy phase is done, it seems that the reducer's input record count
cannot be collected at that point.
Maybe this information can be collected on the map side instead?
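
To make the map-side idea concrete, here is a rough sketch in plain Java. It
is not Hadoop code: the PartitionStats class and its methods are hypothetical
names made up for illustration. It assumes each map task can tally bytes and
records per reduce partition as it emits output, and that something can later
sum those tallies across all maps.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical per-map-task tally of output bytes and records per reduce partition. */
public class PartitionStats {
    public final long[] bytes;
    public final long[] records;

    public PartitionStats(int numReducers) {
        bytes = new long[numReducers];
        records = new long[numReducers];
    }

    /** Called once per map output record, after a partition has been chosen for it. */
    public void add(int partition, int keyLength, int valueLength) {
        bytes[partition] += keyLength + valueLength;
        records[partition]++;
    }

    /** Sums the tallies of all map tasks to estimate each reducer's input. */
    public static PartitionStats merge(List<PartitionStats> perMap, int numReducers) {
        PartitionStats total = new PartitionStats(numReducers);
        for (PartitionStats s : perMap) {
            for (int r = 0; r < numReducers; r++) {
                total.bytes[r] += s.bytes[r];
                total.records[r] += s.records[r];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Two map tasks feeding two reducers; the sizes are made-up sample values.
        PartitionStats map0 = new PartitionStats(2);
        map0.add(0, 4, 100);
        map0.add(1, 4, 20);
        PartitionStats map1 = new PartitionStats(2);
        map1.add(0, 4, 300);

        PartitionStats total = merge(Arrays.asList(map0, map1), 2);
        for (int r = 0; r < 2; r++) {
            System.out.println("reducer " + r + ": " + total.bytes[r]
                    + " bytes, " + total.records[r] + " records");
        }
    }
}
```

The byte counts here are raw key and value lengths, so they need not match
what a reducer actually copies if the map output is compressed.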
> reducer should output input data size and record count when shuffling is done
> -----------------------------------------------------------------------------
>
> Key: HADOOP-4749
> URL: https://issues.apache.org/jira/browse/HADOOP-4749
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Zheng Shao
>
> Sometimes we see a single slow reducer because of a load-balancing problem.
> Each reducer's input data size and record count would be very useful for
> understanding how imbalanced the load is.
> This should be easy to fix, I guess, since the reducer should have all the
> information needed at the end of the shuffle phase.
--
This message is automatically generated by JIRA.