[ https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426365#comment-13426365 ]
Tsuyoshi OZAWA commented on MAPREDUCE-4502: ------------------------------------------- We prototyped the combining the result of maps per node/rack. the the result of our benchmark is available at mapreduce-dev ML(http://goo.gl/wWBWQ). > Multi-level aggregation with combining the result of maps per node/rack > ----------------------------------------------------------------------- > > Key: MAPREDUCE-4502 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster, mrv2 > Reporter: Tsuyoshi OZAWA > > The shuffle costs is expensive in Hadoop in spite of the existence of > combiner, because the scope of combining is limited within only one MapTask. > To solve this problem, it's a good way to aggregate the result of maps per > node/rack by launch combiner. > This JIRA is to implement the multi-level aggregation infrastructure, > including combining per container(MAPREDUCE-3902 is related), coordinating > containers by application master without breaking fault tolerance of jobs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira