[ https://issues.apache.org/jira/browse/MAPREDUCE-4502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13454499#comment-13454499 ]
Karthik Kambatla commented on MAPREDUCE-4502: --------------------------------------------- Hope I am not severely digressing here. Just thinking out loud. Would be great if the more experienced campaigners can weigh-in here. bq. Would it make sense to do this either at the end of a map (coordinated by the AM) or, as you suggest, as a ReduceTask rather than a daemon? This might make even more sense taking [acmurthy]'s suggestion on MAPREDUCE-199 (data locality hints for reduce tasks) into consideration. Arun's suggestion (rephrased) was to schedule the reducers based on the distribution of keys across map-outputs. Putting the pieces together, it might be even more beneficial to do the following: # Perform node-level aggregation (reduce) at the end of maps in co-ordination with AM. # Perform rack-level aggregation at the end of node-level aggregation again in co-ordination with AM. The aggregation could be performed in parallel across the involved nodes such that each node has aggregated values of different keys. # Schedule reducers taking the key-distribution into account across racks. The con will be that the shuffle won't be asynchronous to map computation, but hopefully this wouldn't offset the gains of decreased network and disk I/O. PS. http://dl.acm.org/citation.cfm?id=1901088 documents the advantages of multi-level aggregation in the context of graph algorithms modeled as iterative MR jobs. > Multi-level aggregation with combining the result of maps per node/rack > ----------------------------------------------------------------------- > > Key: MAPREDUCE-4502 > URL: https://issues.apache.org/jira/browse/MAPREDUCE-4502 > Project: Hadoop Map/Reduce > Issue Type: Improvement > Components: applicationmaster, mrv2 > Reporter: Tsuyoshi OZAWA > Assignee: Tsuyoshi OZAWA > Attachments: speculative_draft.pdf > > > The shuffle costs is expensive in Hadoop in spite of the existence of > combiner, because the scope of combining is limited within only one MapTask. > To solve this problem, it's a good way to aggregate the result of maps per > node/rack by launch combiner. > This JIRA is to implement the multi-level aggregation infrastructure, > including combining per container(MAPREDUCE-3902 is related), coordinating > containers by application master without breaking fault tolerance of jobs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira