[ https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556405#comment-13556405 ]
Mariappan Asokan commented on MAPREDUCE-4808:
---------------------------------------------

Hi Arun,
  MAPREDUCE-4049 expects the plugin implementer to implement the shuffle from scratch. Because the default HTTP shuffle implementation is robust and secure, it can be reused in the majority of situations. The alternate implementation of MapOutput can be left to the plugin implementer. For example, it can be optimized to use less JVM memory and minimize Java garbage collection.

Concrete use cases for the plugin include hash aggregation, hash join, and limit-N queries.

Thanks.

-- Asokan

> Refactor MapOutput and MergeManager to facilitate reuse by Shuffle
> implementations
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4808
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Arun C Murthy
>            Assignee: Mariappan Asokan
>         Attachments: COMBO-mapreduce-4809-4812-4808.patch,
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch,
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch,
> mapreduce-4808.patch, MergeManagerPlugin.pdf
>
>
> Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for
> alternate implementations to be able to reuse portions of the default
> implementation.
> This would come with the strong caveat that these classes are LimitedPrivate
> and Unstable.