[ https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556536#comment-13556536 ]
Chris Douglas commented on MAPREDUCE-4808:
------------------------------------------

Asokan, the concern is that breaking an API, even one marked unstable, is an incompatible change. Since the pluggable shuffle is particularly useful for frameworks, breaking this contract could require patching, validation, or rewriting of plugin and optimizer code in the projects that invest in it (Hive, Pig, etc.). Moreover, if we wanted to change the default {{Shuffle}} to a different implementation, user/framework code would perform badly, or break, unless we exposed this implementation-specific mechanism in the _new_ impl. So it's fair to press for use cases, to ensure the API is _sufficient_ and that the abstraction could apply to most {{Shuffle}} implementations.

Personally, I'm ambivalent about exposing this as an API and am +1 on the patch overall (mostly because I like the {{MapOutput}} refactoring). The user can always configure the current {{Shuffle}}, which is exactly how frameworks would handle this until they port/specialize their efficient {{MergeManager}} plugin.

As a compromise, would it make sense to just add a protected {{createMergeManager}} method to the {{Shuffle}}? The user still needs to configure their custom {{Shuffle}} impl now, but that's better than the inevitable future where they configure both. It also makes the merge manager's tie to this implementation explicit.
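
To make the proposed compromise concrete, here is a minimal sketch of a protected factory-method hook. It is not the patch and does not use Hadoop's real classes: {{MergeManager}}, {{Shuffle}}, and {{createMergeManager}} are simplified stand-ins borrowing the names from the comment, while {{DefaultMergeManager}}, {{ReverseMergeManager}}, {{FrameworkShuffle}}, and the String-based "map outputs" are hypothetical, invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-in for the merge abstraction (not Hadoop's interface).
interface MergeManager {
    void accept(String mapOutput);   // receive one fetched map output
    List<String> close();            // finish merging and return the result
}

// Stand-in for the default, implementation-specific merge: sorts ascending.
class DefaultMergeManager implements MergeManager {
    protected final List<String> outputs = new ArrayList<>();

    @Override
    public void accept(String mapOutput) {
        outputs.add(mapOutput);
    }

    @Override
    public List<String> close() {
        List<String> merged = new ArrayList<>(outputs);
        Collections.sort(merged);
        return merged;
    }
}

// Stand-in for the default Shuffle. The merge manager is not a separately
// configured plugin; it is reachable only by subclassing this Shuffle.
class Shuffle {
    // Protected factory hook: the only extension point for the merge step.
    protected MergeManager createMergeManager() {
        return new DefaultMergeManager();
    }

    public List<String> run(List<String> fetched) {
        MergeManager merger = createMergeManager();
        for (String output : fetched) {
            merger.accept(output);
        }
        return merger.close();
    }
}

// Hypothetical framework-specific merge: same inputs, descending order.
class ReverseMergeManager extends DefaultMergeManager {
    @Override
    public List<String> close() {
        List<String> merged = new ArrayList<>(outputs);
        merged.sort(Collections.reverseOrder());
        return merged;
    }
}

// A framework configures this one Shuffle subclass instead of two plugins.
class FrameworkShuffle extends Shuffle {
    @Override
    protected MergeManager createMergeManager() {
        return new ReverseMergeManager();
    }
}
```

In this sketch a framework that wants a specialized merge configures only {{FrameworkShuffle}}. Because the hook lives on this particular {{Shuffle}}, a different default implementation would be under no obligation to honor it, which is the explicit coupling the comment argues for.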
> Refactor MapOutput and MergeManager to facilitate reuse by Shuffle implementations
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4808
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Arun C Murthy
>            Assignee: Mariappan Asokan
>         Attachments: COMBO-mapreduce-4809-4812-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, MergeManagerPlugin.pdf
>
>
> Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for alternate implementations to be able to reuse portions of the default implementation.
> This would come with the strong caveat that these classes are LimitedPrivate and Unstable.