[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556536#comment-13556536
 ] 

Chris Douglas commented on MAPREDUCE-4808:
------------------------------------------

Asokan, the concern is that breaking an API, even one marked unstable, 
is an incompatible change. Since the pluggable shuffle is particularly useful 
for frameworks, breaking this contract could require 
patching/validation/rewrite of plugin and optimizer code in projects that 
invest in it (Hive, Pig, etc.). Moreover, if we wanted to change the default 
{{Shuffle}} to a different implementation, then user/framework code would 
perform badly, or break, unless we exposed this implementation-specific 
mechanism in the _new_ impl. So it's fair to press for use cases, to ensure 
it's _sufficient_ and that the abstraction could apply to most {{Shuffle}} 
implementations.

Personally, I'm ambivalent about exposing this as an API and am +1 on the patch 
overall (mostly because I like the {{MapOutput}} refactoring). The user can 
always configure the current {{Shuffle}}, which is exactly how frameworks would 
handle this until they port/specialize their efficient {{MergeManager}} plugin.

As a compromise, would it make sense to just add a protected 
{{createMergeManager}} method to the {{Shuffle}}? The user still needs to 
configure their custom {{Shuffle}} impl now, but that's better than the 
inevitable future where they configure both. It also makes its tie to this 
implementation explicit.
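
A minimal sketch of that compromise, under stated assumptions: every class and interface below is a hypothetical stand-in rather than Hadoop's real {{Shuffle}} or {{MergeManager}} signatures. Only the {{createMergeManager}} name comes from the proposal above. It illustrates a protected factory method that a subclass overrides to supply its own merge logic:

```java
// Hypothetical stand-in for the merge abstraction; NOT Hadoop's real interface.
interface SketchMergeManager {
    String describe();
}

// Hypothetical default merge implementation shipped with the default shuffle.
class DefaultSketchMergeManager implements SketchMergeManager {
    @Override
    public String describe() {
        return "default merge";
    }
}

// Hypothetical stand-in for the default Shuffle implementation.
class SketchShuffle {
    private SketchMergeManager merger;

    // Called once before use; obtains the merger through the factory method.
    public void init() {
        merger = createMergeManager();
    }

    // The proposed extension point: protected, so it is visible only to
    // subclasses of this particular Shuffle and not part of a general API.
    protected SketchMergeManager createMergeManager() {
        return new DefaultSketchMergeManager();
    }

    public String run() {
        return "shuffle using " + merger.describe();
    }
}

// A framework that wants its own merge logic subclasses the shuffle it
// already has to configure, and overrides only the factory method.
class FrameworkShuffle extends SketchShuffle {
    @Override
    protected SketchMergeManager createMergeManager() {
        return () -> "framework-specific merge";
    }
}
```

With this shape the user configures only the {{Shuffle}} subclass, and the merge hook stays visibly tied to this one implementation instead of becoming a second, separately configured plugin.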
                
> Refactor MapOutput and MergeManager to facilitate reuse by Shuffle 
> implementations
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4808
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Arun C Murthy
>            Assignee: Mariappan Asokan
>         Attachments: COMBO-mapreduce-4809-4812-4808.patch, 
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch, 
> mapreduce-4808.patch, MergeManagerPlugin.pdf
>
>
> Now that Shuffle is pluggable (MAPREDUCE-4049), it would be convenient for 
> alternate implementations to be able to reuse portions of the default 
> implementation. 
> This would come with the strong caveat that these classes are LimitedPrivate 
> and Unstable.
