[ 
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527520#comment-13527520
 ] 

Mariappan Asokan commented on MAPREDUCE-4808:
---------------------------------------------

Hi Arun,
   Thanks for your feedback.  Perhaps I should mention some use cases of a 
MergeManager plugin in addition to the technical details of the design 
mentioned here as well as in MAPREDUCE-4812.

A MergeManager plugin would allow us, and any other implementer of the plugin, 
to perform a variety of additional transformations such as copy, limit-N query 
(MAPREDUCE-1928), full join, and hashed aggregation more efficiently.  Since 
the shuffle code is already available in the framework, we want to make use of 
it.  In my opinion, the framework shuffle code seems to be stable in MRv2.

Making Merger pluggable would not add much value.  If I understand 
correctly, it would allow plugin implementers to implement only a single pass 
of the merge.  The overall merge would still be driven by MergeManager.  Also, 
the only operation possible would be a merge.  Any additional transformation 
would have to be done in the Reducer, which is often not very efficient.
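
To make the contrast concrete, here is a rough sketch in plain Java.  All 
names in it (PluggableMergeManager, DefaultMergeManager, LimitNMergeManager, 
drain) are hypothetical; they are not the actual Hadoop classes or the 
interfaces in the attached patches.  It assumes in-memory sorted segments and 
only illustrates why a plugin that drives the overall merge can apply a 
limit-N transformation before any record reaches the Reducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

// Hypothetical sketch; none of these types are part of the Hadoop API.
public class MergeManagerSketch {

    /** Hypothetical plugin point: the implementer owns the whole merge. */
    interface PluggableMergeManager<K> {
        Iterator<K> merge(List<List<K>> sortedSegments, Comparator<K> cmp);
    }

    /** Plain k-way merge over already sorted in-memory segments. */
    static class DefaultMergeManager<K> implements PluggableMergeManager<K> {
        public Iterator<K> merge(List<List<K>> segs, Comparator<K> cmp) {
            // Each heap entry is a cursor {segmentIndex, offsetInSegment}.
            PriorityQueue<int[]> heap = new PriorityQueue<>(
                (a, b) -> cmp.compare(segs.get(a[0]).get(a[1]),
                                      segs.get(b[0]).get(b[1])));
            for (int i = 0; i < segs.size(); i++) {
                if (!segs.get(i).isEmpty()) {
                    heap.add(new int[] {i, 0});
                }
            }
            return new Iterator<K>() {
                public boolean hasNext() {
                    return !heap.isEmpty();
                }
                public K next() {
                    if (heap.isEmpty()) {
                        throw new NoSuchElementException();
                    }
                    int[] cur = heap.poll();
                    List<K> seg = segs.get(cur[0]);
                    K value = seg.get(cur[1]);
                    // Advance the cursor within the same segment, if any left.
                    if (cur[1] + 1 < seg.size()) {
                        heap.add(new int[] {cur[0], cur[1] + 1});
                    }
                    return value;
                }
            };
        }
    }

    /** Limit-N variant: stops the merge early instead of filtering later. */
    static class LimitNMergeManager<K> implements PluggableMergeManager<K> {
        private final int limit;
        private final PluggableMergeManager<K> delegate =
            new DefaultMergeManager<>();

        LimitNMergeManager(int limit) {
            this.limit = limit;
        }

        public Iterator<K> merge(List<List<K>> segs, Comparator<K> cmp) {
            Iterator<K> all = delegate.merge(segs, cmp);
            return new Iterator<K>() {
                private int emitted = 0;
                public boolean hasNext() {
                    return emitted < limit && all.hasNext();
                }
                public K next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    emitted++;
                    return all.next();
                }
            };
        }
    }

    /** Collects an iterator into a list, standing in for the Reducer input. */
    static <K> List<K> drain(Iterator<K> it) {
        List<K> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> segs = Arrays.asList(
            Arrays.asList(1, 4, 7),
            Arrays.asList(2, 5, 8),
            Arrays.asList(3, 6, 9));
        PluggableMergeManager<Integer> mm = new LimitNMergeManager<>(4);
        // Prints [1, 2, 3, 4]
        System.out.println(drain(mm.merge(segs, Comparator.naturalOrder())));
    }
}
```

With only a pluggable single-pass Merger, the same limit would have to be 
enforced inside the Reducer after the full merge had already been done.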

I hope this clarifies the usefulness of allowing MergeManager to be pluggable.  
Please feel free to ask if you have any questions.

Thanks.

-- Asokan
                
> Allow reduce-side merge to be pluggable
> ---------------------------------------
>
>                 Key: MAPREDUCE-4808
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>    Affects Versions: 2.0.2-alpha
>            Reporter: Arun C Murthy
>            Assignee: Mariappan Asokan
>             Fix For: 2.0.3-alpha
>
>         Attachments: COMBO-mapreduce-4809-4812-4808.patch, 
> mapreduce-4808.patch
>
>
> Allow reduce-side merge to be pluggable for MAPREDUCE-2454

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
