[
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13550858#comment-13550858
]
Jerry Chen commented on MAPREDUCE-4808:
---------------------------------------
Hi Arun and Asokan,
I am trying out an implementation of a hash-based reduce as an alternative to
the global sort and merge. First of all, Asokan's work is valuable in making
this implementation possible. I have tried to use the interfaces in this patch
to plug in a HashMergeManager. For the most part, the current interface
satisfies the hash merge manager's needs, but I am not sure whether the
interface will "suffice for the future" for others, as Arun pointed out.
The problem I encountered during the implementation is reusing the common code
of the MergeManager. HashMergeManager shares many features and much code with
the MergeManager, such as OnDiskMapOutput, InMemoryMapOutput, InMemoryReader
and MergeThread (PartitionThread actually). But the current versions of these
classes hold references to MergeManager and thus cannot be reused by a
HashMergeManager. I had to make another copy of these classes and modify them
to refer to HashMergeManager. This generates duplicated code, which is not
desirable.
It would be best either to make MergeManager inheritable (by making some
private members protected) or to abstract out OnDiskMapOutput,
InMemoryMapOutput and the others into utility classes that can be shared by
"lightweight" merge managers that have a lot in common with MergeManager and
change only a few small aspects.
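To illustrate the second option, here is a minimal sketch of what I mean. It
is not code from the patch: MapOutputSink, the Sketch* classes and their
methods are hypothetical names invented for this example, and the real classes
carry far more state. The only point is that the map output class depends on a
small interface instead of on MergeManager itself, so a sort-style and a
hash-style manager can both reuse it without copying.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback interface: the only thing a map output needs
// to know about whichever manager owns it.
interface MapOutputSink {
    void outputCommitted(SketchInMemoryMapOutput output);
}

// Simplified stand-in for a shared map output class. It refers to the
// interface above, never to a concrete merge manager.
final class SketchInMemoryMapOutput {
    private final MapOutputSink sink;
    private final String key;
    private final int value;

    SketchInMemoryMapOutput(MapOutputSink sink, String key, int value) {
        this.sink = sink;
        this.key = key;
        this.value = value;
    }

    String key() { return key; }
    int value() { return value; }

    // Hand the finished output back to the owning manager.
    void commit() { sink.outputCommitted(this); }
}

// Sort-style manager (assumption for illustration): collects keys and
// returns them in sorted order.
final class SketchSortMergeManager implements MapOutputSink {
    private final List<String> keys = new ArrayList<>();

    @Override
    public void outputCommitted(SketchInMemoryMapOutput output) {
        keys.add(output.key());
    }

    List<String> sortedKeys() {
        List<String> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }
}

// Hash-style manager (assumption for illustration): groups values by key
// in a hash table, with no global sort.
final class SketchHashMergeManager implements MapOutputSink {
    private final Map<String, List<Integer>> groups = new HashMap<>();

    @Override
    public void outputCommitted(SketchInMemoryMapOutput output) {
        groups.computeIfAbsent(output.key(), k -> new ArrayList<>())
              .add(output.value());
    }

    List<Integer> valuesFor(String key) {
        return groups.getOrDefault(key, Collections.emptyList());
    }
}

class MergeSketchDemo {
    public static void main(String[] args) {
        SketchHashMergeManager hash = new SketchHashMergeManager();
        new SketchInMemoryMapOutput(hash, "b", 2).commit();
        new SketchInMemoryMapOutput(hash, "a", 1).commit();
        new SketchInMemoryMapOutput(hash, "b", 3).commit();
        System.out.println(hash.valuesFor("b"));
    }
}
```

Both managers receive the same SketchInMemoryMapOutput instances unchanged,
which is the kind of reuse I could not get with the current classes.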
Jerry
> Allow reduce-side merge to be pluggable
> ---------------------------------------
>
> Key: MAPREDUCE-4808
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Affects Versions: 2.0.2-alpha
> Reporter: Arun C Murthy
> Assignee: Mariappan Asokan
> Fix For: 2.0.3-alpha
>
> Attachments: COMBO-mapreduce-4809-4812-4808.patch,
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch,
> mapreduce-4808.patch, mapreduce-4808.patch, MergeManagerPlugin.pdf
>
>
> Allow reduce-side merge to be pluggable for MAPREDUCE-2454