[ https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated MAPREDUCE-4808:
-------------------------------------
Status: Open (was: Patch Available)
Asokan, sorry for the delay; I've been away traveling home during the
holidays.
I have more comments, but I'll put some here to keep the discussion going.
Thanks for the design doc, but I was looking for thoughts on *how* the plugin
was going to be used for the use-cases you've mentioned (hash-join etc.),
design alternatives etc.
IAC, taking a step back, the 'goal' here is to make the 'merge' pluggable.
Reduce-side has 2 pieces:
# Shuffle - Move data from maps to the reduce.
# Merge - Merge already sorted map-outputs.
The rest (MergeManager etc.) are merely implementation details to manage memory
etc., which become irrelevant in several scenarios as soon as we consider
alternatives to the current HTTP-based shuffle (several alternatives exist, such
as RDMA etc.).
Your current approach tries to encapsulate and enshrine the current
implementation of the reduce task, which I'm not wild about. By this I mean,
you are focussing too much on the current state and trying to make interfaces
which are unnecessary for now and might not suffice for the future.
I really don't think we should be tying Shuffle & Merge as you have done by
introducing yet another new interface (regardless of whether it's public or
not).
As I've noted above, adding a simple 'Merge' interface with one 'merge' call
will address all of the use-cases you have outlined. If not, let's discuss.
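To make the suggestion concrete, here is a minimal sketch of what a single-call
merge plugin point could look like. This is an illustration only, not the
actual Hadoop API or the interface in any attached patch: the names `Merge` and
`KWayMerge` are hypothetical, and plain Java iterators stand in for the real
map-output segments. The default implementation shown is an ordinary heap-based
k-way merge of already sorted inputs.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical plugin point with a single 'merge' call (not a Hadoop interface). */
interface Merge<K> {
    /** Combines already sorted inputs into one sorted stream. */
    Iterator<K> merge(List<Iterator<K>> sortedInputs, Comparator<K> comparator);
}

/** Assumed default implementation: heap-based k-way merge of sorted inputs. */
class KWayMerge<K> implements Merge<K> {

    /** The current head element of one input, plus the input it came from. */
    private static final class Head<K> {
        final K value;
        final Iterator<K> source;

        Head(K value, Iterator<K> source) {
            this.value = value;
            this.source = source;
        }
    }

    @Override
    public Iterator<K> merge(List<Iterator<K>> sortedInputs, Comparator<K> comparator) {
        Comparator<Head<K>> byValue = (a, b) -> comparator.compare(a.value, b.value);
        final PriorityQueue<Head<K>> heap = new PriorityQueue<>(byValue);

        // Seed the heap with the first element of every non-empty input.
        for (Iterator<K> input : sortedInputs) {
            if (input.hasNext()) {
                heap.add(new Head<>(input.next(), input));
            }
        }

        return new Iterator<K>() {
            @Override
            public boolean hasNext() {
                return !heap.isEmpty();
            }

            @Override
            public K next() {
                // remove() throws NoSuchElementException when the heap is empty.
                Head<K> smallest = heap.remove();
                // Refill the heap from the input that supplied the smallest element.
                if (smallest.source.hasNext()) {
                    heap.add(new Head<>(smallest.source.next(), smallest.source));
                }
                return smallest.value;
            }
        };
    }
}
```

In a sketch like this, an alternative such as a hash-join would be another
implementation of the same one-method interface, and nothing in it depends on
how the shuffle delivered the inputs.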
> Allow reduce-side merge to be pluggable
> ---------------------------------------
>
> Key: MAPREDUCE-4808
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Affects Versions: 2.0.2-alpha
> Reporter: Arun C Murthy
> Assignee: Mariappan Asokan
> Fix For: 2.0.3-alpha
>
> Attachments: COMBO-mapreduce-4809-4812-4808.patch,
> mapreduce-4808.patch, mapreduce-4808.patch, mapreduce-4808.patch,
> mapreduce-4808.patch, mapreduce-4808.patch, MergeManagerPlugin.pdf
>
>
> Allow reduce-side merge to be pluggable for MAPREDUCE-2454