[ https://issues.apache.org/jira/browse/MAPREDUCE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505252#comment-13505252 ]
Arun C Murthy commented on MAPREDUCE-4049:
------------------------------------------

Sorry, just caught up on this since I'm dealing with some health issues at home.

Frankly, worrying about whose work is a subset of whose is a pointless exercise. Having said that, making related tasks sub-tasks makes sense as long as there is a coherent community (one or more developers) working together; I don't see that for MAPREDUCE-4049 vis-a-vis MAPREDUCE-2454. In any case, there is no need to debate this further - it's just a time sink.

Finally, MAPREDUCE-2454 is a bunch of large-scale changes. I'm happy to commit this as long as it's ready, without tying it in.

----

Overall, I really don't like to see us egregiously rename core MR classes - at best it's pointless for private APIs, and at worst it hammers svn log. So, please do not change the existing Shuffle etc.

Avner, please upload a patch with the other changes:
# Use @LimitedPrivate; that way it is clear that this is for implementers and not end-users.
# I'm OK with the suggested config names (again, I'm not religious about naming).

With that it's good to go.

> plugin for generic shuffle service
> ----------------------------------
>
>                 Key: MAPREDUCE-4049
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4049
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: performance, task, tasktracker
>    Affects Versions: 1.0.3, 1.1.0, 2.0.0-alpha, 3.0.0
>            Reporter: Avner BenHanoch
>              Labels: merge, plugin, rdma, shuffle
>             Fix For: trunk
>
>         Attachments: HADOOP-1.x.y.patch, Hadoop Shuffle Plugin Design.rtf,
> mapreduce-4049.patch, mapreduce-4049.patch, mapreduce-4049.patch,
> mapreduce-4049.patch
>
>
> Support generic shuffle service as a set of two plugins: ShuffleProvider &
> ShuffleConsumer.
> This will satisfy the following needs:
> # Better shuffle and merge performance.
> For example: we are working on a shuffle plugin that performs shuffle over
> RDMA in fast networks (10GbE, 40GbE, or InfiniBand) instead of using the
> current HTTP shuffle. Based on the fast RDMA shuffle, the plugin can also
> use a suitable merge approach during the intermediate merges, resulting in
> much better performance.
> # Satisfy MAPREDUCE-3060 - generic shuffle service for avoiding hidden
> dependency of NodeManager with a specific version of mapreduce shuffle
> (currently targeted to 0.24.0).
>
> References:
> # Hadoop Acceleration through Network Levitated Merging, by Prof. Weikuan Yu
> from Auburn University and others,
> [http://pasl.eng.auburn.edu/pubs/sc11-netlev.pdf]
> # I am attaching 2 documents with a suggested top-level design for both
> plugins (currently based on the 1.0 branch).
> # I am providing a link for downloading UDA - Mellanox's open source plugin
> that implements a generic shuffle service using RDMA and levitated merge.
> Note: At this phase, the code is in C++ through JNI and you should consider
> it as beta only. Still, it can serve anyone who wants to implement or
> contribute to levitated merge. (Please be advised that levitated merge is
> mostly suited to very fast networks) -
> [http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=144&menu_section=69]
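
For readers following the thread, here is a rough sketch of the two review points above: a consumer plugin interface marked for implementers only, and an implementation selected by a config value without renaming the existing default. It is not the code from the attached patches. The nested `LimitedPrivate` annotation is a local stand-in for Hadoop's `InterfaceAudience.LimitedPrivate` so the example compiles on its own, and the config key, the `fetch` method, and the two implementation classes are hypothetical names chosen for illustration.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.HashMap;
import java.util.Map;

public class ShufflePluginSketch {

    // Local stand-in for Hadoop's InterfaceAudience.LimitedPrivate (assumption:
    // defined here only so the sketch is self-contained).
    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @interface LimitedPrivate {
        String[] value();
    }

    // Hypothetical consumer-side plugin contract, marked as intended for
    // plugin implementers rather than end-users.
    @LimitedPrivate({"MapReduce"})
    interface ShuffleConsumer {
        String fetch(String mapOutputId);
    }

    // Stand-in for the existing default shuffle; it is left as-is, not renamed.
    static class HttpShuffleConsumer implements ShuffleConsumer {
        public String fetch(String mapOutputId) {
            return "http:" + mapOutputId;
        }
    }

    // Stand-in for an alternative plugin, for example one that uses RDMA.
    static class RdmaShuffleConsumer implements ShuffleConsumer {
        public String fetch(String mapOutputId) {
            return "rdma:" + mapOutputId;
        }
    }

    // Hypothetical config key; the names proposed in the patch are not shown here.
    static final String CONSUMER_KEY = "example.shuffle.consumer.class";

    // Instantiate the configured consumer, falling back to the default one.
    static ShuffleConsumer loadConsumer(Map<String, String> conf) {
        String className =
            conf.getOrDefault(CONSUMER_KEY, HttpShuffleConsumer.class.getName());
        try {
            return Class.forName(className)
                .asSubclass(ShuffleConsumer.class)
                .getDeclaredConstructor()
                .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load plugin " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // No key set: the default consumer is used.
        System.out.println(loadConsumer(conf).fetch("map_0001"));
        // Key set: the alternative plugin is used instead.
        conf.put(CONSUMER_KEY, RdmaShuffleConsumer.class.getName());
        System.out.println(loadConsumer(conf).fetch("map_0001"));
    }
}
```

Selecting the class by name from configuration keeps the existing default untouched, which is in line with the request not to rename core classes; a provider-side plugin could presumably be wired the same way.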