[ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436358#comment-13436358 ]
Siddharth Seth commented on MAPREDUCE-3902:
-------------------------------------------

bq. If I create some patches(ex. fixing TODOs or something), should I send pull request against your github or attach patch here?

I think a pull request against github for now, and for bigger / more significant changes, separate subtasks under this jira.

bq. Do you think that it's needed to separate hadoop-mapreduce-client-app from hadoop-mapreduce-client-app2? Your prototype is under hadoop-mapreduce-client-app2 currently. This make it difficult to rebase your code on trunk.

The intention was to be able to run the existing code as well as the modified code in the same install, with a simple config change to choose between the implementations. That makes side-by-side comparisons much easier. Once this implementation stabilizes, it can be moved back to mapreduce-client-app to replace the current implementation.

Also, there are some pretty big changes to TaskAttempt, the AM scheduling classes, etc. Given this, I'm not sure how useful a merge from trunk would be. This will have some overhead though: pulling in / factoring in jiras which have been fixed after the branch.

With mapreduce-client-app2 being a separate module, development could continue in the main branches. However, given that this implementation is not stable, I'd like to create a separate branch for this jira, pull in the current set of changes with some cleanup, and then continue development. I will create a branch later this week unless anyone objects.

> MR AM should reuse containers for map tasks, there-by allowing fine-grained
> control on num-maps for users without need for CombineFileInputFormat etc.
> ------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3902
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, mrv2
>            Reporter: Arun C Murthy
>            Assignee: Siddharth Seth
>         Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch
>
>
> The MR AM is now in a great position to reuse containers across (map) tasks.
> This is something similar to JVM re-use we had in 0.20.x, but in a
> significantly better manner:
> # Consider data-locality when re-using containers
> # Consider the new shuffle - ensure that reduces fetch output of the whole
> container at once (i.e. all maps) : MAPREDUCE-4525
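
As a rough illustration of the first point in the description (considering data-locality when re-using a container), a minimal sketch might look like the following. Every name here (ContainerReuseSketch, PendingTask, pickNextTask) is hypothetical and does not come from the Hadoop code base or from the attached patches; it only shows one possible policy, where a freed container prefers a pending map task whose input lives on the same host and otherwise falls back to any pending task.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: none of these names exist in Hadoop itself.
final class ContainerReuseSketch {

    /** A pending map task and the hosts that hold a replica of its input split. */
    static final class PendingTask {
        final String id;
        final Set<String> preferredHosts;

        PendingTask(String id, String... preferredHosts) {
            this.id = id;
            this.preferredHosts = new HashSet<>(Arrays.asList(preferredHosts));
        }
    }

    /**
     * Picks the task to run next in a container that has just become free on
     * containerHost. A node-local task wins; if there is none, the first
     * pending task is used so the container does not sit idle.
     */
    static Optional<PendingTask> pickNextTask(String containerHost, List<PendingTask> pending) {
        for (PendingTask task : pending) {
            if (task.preferredHosts.contains(containerHost)) {
                return Optional.of(task);
            }
        }
        return pending.stream().findFirst();
    }

    public static void main(String[] args) {
        List<PendingTask> pending = Arrays.asList(
            new PendingTask("map_0", "hostA"),
            new PendingTask("map_1", "hostB", "hostC"));
        // The container was freed on hostB, so the node-local task is chosen.
        System.out.println(pickNextTask("hostB", pending).get().id);
    }
}
```

A real scheduler would presumably also weigh rack-locality and how long a container may wait for a local task; this sketch leaves both out.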