[jira] [Commented] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

Siddharth Seth (JIRA) Thu, 16 Aug 2012 15:09:40 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436358#comment-13436358 ]


Siddharth Seth commented on MAPREDUCE-3902:
-------------------------------------------

bq. If I create some patches (e.g. fixing TODOs or something), should I send a 
pull request against your github or attach the patch here?
I think a pull request against github for now and, for bigger / more 
significant changes, separate subtasks under this jira.

bq. Do you think that it's necessary to separate hadoop-mapreduce-client-app 
from hadoop-mapreduce-client-app2? Your prototype is under 
hadoop-mapreduce-client-app2 currently. This makes it difficult to rebase your 
code on trunk.
The intention was to be able to run the existing code as well as the modified 
code in the same install, with a simple config change to choose between the 
implementations. That makes side-by-side comparisons much easier. Once this 
implementation stabilizes, it can be moved back to mapreduce-client-app to 
replace the current implementation. Also, there are some pretty big changes to 
TaskAttempt, the AM scheduling classes, etc. Given this, I'm not sure how 
useful a merge from trunk would be. This will have some overhead though: 
pulling in / factoring in jiras which have been fixed after the branch.

With mapreduce-client-app2 being a separate module, development could continue 
in the main branches. However, given that this implementation is not stable, 
I'd like to create a separate branch for this jira, pull in the current set of 
changes with some cleanup, and then continue development. Will create a branch 
later this week unless anyone objects.
                
> MR AM should reuse containers for map tasks, there-by allowing fine-grained 
> control on num-maps for users without need for CombineFileInputFormat etc.
> ------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3902
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster, mrv2
>            Reporter: Arun C Murthy
>            Assignee: Siddharth Seth
>         Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch
>
>
> The MR AM is now in a great position to reuse containers across (map) tasks. 
> This is something similar to JVM re-use we had in 0.20.x, but in a 
> significantly better manner:
> # Consider data-locality when re-using containers
> # Consider the new shuffle - ensure that reduces fetch output of the whole 
> container at once (i.e. all maps)  : MAPREDUCE-4525 

