[jira] [Updated] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsuyoshi OZAWA updated MAPREDUCE-3902:
--
Description:
The MR AM is now in a great position to reuse containers across (map) tasks.
This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner:
# Consider data-locality when re-using containers
# Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps) : MAPREDUCE-4525

was:
The MR AM is now in a great position to reuse containers across (map) tasks.
This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner:
# Consider data-locality when re-using containers
# Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps)

MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.
--
Key: MAPREDUCE-3902
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch

The MR AM is now in a great position to reuse containers across (map) tasks.
This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner:
# Consider data-locality when re-using containers
# Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps) : MAPREDUCE-4525

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.
[ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsuyoshi OZAWA updated MAPREDUCE-3902:
--
Attachment: MAPREDUCE-3902.2.patch

As a first step, I updated Arun's patch so that it compiles against the current source code.

MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.
--
Key: MAPREDUCE-3902
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch

The MR AM is now in a great position to reuse containers across (map) tasks.
This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner:
# Consider data-locality when re-using containers
# Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps)
[jira] [Updated] (MAPREDUCE-3902) MR AM should reuse containers for map tasks
[ https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-3902:
-
Attachment: MAPREDUCE-3902.patch

Ok, I spent a long (isolated) flight on this - it clearly needs more work, but it's a start. *smile*

This patch improves the classic JVM re-use on both dimensions described in the jira.

We need to pay more attention to the user interface, some options:
# Allow user to specify actual number of map slots to be used (supported now, in the patch)
# Allow user to specify a target block-size for maps (which is greater than real HDFS block size) i.e. get around the small-files problem.

Thoughts?

MR AM should reuse containers for map tasks
---
Key: MAPREDUCE-3902
URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
Attachments: MAPREDUCE-3902.patch

The MR AM is now in a great position to reuse containers across (map) tasks.
This is something similar to JVM re-use we had in 0.20.x, but in a significantly better manner:
# Consider data-locality when re-using containers
# Consider the new shuffle - ensure that reduces fetch output of the whole container at once (i.e. all maps)
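The first point in the issue description, considering data-locality when re-using containers, can be illustrated with a rough sketch. This is not taken from either attached patch and does not use any Hadoop API; the names PendingMap, split_hosts and pick_next_map are hypothetical, and the policy shown (node-local first, otherwise the oldest pending map) is only an assumption about what such a policy might look like. The idea is that when a container on some host finishes a map, the AM hands it another map whose input split has a replica on that host, if one is still pending.

```python
from dataclasses import dataclass


@dataclass
class PendingMap:
    # Hypothetical record for a map task that has not been assigned yet.
    task_id: str
    split_hosts: frozenset  # hosts holding a replica of this map's input split


def pick_next_map(container_host, pending):
    """Remove and return the next map for a freed container.

    Prefers a map whose input split is local to the container's host;
    falls back to the oldest pending map when no local work remains.
    Returns None when nothing is pending.
    """
    for i, task in enumerate(pending):
        if container_host in task.split_hosts:
            return pending.pop(i)
    # No node-local work left: take the oldest pending map, if any.
    return pending.pop(0) if pending else None


# Usage: a container on host-c becomes free twice in a row.
pending = [
    PendingMap("m_000000", frozenset({"host-a", "host-b"})),
    PendingMap("m_000001", frozenset({"host-c"})),
    PendingMap("m_000002", frozenset({"host-a"})),
]
first = pick_next_map("host-c", pending)   # node-local match
second = pick_next_map("host-c", pending)  # no local work left, falls back
```

In this sketch the first call returns the map whose split lives on host-c, and the second call falls back to the oldest remaining map. A real policy would presumably also weigh rack-locality and how long to hold a container while waiting for local work, neither of which is modelled here.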