[jira] [Updated] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-08 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-3902:
--

Description: 
The MR AM is now in a great position to reuse containers across (map) tasks. 
This is something similar to JVM re-use we had in 0.20.x, but in a 
significantly better manner:
# Consider data-locality when re-using containers
# Consider the new shuffle - ensure that reduces fetch output of the whole 
container at once (i.e. all maps); see MAPREDUCE-4525

  was:
The MR AM is now in a great position to reuse containers across (map) tasks. 
This is something similar to JVM re-use we had in 0.20.x, but in a 
significantly better manner:
# Consider data-locality when re-using containers
# Consider the new shuffle - ensure that reduces fetch output of the whole 
container at once (i.e. all maps) 


 MR AM should reuse containers for map tasks, there-by allowing fine-grained 
 control on num-maps for users without need for CombineFileInputFormat etc.
 --

 Key: MAPREDUCE-3902
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
 Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (MAPREDUCE-3902) MR AM should reuse containers for map tasks, there-by allowing fine-grained control on num-maps for users without need for CombineFileInputFormat etc.

2012-08-03 Thread Tsuyoshi OZAWA (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsuyoshi OZAWA updated MAPREDUCE-3902:
--

Attachment: MAPREDUCE-3902.2.patch

As a first step, I updated Arun's patch so that it compiles against the 
current source code.

 MR AM should reuse containers for map tasks, there-by allowing fine-grained 
 control on num-maps for users without need for CombineFileInputFormat etc.
 --

 Key: MAPREDUCE-3902
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Siddharth Seth
 Attachments: MAPREDUCE-3902.2.patch, MAPREDUCE-3902.patch







[jira] [Updated] (MAPREDUCE-3902) MR AM should reuse containers for map tasks

2012-02-23 Thread Arun C Murthy (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/MAPREDUCE-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated MAPREDUCE-3902:
-

Attachment: MAPREDUCE-3902.patch

Ok, I spent a long (isolated) flight on this - it clearly needs more work, but 
it's a start. *smile*

This patch improves on the classic JVM re-use along both dimensions described 
in the jira: data-locality and the new shuffle.
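
To make the data-locality dimension concrete, here is a minimal sketch of how a freed container could be matched to its next map task. It is an illustration only, not the logic in the attached patch: the function name pick_next_task, the task dictionaries, and the host names are all hypothetical, and rack-locality is left out for brevity.

```python
def pick_next_task(pending, container_host):
    """Choose the next map task for a container that has just become free.

    `pending` is a list of dicts with an "id" and a set of "hosts" that hold
    a replica of the task's input split (hypothetical structure).
    Returns (task, is_node_local), or (None, False) when nothing is pending.
    """
    # First pass: prefer a task whose input split has a replica on this host.
    for index, task in enumerate(pending):
        if container_host in task["hosts"]:
            return pending.pop(index), True
    # Fallback: no node-local work is left, so take the oldest pending task.
    if pending:
        return pending.pop(0), False
    return None, False


pending_tasks = [
    {"id": "m_000000", "hosts": {"node1", "node2"}},
    {"id": "m_000001", "hosts": {"node3", "node4"}},
    {"id": "m_000002", "hosts": {"node1", "node3"}},
]

# The same container on node3 is reused for three map tasks in a row.
first, first_local = pick_next_task(pending_tasks, "node3")
second, second_local = pick_next_task(pending_tasks, "node3")
third, third_local = pick_next_task(pending_tasks, "node3")
```

In this sketch the reused container drains node-local work first and only then falls back to a remote split, which is the behaviour the first item of the description asks for.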

We need to pay more attention to the user interface. Some options:
# Allow the user to specify the actual number of map slots to be used 
(supported now, in the patch).
# Allow the user to specify a target block-size for maps (greater than the 
real HDFS block size), i.e. get around the small-files problem.

Thoughts?
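
The second option can be pictured as greedy packing of small inputs up to a target size. The sketch below is an assumption-laden illustration, not code from the patch: group_by_target_size, the file names, and the 128 MB target are hypothetical, and inputs larger than the target are left whole rather than split.

```python
def group_by_target_size(file_sizes, target_bytes):
    """Pack (name, size_in_bytes) pairs, in order, into groups of at most
    target_bytes each. A single input larger than the target gets its own
    group and is not split."""
    groups = []
    current, current_size = [], 0
    for name, size in file_sizes:
        # Close the current group if adding this input would exceed the target.
        if current and current_size + size > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        groups.append(current)
    return groups


MB = 1024 * 1024
files = [("a", 40 * MB), ("b", 50 * MB), ("c", 30 * MB),
         ("d", 200 * MB), ("e", 10 * MB)]

# Each group would be processed by one reused container.
groups = group_by_target_size(files, 128 * MB)
```

With these inputs, five small-file maps collapse into three groups of work, which is roughly the effect a user-specified target block-size is meant to have on the small-files problem.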

 MR AM should reuse containers for map tasks
 ---

 Key: MAPREDUCE-3902
 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3902
 Project: Hadoop Map/Reduce
  Issue Type: Improvement
  Components: applicationmaster, mrv2
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Attachments: MAPREDUCE-3902.patch


 The MR AM is now in a great position to reuse containers across (map) tasks. 
 This is something similar to JVM re-use we had in 0.20.x, but in a 
 significantly better manner:
 # Consider data-locality when re-using containers
 # Consider the new shuffle - ensure that reduces fetch output of the whole 
 container at once (i.e. all maps) 
