maxConcurrentMapTask & maxConcurrentReduceTask per job
------------------------------------------------------
Key: MAPREDUCE-1859
URL: https://issues.apache.org/jira/browse/MAPREDUCE-1859
Project: Hadoop Map/Reduce
Issue Type: New Feature
Components: job submission
Affects Versions: 0.20.2
Reporter: Johannes Zillmann
It would be valuable to be able to specify the maximum number of map/reduce
slots that a given job may use at the same time. An example is a map-reduce job
importing from a database: you don't want 50 map tasks querying one database at
the same time, but you also don't want to reduce the overall map task count.
Although this is probably already possible through the Fair/Capacity-Scheduler
or a custom extension, I think it would be a good addition to the default
TaskScheduler, since this seems to be more than a rarely used feature.
It would also help in situations where you don't have control/ownership of
the cluster.
And it is more job-centric, whereas the existing scheduler extensions seem to
be more job-type-centric.
Implementing this feature should be relatively straightforward: add
something like jobConf.setMaxConcurrentMapTask(int) and respect this
configuration in JobQueueTaskScheduler.
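
The check the scheduler would need could look roughly like the sketch below.
It is self-contained and does not use any Hadoop classes: the class name, the
method assignableMaps, and the UNLIMITED constant are hypothetical names chosen
for illustration, and the cap parameter stands in for the value that the
proposed jobConf.setMaxConcurrentMapTask(int) would carry. How
JobQueueTaskScheduler would actually obtain the running and pending counts is
not shown and is left as an assumption.

```java
/**
 * Hypothetical sketch of a per-job concurrency cap.
 * Not part of Hadoop; none of these names exist in the Hadoop API.
 */
class MaxConcurrentTaskSketch {

    /** Assumed sentinel meaning "no per-job limit configured". */
    static final int UNLIMITED = -1;

    /**
     * Returns how many additional map tasks of one job may be started now.
     *
     * @param maxConcurrentMaps per-job cap (stand-in for the proposed setting)
     * @param runningMaps       map tasks of this job currently running
     * @param pendingMaps       map tasks of this job not yet started
     * @param freeSlots         free map slots the scheduler could hand out
     */
    static int assignableMaps(int maxConcurrentMaps, int runningMaps,
                              int pendingMaps, int freeSlots) {
        // Without a cap, the job is limited only by its demand and free slots.
        int byDemand = Math.min(pendingMaps, freeSlots);
        if (maxConcurrentMaps == UNLIMITED) {
            return byDemand;
        }
        // With a cap, never exceed the remaining headroom (clamped at zero).
        int headroom = Math.max(0, maxConcurrentMaps - runningMaps);
        return Math.min(byDemand, headroom);
    }

    public static void main(String[] args) {
        // Database import: 200 pending maps, 50 free slots, cap of 10, 4 running.
        System.out.println(assignableMaps(10, 4, 200, 50));        // prints 6
        // Same job without a cap takes every free slot.
        System.out.println(assignableMaps(UNLIMITED, 4, 200, 50)); // prints 50
    }
}
```

The total number of map tasks is unchanged; only the number running at once is
bounded, which matches the database import example. The same check would apply
to reduce tasks with a separate cap.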
I am not sure whether this feature would fit well with the existing
Fair/Capacity-Schedulers.