[ https://issues.apache.org/jira/browse/HADOOP-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Bieniosek updated HADOOP-1245:
--------------------------------------
Status: Patch Available (was: Open)
See what Hudson thinks...
> value for mapred.tasktracker.tasks.maximum taken from two different sources
> ---------------------------------------------------------------------------
>
> Key: HADOOP-1245
> URL: https://issues.apache.org/jira/browse/HADOOP-1245
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.12.3
> Reporter: Michael Bieniosek
> Attachments: tasktracker-max-tasks-1245.patch
>
>
> I want to create a cluster with machines with different numbers of CPUs.
> Consequently, each machine should have a different value for
> mapred.tasktracker.tasks.maximum, since my map tasks are CPU bound.
> However, Hadoop uses BOTH values of mapred.tasktracker.tasks.maximum: the
> one configured on the jobtracker and the one configured on the tasktracker.
> When a new job starts up, the jobtracker uses its (single) value for
> mapred.tasktracker.tasks.maximum to assign tasks. This means that each
> tasktracker gets the same number of tasks, regardless of how I configured
> that particular machine.
> After the first task finishes on each tasktracker, the tasktracker will
> request new tasks from the jobtracker according to the tasktracker's value
> for mapred.tasktracker.tasks.maximum. So after the first round of map tasks
> is done, the cluster reverts to a mode that works well for heterogeneous
> clusters.
> The jobtracker should not consult its config for the value of
> mapred.tasktracker.tasks.maximum. It should assign tasks (or allow
> tasktrackers to request tasks) according to each tasktracker's value of
> mapred.tasktracker.tasks.maximum.
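
To make the requested behavior concrete, here is a minimal sketch, not the attached patch and not Hadoop's actual scheduler code. It assumes each tasktracker reports its own limit to the jobtracker; the class name, the tasksToAssign helper, and the node names are all hypothetical and exist only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HeterogeneousAssignment {

    // Hypothetical helper: how many new tasks to hand to one tracker,
    // based on the limit that tracker reported (not the jobtracker's
    // own config value), the tasks it is already running, and the
    // number of tasks still waiting to be scheduled.
    static int tasksToAssign(int reportedMax, int running, int pending) {
        int freeSlots = Math.max(0, reportedMax - running);
        return Math.min(freeSlots, pending);
    }

    public static void main(String[] args) {
        // Assumed per-machine limits, e.g. one task slot per CPU.
        Map<String, Integer> reportedMax = new LinkedHashMap<>();
        reportedMax.put("node-2cpu", 2);
        reportedMax.put("node-8cpu", 8);

        int pending = 20; // tasks waiting when the job starts
        for (Map.Entry<String, Integer> e : reportedMax.entrySet()) {
            int n = tasksToAssign(e.getValue(), 0, pending);
            pending -= n;
            System.out.println(e.getKey() + " is assigned " + n + " tasks");
        }
        System.out.println(pending + " tasks remain pending");
    }
}
```

With limits taken per tracker, the first round of assignments already matches each machine's capacity, instead of every tracker receiving the same number of tasks.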