[ 
https://issues.apache.org/jira/browse/HADOOP-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Bieniosek updated HADOOP-1245:
--------------------------------------

    Description: 
I want to create a cluster with machines with different numbers of CPUs.  
Consequently, each machine should have a different value for 
mapred.tasktracker.tasks.maximum, since my map tasks are CPU bound.

When a new job starts up, the jobtracker uses its (single) value for 
mapred.tasktracker.tasks.maximum to assign tasks.  This means that each 
tasktracker gets the same number of tasks, regardless of how I configured that 
particular machine.

The jobtracker should not consult its config for the value of 
mapred.tasktracker.tasks.maximum.  It should assign tasks (or allow 
tasktrackers to request tasks) according to each tasktracker's value of 
mapred.tasktracker.tasks.maximum.
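
The difference between the reported and the requested behavior can be 
sketched with a small model.  This is only an illustration, not Hadoop code: 
the function names, the node names, and the per-node limits are hypothetical, 
and the real jobtracker scheduling logic is more involved.

```python
# Illustrative model only; not Hadoop code. All names below are hypothetical.

def assign_with_global_limit(trackers, jobtracker_max, pending_tasks):
    """Reported behavior: every tasktracker is capped by the jobtracker's
    single value of mapred.tasktracker.tasks.maximum."""
    assignment = {}
    for name in trackers:
        n = min(jobtracker_max, pending_tasks)
        assignment[name] = n
        pending_tasks -= n
    return assignment


def assign_with_per_tracker_limit(trackers, pending_tasks):
    """Requested behavior: each tasktracker is capped by its own value of
    mapred.tasktracker.tasks.maximum."""
    assignment = {}
    for name, own_max in trackers.items():
        n = min(own_max, pending_tasks)
        assignment[name] = n
        pending_tasks -= n
    return assignment


# Hypothetical heterogeneous cluster:
# tasktracker name -> its locally configured mapred.tasktracker.tasks.maximum
trackers = {"node-2cpu": 2, "node-4cpu": 4, "node-8cpu": 8}

# The jobtracker's own config value is assumed to be 2 for this example.
current = assign_with_global_limit(trackers, jobtracker_max=2, pending_tasks=100)
requested = assign_with_per_tracker_limit(trackers, pending_tasks=100)

print(current)    # same count on every node, regardless of CPU count
print(requested)  # count follows each node's own setting
```

In this model the 8-CPU node receives only 2 tasks under the reported 
behavior, but 8 tasks when its own setting is honored.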

Originally, I thought the behavior was slightly different, so this issue 
contained this text:
After the first task finishes on each tasktracker, the tasktracker will request 
new tasks from the jobtracker according to the tasktracker's value for 
mapred.tasktracker.tasks.maximum.  So after the first round of map tasks is 
done, the cluster reverts to a mode that works well for heterogeneous clusters.


  was:
I want to create a cluster with machines with different numbers of CPUs.  
Consequently, each machine should have a different value for 
mapred.tasktracker.tasks.maximum, since my map tasks are CPU bound.

However, hadoop uses BOTH the values for mapred.tasktracker.tasks.maximum on 
the jobtracker and the tasktracker.  

When a new job starts up, the jobtracker uses its (single) value for 
mapred.tasktracker.tasks.maximum to assign tasks.  This means that each 
tasktracker gets the same number of tasks, regardless of how I configured that 
particular machine.

After the first task finishes on each tasktracker, the tasktracker will request 
new tasks from the jobtracker according to the tasktracker's value for 
mapred.tasktracker.tasks.maximum.  So after the first round of map tasks is 
done, the cluster reverts to a mode that works well for heterogeneous clusters.

The jobtracker should not consult its config for the value of 
mapred.tasktracker.tasks.maximum.  It should assign tasks (or allow 
tasktrackers to request tasks) according to each tasktracker's value of 
mapred.tasktracker.tasks.maximum.

        Summary: value for mapred.tasktracker.tasks.maximum taken from 
jobtracker, not tasktracker  (was: value for mapred.tasktracker.tasks.maximum 
taken from two different sources)

Fixed the issue description to reflect the actual behavior, as reported by others.

> value for mapred.tasktracker.tasks.maximum taken from jobtracker, not 
> tasktracker
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-1245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1245
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.3
>            Reporter: Michael Bieniosek
>         Attachments: tasktracker-max-tasks-1245.patch
>
>
> I want to create a cluster with machines with different numbers of CPUs.  
> Consequently, each machine should have a different value for 
> mapred.tasktracker.tasks.maximum, since my map tasks are CPU bound.
>
> When a new job starts up, the jobtracker uses its (single) value for 
> mapred.tasktracker.tasks.maximum to assign tasks.  This means that each 
> tasktracker gets the same number of tasks, regardless of how I configured 
> that particular machine.
>
> The jobtracker should not consult its config for the value of 
> mapred.tasktracker.tasks.maximum.  It should assign tasks (or allow 
> tasktrackers to request tasks) according to each tasktracker's value of 
> mapred.tasktracker.tasks.maximum.
>
> Originally, I thought the behavior was slightly different, so this issue 
> contained this text:
> After the first task finishes on each tasktracker, the tasktracker will 
> request new tasks from the jobtracker according to the tasktracker's value 
> for mapred.tasktracker.tasks.maximum.  So after the first round of map tasks 
> is done, the cluster reverts to a mode that works well for heterogeneous 
> clusters.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
