[ 
https://issues.apache.org/jira/browse/HADOOP-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608688#action_12608688
 ] 

Vivek Ratan commented on HADOOP-3651:
-------------------------------------

In the current JT, the code that decides which task to hand a TT works as 
follows. The JT first computes the 'remaining load' per TT for maps/reduces 
(the total number of map and reduce tasks that still need to be run across all 
running jobs, divided by the number of TTs). It then computes the maximum 
number of map or reduce tasks that should run on the TT (the minimum of the 
TT's capacity and the 'remaining load'); call this the 'max load'. Finally, if 
the TT can run something (i.e., if the number of maps/reduces it is running is 
less than the 'max load'), the JT looks to give it a map task or a reduce 
task. 
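
The logic above can be sketched roughly as follows. This is only an 
illustration, not the actual JT code: the function and parameter names are 
made up, the calculation is shown for one kind of task at a time, and rounding 
the 'remaining load' up is an assumption (the description only says "divided 
by").

```python
import math


def max_load(pending_tasks, num_trackers, tracker_capacity):
    # 'Remaining load': pending tasks of one kind spread over all TTs.
    # Rounding up is an assumption; the description only says "divided by".
    remaining_load = math.ceil(pending_tasks / num_trackers)
    # 'Max load': never more than the TT's slot capacity.
    return min(tracker_capacity, remaining_load)


def can_take_task(running_on_tracker, pending_tasks, num_trackers,
                  tracker_capacity):
    # A TT is offered a task only while it runs fewer than 'max load' tasks.
    return running_on_tracker < max_load(
        pending_tasks, num_trackers, tracker_capacity)


# Numbers from the issue description: 200 TTs with 2 reduce slots each,
# and (hypothetically) 200 reduces still waiting to run.
print(max_load(200, 200, 2))          # 1
print(can_take_task(0, 200, 200, 2))  # True: an idle TT gets a reduce
print(can_take_task(1, 200, 200, 2))  # False: refused despite a free slot
```

Under these assumptions, a TT that already runs one reduce is turned away even 
though it has a free slot and 200 reduces are waiting.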

As I mentioned in a mail I sent to core-dev on 5/23, this logic can result in 
some TTs not getting a task to run, even when there are tasks waiting to be 
run. It can also result in a skewed distribution of tasks among TTs. Maybe 
something like that is happening here. I don't know if it's possible to look 
at the log files and determine exactly what happened. 

The new Resource Manager will, I think, result in a better distribution. For 
one, a TT's request is never rejected if there is a task to run. For another, 
the load will likely be spread out more evenly. 

> When assigning tasks to trackers, the job tracker should try to balance the 
> number of tasks among the available trackers
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3651
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3651
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>
> I have encountered a number of situations like this:
> A job tracker has 200 task trackers, each with 2 mapper slots and reducer 
> slots.
> When a job with 200 or fewer reducers is submitted to the job tracker,
> one would normally expect each task tracker to run one reducer.
> Unfortunately, it seems that only about 1/3 of the trackers have one reducer, 
> 1/3 have no reducer, and 1/3 have 2 reducers!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
