[
https://issues.apache.org/jira/browse/HADOOP-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608688#action_12608688
]
Vivek Ratan commented on HADOOP-3651:
-------------------------------------
In the current JT, the code that decides which task to hand a TT works as
follows. The JT first computes the 'remaining load' per TT for maps and for
reduces (the total number of map or reduce tasks that still need to be run
across all running jobs, divided by the number of TTs). It then computes the
maximum number of map or reduce tasks that should run on the TT (the minimum
of the TT's capacity and the 'remaining load'); call this the 'max load'.
Finally, if the TT can run something (i.e., if the number of maps/reduces it
is running is less than the 'max load'), the JT looks for a map task or a
reduce task to give it.
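The check described above can be sketched roughly as follows. This is only an
illustration of the description in this comment, not the actual JT code: the
function and parameter names are hypothetical, and rounding the division up
is an assumption, since the description only says "divided by".

```python
import math


def max_load(tasks_to_run, num_trackers, capacity):
    """Cap on concurrent tasks of one type (map or reduce) for a single TT."""
    if num_trackers <= 0:
        return 0
    # 'remaining load': outstanding tasks spread evenly over all TTs.
    # Rounding up is an assumption made for this sketch.
    remaining_load = math.ceil(tasks_to_run / num_trackers)
    # 'max load': never more than the TT's slot capacity.
    return min(capacity, remaining_load)


def can_take_task(running, tasks_to_run, num_trackers, capacity):
    """True if a TT running `running` tasks of this type may get another."""
    return running < max_load(tasks_to_run, num_trackers, capacity)


# 200 TTs with 2 reduce slots each and 200 reduces left to run:
print(max_load(200, 200, 2))          # 1
print(can_take_task(0, 200, 200, 2))  # True
print(can_take_task(1, 200, 200, 2))  # False
```

Under these assumptions, a TT that is already running one reduce is turned
down even though it has a free slot, because the 'max load' is 1.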
As I had mentioned in a mail I sent to core-dev on 5/23, this logic can result
in some TTs not getting a task to run, even when there are tasks waiting to be
run. It can also result in a skewed distribution of tasks among TTs. Maye
something like that is happening here. I don't know if it's possible to see the
log files and determine what exactly happened.
The new Resource Manager will, I think, result in a better distribution. For
one, a TT's request is never rejected if there is a task to run. For another,
the load will likely be spread out more evenly.
> When assigning tasks to trackers, the job tracker should try to balance the
> number of tasks among the available trackers
> ------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-3651
> URL: https://issues.apache.org/jira/browse/HADOOP-3651
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.17.0
> Reporter: Runping Qi
>
> I encounter a number of situations like this:
> A job tracker has 200 task trackers, each with 2 mapper slots and 2 reducer
> slots.
> When a job with 200 or fewer reducers is submitted to the job tracker,
> one would normally expect each task tracker to run one reducer.
> Unfortunately, it seems that only about 1/3 of the trackers have one reducer,
> 1/3 of the trackers have no reducer, and 1/3 have 2 reducers!