[jira] Updated: (MAPREDUCE-936) Allow a load difference in fairshare scheduler

dhruba borthakur (JIRA) Fri, 04 Sep 2009 16:35:23 -0700

     [ https://issues.apache.org/jira/browse/MAPREDUCE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


dhruba borthakur updated MAPREDUCE-936:
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.21.0
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

I just committed this. Thanks Zheng.

> Allow a load difference in fairshare scheduler
> ----------------------------------------------
>
>                 Key: MAPREDUCE-936
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-936
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: contrib/fair-share
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>             Fix For: 0.21.0
>
>         Attachments: MAPREDUCE-936.1.patch, MAPREDUCE-936.2.patch
>
>
> The problem we are facing: it takes a long time for all tasks of a job to get 
> scheduled on the cluster, even if the cluster is almost empty.
> Two factors together lead to this situation:
> 1. The load factor makes sure each TT (TaskTracker) runs the same number of 
> tasks. (This is the part that this patch tries to change.)
> 2. The scheduler tries to schedule map tasks locally (first node-local, then 
> rack-local). There is a wait time (mapred.fairscheduler.localitywait.node and 
> mapred.fairscheduler.localitywait.rack, both around 10 seconds in our conf) 
> and an accumulated wait time (JobInfo.localityWait). The accumulated wait time 
> is reset to 0 whenever a non-local map task is scheduled. That means it takes 
> N * wait_time to schedule N non-local map tasks.
> Because of 1, many TTs cannot take more tasks even if they have free slots. 
> As a result, many of the map tasks cannot be scheduled locally.
> Because of 2, it is really hard to schedule a non-local task.
> As a result, we sometimes see that it takes more than 2 minutes to 
> schedule all the mappers of a job.
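
The two effects described in the issue can be illustrated with a rough sketch. This is a simplified model, not the fair scheduler's actual code: the function names are hypothetical, `max_diff` is an assumed stand-in for the load difference this issue proposes (its name and semantics are not taken from the patch), and the locality model assumes every non-local map task waits one full wait_time because the accumulated wait resets to 0 each time.

```python
import math


def tracker_cap(slots_on_tracker, runnable_tasks, total_slots, max_diff=0.0):
    """Number of tasks one TaskTracker may run under a cluster-wide load factor.

    Hypothetical model: the load factor is runnable tasks divided by total
    slots, and max_diff (an assumed knob) lets a tracker exceed that load.
    """
    if total_slots <= 0:
        return 0
    load = runnable_tasks / total_slots + max_diff
    # The cap can never exceed the tracker's own slot count.
    return math.ceil(slots_on_tracker * min(1.0, load))


def time_to_schedule_non_local(num_tasks, wait_time_sec):
    """Seconds needed to schedule num_tasks non-local map tasks.

    Assumes the accumulated wait resets to 0 after each non-local task,
    so every task pays the full wait again: N * wait_time in total.
    """
    elapsed = 0.0
    for _ in range(num_tasks):
        elapsed += wait_time_sec  # wait accumulates to the threshold, then resets
    return elapsed


# A mostly empty cluster: 100 runnable tasks across 400 slots, 4 slots per tracker.
print(tracker_cap(4, 100, 400))                # strict load factor
print(tracker_cap(4, 100, 400, max_diff=0.5))  # with an allowed load difference
print(time_to_schedule_non_local(12, 10))      # 12 non-local tasks, 10 sec wait
```

Under these assumptions, a strict load factor leaves three of a tracker's four slots idle, and a dozen non-local map tasks at a 10-second wait already account for the two minutes mentioned in the issue.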

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
