Allow a load difference in fairshare scheduler
----------------------------------------------
Key: MAPREDUCE-936
URL: https://issues.apache.org/jira/browse/MAPREDUCE-936
Project: Hadoop Map/Reduce
Issue Type: Improvement
Components: contrib/fair-share
Reporter: Zheng Shao

The problem we are facing: it takes a long time for all tasks of a job to get scheduled on the cluster, even if the cluster is almost empty.

Two factors together lead to this situation:

1. The load factor makes sure each TT (TaskTracker) runs the same number of tasks. (This is the part that this patch tries to change.)
2. The scheduler tries to schedule map tasks locally (first node-local, then rack-local). There is a wait time (mapred.fairscheduler.localitywait.node and mapred.fairscheduler.localitywait.rack, both around 10 sec in our conf) and an accumulated wait time (JobInfo.localityWait). The accumulated wait time is reset to 0 whenever a non-local map task is scheduled. That means it takes N * wait_time to schedule N non-local map tasks.

Because of 1, many TTs cannot take more tasks even when they have free slots, so many map tasks cannot be scheduled locally. Because of 2, it is really hard to schedule a non-local task. As a result, we sometimes see it take more than 2 minutes to schedule all the mappers of a job.
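
To make the two effects concrete, here is a rough sketch in Python. It is not the scheduler's code and not the patch itself. The function names, the cluster sizes, and the `max_diff` parameter are assumptions made up for illustration; `max_diff` stands in for the "load difference" the issue title proposes to allow. The sketch only models a per-tracker cap derived from the cluster-wide load, and the N * wait_time lower bound described above.

```python
import math


def tracker_cap(runnable_tasks, total_slots, tracker_slots, max_diff=0.0):
    """Tasks one TaskTracker may run when every tracker is held to the
    cluster-wide load factor.

    max_diff is a hypothetical allowance (not taken from the issue text)
    that lets a tracker run above the average load.
    """
    load = runnable_tasks / total_slots + max_diff
    # Never exceed the tracker's own slot count.
    return math.ceil(tracker_slots * min(1.0, load))


def nonlocal_schedule_delay(n_nonlocal, wait_secs):
    """Lower bound on the time to launch n non-local map tasks when the
    accumulated locality wait resets to 0 after each non-local launch."""
    return n_nonlocal * wait_secs


# Assumed cluster: 100 trackers with 4 map slots each, one job with 100 maps.
strict = tracker_cap(runnable_tasks=100, total_slots=400, tracker_slots=4)
relaxed = tracker_cap(runnable_tasks=100, total_slots=400, tracker_slots=4,
                      max_diff=0.5)
delay = nonlocal_schedule_delay(n_nonlocal=12, wait_secs=10)

print(strict)   # cap per tracker with no allowed difference
print(relaxed)  # cap per tracker with the hypothetical allowance
print(delay)    # seconds spent waiting for 12 non-local launches
```

Under these assumed numbers, the strict cap leaves each tracker with one task and three idle slots, so a tracker holding several of the job's input blocks can still run only one of them locally. Twelve non-local launches at a 10 sec wait each already add up to 120 sec, which is in line with the delay reported above.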