[ 
https://issues.apache.org/jira/browse/HADOOP-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671830#action_12671830
 ] 

Vivek Ratan commented on HADOOP-5199:
-------------------------------------

The big difference I see between the three Schedulers is in the way each of 
them sorts jobs when considering which job to look at. The Fair Share Scheduler 
(FS) sorts globally, based on a number of parameters that affect the weight of 
a job. The Capacity Scheduler (CS) sorts queues based on need, then sorts jobs 
based on priority and/or job submission time. The Default Scheduler (DS) sorts 
jobs based on priority and job submission times. 
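To make the DS ordering concrete, here is a minimal sketch of a "priority first, then submission time" comparator. The class and field names below are purely illustrative, not Hadoop's actual JobInProgress API:

```java
import java.util.*;

// Hypothetical sketch of DS-style job ordering: higher-priority jobs first,
// earlier-submitted jobs first among equal priorities. Names are
// illustrative, not Hadoop's real classes.
class JobInfo {
    final String name;
    final int priority;     // assumption: larger value = higher priority
    final long submitTime;  // submission timestamp

    JobInfo(String name, int priority, long submitTime) {
        this.name = name;
        this.priority = priority;
        this.submitTime = submitTime;
    }
}

class DefaultSchedulerOrder implements Comparator<JobInfo> {
    public int compare(JobInfo a, JobInfo b) {
        if (a.priority != b.priority) {
            return Integer.compare(b.priority, a.priority); // higher priority first
        }
        return Long.compare(a.submitTime, b.submitTime);    // then FIFO by submit time
    }
}
```

FS and CS would plug different comparison logic into the same slot, which is what motivates the common scheduler below.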

I suggest we have one scheduler: call it the Hadoop Scheduler (HS). HS has a 
notion of pools similar to that in FS, where a pool is a collection of jobs. As 
in FS, jobs can be mapped to pools based on any job property. Every pool has a 
capacity. This notion exists in both CS and FS, and is quite similar in both. 
We may choose to make this capacity 'guaranteed', as in CS, if we implement 
preemption. 

If we have multiple pools, it probably makes sense when processing a heartbeat 
to first decide which pool to look at, and then order the jobs within that pool 
based on some criteria, rather than maintaining a global ordering of jobs. 
Sorting jobs only within a single pool per heartbeat performs much better, 
especially on large clusters with many submitted jobs. 
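A minimal sketch of that per-heartbeat pool ordering, assuming pools are ranked by the fraction of their capacity currently in use (the CS-style "lagging behind" criterion); all names are hypothetical:

```java
import java.util.*;

// Hypothetical Pool: a configured capacity plus a running-task count.
// Pools using the smallest fraction of their capacity sort first, since
// they are lagging furthest behind their share.
class Pool {
    final String name;
    final int capacity;   // configured task capacity for the pool
    int runningTasks;     // tasks currently running in the pool

    Pool(String name, int capacity, int runningTasks) {
        this.name = name;
        this.capacity = capacity;
        this.runningTasks = runningTasks;
    }

    double usageRatio() {
        return (double) runningTasks / capacity;
    }
}

class PoolOrder implements Comparator<Pool> {
    public int compare(Pool a, Pool b) {
        return Double.compare(a.usageRatio(), b.usageRatio()); // most-lagging pool first
    }
}
```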

Hadoop Scheduler's _assignTasks()_ method is the central scheduling algorithm. 
It looks as follows: 
{code}
assignTasks() {

  figure out how many M/R tasks to assign;
  // This is already done by JobQueueTaskScheduler.assignTasks(), which
  // computes cluster M/R loads and uses a padding constant to determine
  // how many map and reduce tasks to assign in this heartbeat. FS does
  // something similar, though there are suggestions to improve this
  // (see the global scheduling Jira - HADOOP-4667).

  for (each task to assign) {

    sort the pools based on some criteria;
    // We can use the criteria in CS, where we sort pools based on how much
    // they're lagging behind, i.e., based on the ratio of the # of running
    // tasks to the pool's capacity.

    for (each pool in the sorted collection) {

      ask the pool for a sorted collection of jobs;
      // This is where we can use different sorting criteria. An FS-based
      // class can return jobs sorted on their notion of fairness. A class
      // based on CS or DS sorts jobs based on priorities and then on
      // submission time. Users can pick which mechanism they want, or
      // define one of their own.

      for (each job in the sorted collection) {

        if ((user limits are OK) && (memory requirements are met)) {

          get a node-local or rack-local task;
          // See the code used in DS, where we get one or more node-local
          // map tasks, up to one rack-local map task, or up to one reduce
          // task.

          if (we have a task)
            break;
        }
      }
    }
  }
}
{code}
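The "ask the pool for a sorted collection of jobs" step is the natural extension point for FS-, CS-, or user-defined behavior. Below is a hedged sketch of how a pool might delegate its ordering policy, with an FS-flavored example that ranks jobs by how far they are below their fair share; the deficit metric, class names, and interface are all illustrative assumptions, not Hadoop's actual API:

```java
import java.util.*;

// Hypothetical pluggable ordering policy: each pool holds one policy, and
// assignTasks() asks it for the job order. An FS-like policy sorts on a
// fairness measure; a CS/DS-like policy would sort on priority and
// submission time instead.
class SchedJob {
    final String name;
    final double runningShare;  // fraction of the cluster the job currently uses
    final double fairShare;     // fraction of the cluster the job is entitled to

    SchedJob(String name, double runningShare, double fairShare) {
        this.name = name;
        this.runningShare = runningShare;
        this.fairShare = fairShare;
    }
}

interface JobOrderingPolicy {
    List<SchedJob> sorted(Collection<SchedJob> jobs);
}

// FS-flavored policy: jobs furthest below their fair share come first.
class FairShareDeficitPolicy implements JobOrderingPolicy {
    public List<SchedJob> sorted(Collection<SchedJob> jobs) {
        List<SchedJob> out = new ArrayList<>(jobs);
        out.sort(Comparator.comparingDouble(j -> j.runningShare - j.fairShare));
        return out;
    }
}
```

Swapping in a different JobOrderingPolicy implementation is all that would distinguish FS-like from CS/DS-like behavior in this step; the outer assignTasks() loop stays common.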

Features such as checking free TT memory against a job's memory requirements 
when scheduling, or enforcing per-user job/task limits, can be enabled or 
disabled through configuration, but can be part of HS's _assignTasks()_ method. 
For some features it may not be clear whether they belong in the core algorithm 
or are specific to individual cases. In the former case, we can put them in 
HS's _assignTasks()_, enabled/disabled through configuration, while in the 
latter case, they can be part of the functionality specific to FS- or CS-like 
behavior. I will attach a patch to demonstrate how this works. 




> A proposal to merge common functionality of various Schedulers
> --------------------------------------------------------------
>
>                 Key: HADOOP-5199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5199
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Vivek Ratan
>
> There are at least 3 Schedulers in Hadoop today: Default, Capacity, and 
> Fairshare. Over time, we're seeing a lot of functionality common to all 
> three. Many bug fixes, improvements to existing functionality, and new 
> functionality are applicable to all three schedulers. This trend seems to be 
> getting stronger, as we notice similar problems, solutions, and ideas. This 
> is a proposal to detect and consolidate such common functionality, while at 
> the same time, allowing for differences in behavior and leaving the door open 
> for other schedulers to be built, as per HADOOP-3412. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.