[ https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karthik Kambatla updated YARN-1969:
-----------------------------------
    Summary: Fair Scheduler: Add policy for Earliest Endtime First  (was: Fair Scheduler: Add policy for Earliest Deadline First)

> Fair Scheduler: Add policy for Earliest Endtime First
> -----------------------------------------------------
>
>                 Key: YARN-1969
>                 URL: https://issues.apache.org/jira/browse/YARN-1969
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
>
> What we are observing is that some big jobs with many allocated containers
> are waiting for a few containers to finish. Under *fair-share scheduling*,
> however, they have a low priority, since there are other jobs (usually much
> smaller newcomers) that are using resources well below their fair share;
> hence newly released containers are not offered to the big, yet
> close-to-finished, job. Nevertheless, everybody would benefit from an
> "unfair" scheduling that offers the resource to the big job, since the sooner
> the big job finishes, the sooner it releases its "many" allocated resources
> to be used by other jobs.
>
> In other words, what we require is a variation of *Earliest Deadline First
> scheduling* that takes into account the number of already-allocated
> resources and the estimated time to finish.
> http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
>
> For example, if a job is using MEM GB of memory and is expected to finish in
> TIME minutes, the priority in scheduling would be a function p of (MEM,
> TIME). The expected time to finish can be estimated by the AppMaster using
> TaskRuntimeEstimator#estimatedRuntime and be supplied to RM in the resource
> request messages. To be less susceptible to the issue of apps gaming the
> system, we can have this scheduling limited to *only within a queue*: i.e.,
> adding an EarliestDeadlinePolicy that extends SchedulingPolicy and letting
> the queues use it by setting the "schedulingPolicy" field.
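
To make the priority function p(MEM, TIME) from the description concrete, here is a minimal standalone Java sketch. It is only an illustration under stated assumptions: the issue does not define p, so the choice p = MEM / TIME (GB released per minute of remaining runtime) is hypothetical, and the names EarliestEndtimeSketch, App, priority and offerOrder are invented for this example. None of them are Hadoop YARN APIs, and the sketch does not extend the real SchedulingPolicy class.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Standalone sketch; none of these names are Hadoop YARN APIs. */
final class EarliestEndtimeSketch {

    /** Hypothetical per-app view: memory held now plus the AM-reported estimate. */
    static final class App {
        final String name;
        final double memGb;           // MEM: memory currently allocated, in GB
        final double minutesToFinish; // TIME: estimated minutes until completion

        App(String name, double memGb, double minutesToFinish) {
            this.name = name;
            this.memGb = memGb;
            this.minutesToFinish = minutesToFinish;
        }
    }

    /**
     * Assumed priority function p(MEM, TIME) = MEM / TIME.
     * The issue leaves p unspecified; this ratio is one possible choice.
     */
    static double priority(App app) {
        // Clamp the estimate so a zero or negative value cannot divide by zero.
        double time = Math.max(app.minutesToFinish, 1e-3);
        return app.memGb / time;
    }

    /** Orders apps so that the highest priority comes first. */
    static final Comparator<App> BY_PRIORITY =
        (a, b) -> Double.compare(priority(b), priority(a));

    /** Returns a copy of one queue's apps in the order a freed container would be offered. */
    static List<App> offerOrder(List<App> queueApps) {
        List<App> sorted = new ArrayList<>(queueApps);
        sorted.sort(BY_PRIORITY);
        return sorted;
    }
}
```

Under this assumed p, a job holding 100 GB with 2 minutes left (p = 50) is offered a freed container before a job holding 20 GB with 5 minutes left (p = 4), which in turn comes before a 4 GB job with 30 minutes left. Sorting only the apps of a single queue mirrors the "only within a queue" restriction in the description.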

--
This message was sent by Atlassian JIRA
(v6.2#6252)