[ https://issues.apache.org/jira/browse/YARN-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13995993#comment-13995993 ]
Maysam Yabandeh commented on YARN-1969:
---------------------------------------

Thanks for the comment [~kasha]. I think this is a good point at which to distinguish between the terms "deadline" and "endtime".

"deadline" would be the user-specified SLA and, as you correctly mentioned, in many cases it is quite likely to be missed due to failures, limited resources, etc. Still, users can express the level of urgency through the desired deadline. They could also do that via priorities, so the user-specified deadline would be a complementary (and perhaps more expressive) way for users to specify the priorities of their jobs.

"endtime", on the other hand, is the estimated end time of the job, based on the current progress and assuming that the RM will grant the rest of the required resources immediately. endtime is computed automatically by the AppMaster, with no need for user involvement.

The advantage of taking endtime into consideration when scheduling resources is that giant jobs that are close to finishing could be prioritized. In general we want such jobs to finish sooner since (i) they would release the resources they occupy, such as the disk space for the mappers' output, and (ii) a large job is more susceptible to failures: the longer it hangs around, the higher the likelihood that it is affected by the loss of a mapper node.

The added subtasks follow this agenda: (i) estimating the end time, (ii) sending it over to the RM, (iii) letting the RM take it into consideration. We can also extend the API to allow users to specify their desired deadline.

As for how the RM takes the specified deadline or estimated endtime into consideration, I think once we have the "endtime" field available in the RM, there will be many new opportunities to take advantage of it. One way, as you mentioned, is to translate them into weights to be used by the current fair scheduler.
Any other scheduling algorithm, including EDF, can also be plugged in to schedule based on a function of the endtime and other variables. The other variables could include the size of the job, as discussed above.

> Fair Scheduler: Add policy for Earliest Deadline First
> ------------------------------------------------------
>
>                 Key: YARN-1969
>                 URL: https://issues.apache.org/jira/browse/YARN-1969
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
>
> What we are observing is that some big jobs with many allocated containers
> are waiting for a few containers to finish. Under *fair-share scheduling*,
> however, they have a low priority, since there are other jobs (usually much
> smaller newcomers) that are using resources way below their fair share;
> hence newly released containers are not offered to the big, yet
> close-to-finishing, job. Nevertheless, everybody would benefit from an
> "unfair" scheduling that offers the resource to the big job, since the sooner
> the big job finishes, the sooner it releases its "many" allocated resources
> to be used by other jobs. In other words, what we require is a kind of
> variation of *Earliest Deadline First scheduling* that takes into account
> the number of already-allocated resources and the estimated time to finish.
> http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling
>
> For example, if a job is using MEM GB of memory and is expected to finish in
> TIME minutes, the priority in scheduling would be a function p of (MEM,
> TIME). The expected time to finish can be estimated by the AppMaster using
> TaskRuntimeEstimator#estimatedRuntime and be supplied to the RM in the
> resource request messages. To be less susceptible to the issue of apps gaming
> the system, we can have this scheduling limited to *only within a queue*:
> i.e., adding an EarliestDeadlinePolicy that extends SchedulingPolicy and
> letting queues use it by setting the "schedulingPolicy" field.
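
To make the p(MEM, TIME) idea in the description concrete, here is a minimal sketch in plain Java. It is not the patch for this issue and it does not use YARN's SchedulingPolicy API. The names EdfSketch, App, priority and order are hypothetical, and "memory released per remaining minute" is only one assumed choice of p; the issue leaves the actual function open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class EdfSketch {

    /** Hypothetical per-application view; not a YARN class. */
    static final class App {
        final String name;
        final double memGb;        // MEM: memory currently allocated, in GB
        final double minutesLeft;  // TIME: estimated minutes until the job finishes

        App(String name, double memGb, double minutesLeft) {
            this.name = name;
            this.memGb = memGb;
            this.minutesLeft = minutesLeft;
        }
    }

    /**
     * One assumed choice of p(MEM, TIME): GB released per remaining minute.
     * Jobs that hold a lot of memory and are about to finish score highest.
     */
    static double priority(App app) {
        // Clamp TIME to at least one second to avoid dividing by zero.
        double minutes = Math.max(app.minutesLeft, 1.0 / 60.0);
        return app.memGb / minutes;
    }

    /** Highest priority first. */
    static final Comparator<App> BY_PRIORITY =
        (a, b) -> Double.compare(priority(b), priority(a));

    /** Returns a new list with the apps of one queue ordered by descending priority. */
    static List<App> order(List<App> appsInQueue) {
        List<App> sorted = new ArrayList<>(appsInQueue);
        sorted.sort(BY_PRIORITY);
        return sorted;
    }

    public static void main(String[] args) {
        List<App> queue = Arrays.asList(
            new App("small-new", 4, 30),
            new App("big-almost-done", 400, 5),
            new App("mid", 50, 10));
        for (App app : order(queue)) {
            System.out.println(app.name + " p=" + priority(app));
        }
    }
}
```

With this choice of p, a job holding 400 GB that is 5 minutes from finishing is offered containers before a 4 GB newcomer with 30 minutes to go, which is the "unfair" behaviour the description asks for. Restricting the ordering to the apps of a single queue mirrors the proposal to limit the policy to within a queue.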
-- This message was sent by Atlassian JIRA (v6.2#6252)