[ 
https://issues.apache.org/jira/browse/HADOOP-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700284#action_12700284
 ] 

Tsz Wo (Nicholas), SZE commented on HADOOP-5701:
------------------------------------------------

Consider a cluster that has 2000 map slots, with jobs submitted in the following 
sequence:

|1:00pm|JobA|1500 maps, each map runs 24 hours|
|1:30pm|JobB|1000 maps, each map runs 2 hours|
|1:40pm|JobC|3000 maps, each map runs 10 minutes|

All 1500 maps in JobA are scheduled at 1:00pm, leaving only 500 free map slots 
in the cluster.  Thirty minutes later, JobB arrives and only 500 of its 1000 
maps get scheduled.  At 1:40pm, JobC arrives, but none of its maps get scheduled 
until some maps in JobB finish about 2 hours later.

In this case, JobA holds 75% of the capacity for as long as its maps run, and 
JobB and JobC are never able to obtain 1/N of the capacity.  If JobA had 2000 
maps, the other jobs would have to wait for maps in JobA to finish and would 
make no progress for 24 hours.
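
The arithmetic in this scenario can be checked with a short sketch. It is not 
the fair scheduler's code: the Job class and both function names are 
hypothetical, and the model assumes only that free slots are handed to jobs in 
arrival order and never reclaimed. It is valid only while no map has finished, 
which holds here because the last job arrives at 1:40pm.

```python
from dataclasses import dataclass

TOTAL_SLOTS = 2000  # map slots in the cluster, from the example above


@dataclass
class Job:
    name: str
    submit_min: int  # minutes after 1:00pm
    maps: int        # number of map tasks
    map_min: int     # run time of each map, in minutes


def allocate_without_reclaim(jobs, total_slots=TOTAL_SLOTS):
    """Give each arriving job whatever slots are free; never take one back.

    Simplified model: assumes every job arrives before any map finishes.
    """
    free = total_slots
    running = {}
    for job in sorted(jobs, key=lambda j: j.submit_min):
        granted = min(job.maps, free)
        running[job.name] = granted
        free -= granted
    return running


def first_slot_release(jobs, running):
    """Earliest minute (after 1:00pm) at which any occupied slot frees up."""
    return min(j.submit_min + j.map_min for j in jobs if running[j.name] > 0)


jobs = [
    Job("JobA", 0, 1500, 24 * 60),
    Job("JobB", 30, 1000, 2 * 60),
    Job("JobC", 40, 3000, 10),
]

running = allocate_without_reclaim(jobs)
for job in jobs:
    share = running[job.name] / TOTAL_SLOTS
    print(f"{job.name}: {running[job.name]} slots ({share:.0%} of capacity)")
print("first slot freed at minute", first_slot_release(jobs, running))
```

Under these assumptions JobA ends up with 1500 slots, JobB with 500 and JobC 
with none, and the first slot is released when JobB's maps complete at 3:30pm.  
Changing JobA to 2000 maps leaves the other two jobs with no slots until 
JobA's maps finish 24 hours after submission.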

> With fair scheduler, long running jobs can easily occupy a lot of task slots
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-5701
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5701
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/fair-share
>            Reporter: Tsz Wo (Nicholas), SZE
>
> The current fair scheduler implementation favors long-running jobs: once a 
> task slot is assigned to a job, the fair scheduler is not able to reclaim it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
