[
https://issues.apache.org/jira/browse/HADOOP-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12700284#action_12700284
]
Tsz Wo (Nicholas), SZE commented on HADOOP-5701:
------------------------------------------------
Consider a cluster with 2000 map slots and jobs submitted in the following
sequence:
|1:00pm|JobA|1500 maps, each map runs 24 hours|
|1:30pm|JobB|1000 maps, each map runs 2 hours|
|1:40pm|JobC|3000 maps, each map runs 10 minutes|
All 1500 maps in JobA are scheduled at 1:00pm, leaving only 500 free map slots
in the cluster. When JobB arrives 30 minutes later, only 500 of its 1000 maps
can be scheduled. When JobC arrives at 1:40pm, none of its maps can be
scheduled until the first maps in JobB finish at 3:30pm, 2 hours after they
started.
In this case, JobA holds 75% of the capacity for as long as its maps run, and
JobB and JobC are never able to obtain 1/N of the capacity. If JobA had 2000
maps instead, the other jobs would have to wait for maps in JobA to finish and
would make no progress for 24 hours.
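The timeline above can be reproduced with a small simulation. This is only a
sketch under stated assumptions, not the Hadoop fair scheduler code: the
function simulate(), the JOBS table and the assignment rule (each free slot
goes to the waiting job with the fewest running maps, and a slot is never
taken back before its map finishes) are all hypothetical stand-ins chosen for
illustration. Times are minutes after 1:00pm.

```python
import heapq

TOTAL_SLOTS = 2000

# (submit time in minutes after 1:00pm, job name, number of maps, minutes per map)
JOBS = [
    (0, "JobA", 1500, 24 * 60),
    (30, "JobB", 1000, 2 * 60),
    (40, "JobC", 3000, 10),
]


def simulate(jobs, total_slots, sample_times):
    """Return {time: {job: running maps}} for a scheduler without preemption."""
    pending = {}    # job name -> maps still waiting for a slot
    running = {}    # job name -> maps currently holding a slot
    duration = {}   # job name -> minutes per map
    releases = []   # min-heap of (finish time, job name, slot count)
    snapshots = {}
    horizon = max(sample_times)
    clock = sorted({submit for submit, *_ in jobs} | set(sample_times))
    heapq.heapify(clock)
    seen = set()
    while clock:
        now = heapq.heappop(clock)
        if now in seen or now > horizon:
            continue
        seen.add(now)
        # Slots are only freed when the maps holding them finish.
        while releases and releases[0][0] <= now:
            _, name, count = heapq.heappop(releases)
            running[name] -= count
        # Register jobs submitted at this time.
        for submit, name, maps, minutes in jobs:
            if submit == now:
                pending[name], running[name], duration[name] = maps, 0, minutes
        # Assumed rule: give each free slot to the waiting job with the
        # fewest running maps (ties broken by name). No slot is reclaimed.
        free = total_slots - sum(running.values())
        granted = {}
        while free > 0:
            waiting = [n for n in pending if pending[n] > 0]
            if not waiting:
                break
            name = min(waiting, key=lambda n: (running[n], n))
            pending[name] -= 1
            running[name] += 1
            granted[name] = granted.get(name, 0) + 1
            free -= 1
        for name, count in granted.items():
            finish = now + duration[name]
            heapq.heappush(releases, (finish, name, count))
            heapq.heappush(clock, finish)
        if now in sample_times:
            snapshots[now] = dict(running)
    return snapshots


# Sample at 1:30pm, 1:40pm, 3:29pm and 3:30pm.
snapshots = simulate(JOBS, TOTAL_SLOTS, sample_times=(30, 40, 149, 150))
print(snapshots[40])   # {'JobA': 1500, 'JobB': 500, 'JobC': 0}
print(snapshots[149]["JobC"], snapshots[150]["JobC"])
```

Under these assumptions JobC holds no slot from 1:40pm until 3:30pm, when the
first 500 maps of JobB finish, while JobA keeps 1500 of the 2000 slots the
whole time.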
> With fair scheduler, long running jobs can easily occupy a lot of task slots
> ----------------------------------------------------------------------------
>
> Key: HADOOP-5701
> URL: https://issues.apache.org/jira/browse/HADOOP-5701
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/fair-share
> Reporter: Tsz Wo (Nicholas), SZE
>
> The current fair scheduler implementation favors long-running jobs: once a
> task slot is assigned to a job, the fair scheduler is not able to reclaim it.