[
https://issues.apache.org/jira/browse/HADOOP-4665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12698187#action_12698187
]
Hemanth Yamijala commented on HADOOP-4665:
------------------------------------------
Matei, I've not been looking at this code much, but based on the discussion, I
have only one comment, regarding turning off pre-emption.
Your use case of organizations wanting to run with pre-emption disabled, but
still seeing when pre-emption would have happened, seems to me like the dry-run
mode found in utilities such as RPM. As you've explained, it looks like there
are real use cases for this.
From our work with the capacity scheduler, we've found there are environments
where pre-emption is indeed not necessary. Even when it exists, it has proved
to be a complex feature to reason about. From this perspective, it seems like
it may make sense to provide an option to completely turn it off and have
reasonable confidence that nothing related to pre-emption would be in effect,
including any additional computation that it requires.
Hence, my suggestion is the following: have a flag to truly turn off
pre-emption, and a variable that allows a dry-run of pre-emption when it is
enabled. I believe this may not be a very difficult change. (Indeed, I've been
thinking of cases where a dry-run of the entire scheduling logic makes sense -
for e.g. to get a 'scheduling log' that can be replayed.)
The flip side of my proposal is an additional configuration option. But
depending on what we think the right defaults are, we can still make the
configuration easy for end users, no? To that extent, your arguments about the
proposed default values are fine with me.
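The two-flag scheme suggested above could be sketched as follows. This is only
an illustration of the proposed semantics; the class and the configuration key
names in the comments are hypothetical, not actual fair-scheduler code or keys.

```java
/**
 * Hypothetical sketch of the proposed preemption controls:
 * one flag that truly turns preemption off (skipping even the
 * bookkeeping it needs), and one that enables a dry-run mode
 * where preemption decisions are logged but no tasks are killed.
 */
public class PreemptionFlags {
    // e.g. something like mapred.fairscheduler.preemption (hypothetical key)
    private final boolean preemptionEnabled;
    // e.g. something like mapred.fairscheduler.preemption.dry-run (hypothetical key)
    private final boolean dryRun;

    public PreemptionFlags(boolean preemptionEnabled, boolean dryRun) {
        this.preemptionEnabled = preemptionEnabled;
        this.dryRun = dryRun;
    }

    /** True when any preemption-related computation should run at all. */
    public boolean shouldComputePreemption() {
        return preemptionEnabled;
    }

    /** True when tasks should actually be killed, not merely logged. */
    public boolean shouldKillTasks() {
        return preemptionEnabled && !dryRun;
    }
}
```

With preemption fully off, no preemption computation runs; with preemption on
and dry-run set, decisions are computed and logged but nothing is killed.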
> Add preemption to the fair scheduler
> ------------------------------------
>
> Key: HADOOP-4665
> URL: https://issues.apache.org/jira/browse/HADOOP-4665
> Project: Hadoop Core
> Issue Type: New Feature
> Components: contrib/fair-share
> Reporter: Matei Zaharia
> Assignee: Matei Zaharia
> Fix For: 0.21.0
>
> Attachments: fs-preemption-v0.patch, hadoop-4665-v1.patch,
> hadoop-4665-v1b.patch, hadoop-4665-v2.patch, hadoop-4665-v3.patch,
> hadoop-4665-v4.patch
>
>
> Task preemption is necessary in a multi-user Hadoop cluster for two reasons:
> users might submit long-running tasks by mistake (e.g. an infinite loop in a
> map program), or tasks may be long due to having to process large amounts of
> data. The Fair Scheduler (HADOOP-3746) has a concept of guaranteed capacity
> for certain queues, as well as a goal of providing good performance for
> interactive jobs on average through fair sharing. Therefore, it will support
> preemption under two conditions:
> 1) A job isn't getting its _guaranteed_ share of the cluster for at least T1
> seconds.
> 2) A job is getting significantly less than its _fair_ share for T2 seconds
> (e.g. less than half its share).
> T1 will be chosen smaller than T2 (and will be configurable per queue) to
> meet guarantees quickly. T2 is meant as a last resort in case non-critical
> jobs in queues with no guaranteed capacity are being starved.
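The two timed conditions in the quoted description could be sketched as below.
This is a hypothetical illustration of the logic, not code from the attached
patches; all names and signatures are made up.

```java
/**
 * Sketch of the two preemption triggers described in the issue:
 * (1) below the guaranteed (min) share for at least T1 ms, and
 * (2) below half the fair share for at least T2 ms, with T1 < T2.
 */
public class PreemptionCheck {
    /** Condition 1: under guaranteed share continuously for at least t1 ms. */
    static boolean belowMinShareLongEnough(int runningTasks, int minShare,
                                           long belowSinceMs, long nowMs, long t1Ms) {
        return runningTasks < minShare && nowMs - belowSinceMs >= t1Ms;
    }

    /** Condition 2: under half the fair share continuously for at least t2 ms. */
    static boolean belowHalfFairShareLongEnough(int runningTasks, double fairShare,
                                                long belowSinceMs, long nowMs, long t2Ms) {
        return runningTasks < fairShare / 2 && nowMs - belowSinceMs >= t2Ms;
    }
}
```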
> When deciding which tasks to kill to make room for the job, we will use the
> following heuristics:
> - Look for tasks to kill only in jobs that have more than their fair share,
> ordering these by deficit (most overscheduled jobs first).
> - For maps: kill tasks that have run for the least amount of time (limiting
> wasted time).
> - For reduces: similar to maps, but give extra preference to reduces in the
> copy phase where there is not much map output per task (at Facebook, we have
> observed this to be the main time we need preemption - when a job has a long
> map phase and its reducers are mostly sitting idle and filling up slots).
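The victim-selection heuristics quoted above could be sketched as follows.
Again this is a hypothetical illustration, not the patch's actual code; the
Job class and the deficit convention (jobs owed the least fair-share time are
the most overscheduled) are assumptions made for the example.

```java
import java.util.*;
import java.util.stream.Collectors;

/** Sketch of the victim-selection heuristics (all names hypothetical). */
public class VictimSelection {
    static class Job {
        final String name;
        final int runningTasks;
        final double fairShare;
        final long deficit; // fair-share time owed; overscheduled jobs have a low deficit

        Job(String name, int runningTasks, double fairShare, long deficit) {
            this.name = name;
            this.runningTasks = runningTasks;
            this.fairShare = fairShare;
            this.deficit = deficit;
        }
    }

    /** Jobs above their fair share, most overscheduled (lowest deficit) first. */
    static List<Job> candidateJobs(List<Job> jobs) {
        return jobs.stream()
                .filter(j -> j.runningTasks > j.fairShare)
                .sorted(Comparator.comparingLong((Job j) -> j.deficit))
                .collect(Collectors.toList());
    }

    /** Among a job's tasks, prefer killing the one that has run the shortest
     *  time, limiting wasted work. */
    static long pickTaskToKill(List<Long> taskRuntimesMs) {
        return Collections.min(taskRuntimesMs);
    }
}
```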