[ 
https://issues.apache.org/jira/browse/YARN-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16496872#comment-16496872
 ] 

Haibo Chen commented on YARN-8250:
----------------------------------

[~asuresh], [~leftnoteasy] and I had an offline discussion about this again. 

We think one alternative to avoid two different implementations of the 
container scheduler is to modify the behavior of the existing 
ContainerScheduler to accommodate the requirements of NM over-allocation. 
Specifically, the behavior of the current ContainerScheduler would change as 
follows:

Before: 

1) Upon a GUARANTEED container scheduling event, always queue the GUARANTEED 
container first and then check if any OPPORTUNISTIC container needs to be 
preempted. If so, wait for the OPPORTUNISTIC container(s) to be killed. 
Otherwise, launch the GUARANTEED container.

2) Upon an OPPORTUNISTIC container scheduling event, queue the container first 
and only launch the OPPORTUNISTIC container if there is enough room.

3) Upon any container completed or finished event that signals that resources 
have been released, check if any container (GUARANTEED containers first, then 
OPPORTUNISTIC containers) can be launched.
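
For illustration only, the current behavior in 1) to 3) above could be 
sketched roughly as below. This is a simplified model, not the actual 
NodeManager code: the class name CurrentSchedulerSketch, its methods, and the 
memory-only accounting are all assumptions made for the example, and a kill is 
modeled as the caller later reporting the container as finished.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the "Before" behavior; names are illustrative only.
class CurrentSchedulerSketch {
    enum ExecType { GUARANTEED, OPPORTUNISTIC }

    static final class Container {
        final String id;
        final ExecType type;
        final int memoryMb;
        Container(String id, ExecType type, int memoryMb) {
            this.id = id;
            this.type = type;
            this.memoryMb = memoryMb;
        }
    }

    private final int capacityMb;
    private int usedMb = 0;
    private final Deque<Container> queuedGuaranteed = new ArrayDeque<>();
    private final Deque<Container> queuedOpportunistic = new ArrayDeque<>();
    private final Map<String, Container> running = new LinkedHashMap<>();
    private final Set<String> killRequested = new LinkedHashSet<>();

    CurrentSchedulerSketch(int capacityMb) { this.capacityMb = capacityMb; }

    // Rules 1 and 2: always queue first, then launch only if there is room.
    void onSchedule(Container c) {
        if (c.type == ExecType.GUARANTEED) {
            queuedGuaranteed.add(c);
            requestPreemptionIfNeeded();
        } else {
            queuedOpportunistic.add(c);
        }
        launchQueued();
    }

    // Rule 3: released resources trigger another launch attempt.
    void onContainerFinished(String id) {
        Container c = running.remove(id);
        if (c != null) {
            usedMb -= c.memoryMb;
        }
        killRequested.remove(id);
        launchQueued();
    }

    // GUARANTEED containers first; OPPORTUNISTIC only once none are waiting.
    private void launchQueued() {
        if (drain(queuedGuaranteed)) {
            drain(queuedOpportunistic);
        }
    }

    // Launches from the head of the queue until one does not fit.
    private boolean drain(Deque<Container> queue) {
        while (!queue.isEmpty()) {
            Container c = queue.peek();
            if (usedMb + c.memoryMb > capacityMb) {
                return false;
            }
            queue.poll();
            running.put(c.id, c);
            usedMb += c.memoryMb;
        }
        return true;
    }

    // Marks running OPPORTUNISTIC containers to be killed until the queued
    // GUARANTEED demand would fit, counting kills that are already pending.
    private void requestPreemptionIfNeeded() {
        int shortfallMb = usedMb - capacityMb;
        for (Container c : queuedGuaranteed) {
            shortfallMb += c.memoryMb;
        }
        for (Container c : running.values()) {
            if (killRequested.contains(c.id)) {
                shortfallMb -= c.memoryMb;
            }
        }
        for (Container c : running.values()) {
            if (shortfallMb <= 0) {
                break;
            }
            if (c.type == ExecType.OPPORTUNISTIC && killRequested.add(c.id)) {
                shortfallMb -= c.memoryMb;
            }
        }
    }

    List<String> runningIds() { return new ArrayList<>(running.keySet()); }
    List<String> pendingKills() { return new ArrayList<>(killRequested); }
}
```

Note that in this model a GUARANTEED container that needs preemption is not 
launched until the killed OPPORTUNISTIC container is reported as finished, 
which is the launch latency discussed in this comment.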

After:

1) Upon a GUARANTEED container scheduling event, launch the GUARANTEED 
container immediately (without queuing). Rely on cgroups OOM control 
(YARN-6677) to preempt OPPORTUNISTIC containers as necessary.

2) Upon an OPPORTUNISTIC container scheduling event, simply queue the 
OPPORTUNISTIC container. 

3) Upon any container completed or finished event, do not try to launch any 
container.

4) Introduce a periodic check (in the ContainersMonitor thread) that launches 
OPPORTUNISTIC containers. Ideally, the period is configurable so that the 
latency to launch OPPORTUNISTIC containers can be reduced.
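
As a rough sketch of the proposed behavior in 1) to 4) above (again a 
simplified model rather than real NodeManager code): the class 
ProposedSchedulerSketch, its method names, and the availableMb argument are 
hypothetical. It also assumes that the periodic check launches queued 
containers in FIFO order and stops at the first one that does not fit, which 
is a policy choice this comment does not specify.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of the "After" behavior; names are illustrative only.
class ProposedSchedulerSketch {
    enum ExecType { GUARANTEED, OPPORTUNISTIC }

    static final class Container {
        final String id;
        final ExecType type;
        final int memoryMb;
        Container(String id, ExecType type, int memoryMb) {
            this.id = id;
            this.type = type;
            this.memoryMb = memoryMb;
        }
    }

    private final Queue<Container> queuedOpportunistic = new ArrayDeque<>();
    private final List<String> launched = new ArrayList<>();

    // Rule 1: GUARANTEED containers launch immediately, without queuing.
    // Preemption is not modeled here; it is left to cgroups OOM control.
    // Rule 2: OPPORTUNISTIC containers are only queued.
    void onSchedule(Container c) {
        if (c.type == ExecType.GUARANTEED) {
            launched.add(c.id);
        } else {
            queuedOpportunistic.add(c);
        }
    }

    // Rule 3: a finished container does not trigger any launch.
    void onContainerFinished(String id) {
        launched.remove(id);
    }

    // Rule 4: invoked periodically (e.g. from a monitor thread) with the
    // memory currently considered available; returns the number launched.
    int periodicCheck(int availableMb) {
        int started = 0;
        while (!queuedOpportunistic.isEmpty()
                && queuedOpportunistic.peek().memoryMb <= availableMb) {
            Container c = queuedOpportunistic.poll();
            availableMb -= c.memoryMb;
            launched.add(c.id);
            started++;
        }
        return started;
    }

    List<String> launchedIds() { return new ArrayList<>(launched); }
    int queuedCount() { return queuedOpportunistic.size(); }
}
```

In this model, how quickly a queued OPPORTUNISTIC container starts depends 
only on how often periodicCheck is called, which is why a configurable period 
matters.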

As we have discussed in previous comments, this reduces the latency to launch 
GUARANTEED containers and allows us to control how aggressively OPPORTUNISTIC 
containers are launched, which is especially important for reliability when 
over-allocation is turned on. The code can be a lot simpler as well.

*But it does increase the latency to launch OPPORTUNISTIC containers in cases 
where over-allocation is not on, because we give up opportunities to launch 
them when containers finish or are paused*. In addition, it adds a 
dependency on cgroup OOM control to preempt OPPORTUNISTIC containers, even 
though I'd argue it's best to turn on cgroup isolation anyway to ensure 
GUARANTEED containers are not adversely impacted by running OPPORTUNISTIC 
containers.

Let us know your thoughts, and whether the workloads you are running would be 
okay with the change. [~leftnoteasy] Please add anything that I may have missed.

> Create another implementation of ContainerScheduler to support NM 
> overallocation
> --------------------------------------------------------------------------------
>
>                 Key: YARN-8250
>                 URL: https://issues.apache.org/jira/browse/YARN-8250
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>            Priority: Major
>         Attachments: YARN-8250-YARN-1011.00.patch, 
> YARN-8250-YARN-1011.01.patch, YARN-8250-YARN-1011.02.patch
>
>
> YARN-6675 adds NM over-allocation support by modifying the existing 
> ContainerScheduler and providing a utilizationBased resource tracker.
> However, the implementation adds a lot of complexity to ContainerScheduler, 
> and future tweaks of the over-allocation strategy based on how many 
> containers have been launched would be even more complicated.
> As such, this Jira proposes a new ContainerScheduler that always launches 
> guaranteed containers immediately and queues opportunistic containers. It 
> relies on a periodic check to launch opportunistic containers. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
