Brian Goerlitz created YARN-11428:
-------------------------------------

             Summary: FairScheduler: Expected preemption may not happen if node has enough free resources
                 Key: YARN-11428
                 URL: https://issues.apache.org/jira/browse/YARN-11428
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Brian Goerlitz
An application can be starved of its fair share under the following conditions:
* intra-queue preemption is needed in order for a new application to receive resources
* the first NodeManager checked for preemption already has idle resources greater than the required resources
* containers belonging to a different queue, which is using no more than its fair share, are running on that node

Illustration using a single-node cluster for simplicity:
{noformat}
yarn.nodemanager.resource.memory-mb = 9216
yarn.nodemanager.resource.cpu-vcores = 18
yarn.scheduler.fair.preemption = true
yarn.scheduler.fair.preemption.cluster-utilization-threshold = 0.5
{noformat}

FairScheduler config:
{code:java}
<allocations>
...
  <queue name="default">
    <weight>1.0</weight>
    <schedulingPolicy>drf</schedulingPolicy>
  </queue>
  <queue name="limited">
    <maxResources>memory-mb=33.0%, vcores=33.0%</maxResources>
    <weight>1.0</weight>
    <schedulingPolicy>drf</schedulingPolicy>
  </queue>
  <defaultFairSharePreemptionTimeout>5</defaultFairSharePreemptionTimeout>
  <defaultFairSharePreemptionThreshold>1.0</defaultFairSharePreemptionThreshold>
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
...
</allocations>
{code}

Procedure:
# Launch an application (app1) in root.limited which will consume the queue's max resources.
# Launch an application (app2) in root.default which will consume no more than the queue's fair share.
# Launch another application (app3) in root.limited with a container size smaller than the remaining cluster capacity.

Expected result: resources from app1 should be preempted and given to app3 until app3 has its fair share.

In actuality, this does not always happen. When {{FSPreemptionThread}} iterates over the containers on the node, if the first container belongs to app2, it will not be eligible for preemption (since preempting it would take app2 below its fair share).
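For concreteness, the configuration above works out roughly as follows (a back-of-the-envelope sketch with assumed rounding, not actual scheduler output): root.limited is capped at about a third of the node, so even with app1 at its cap and app2 within its fair share, the node still has idle headroom. That headroom is exactly what trips the faulty check, yet it is unusable by app3 because root.limited is already at its maxResources cap.
{code:java}
// Back-of-the-envelope arithmetic for the configuration above
// (assumed rounding; the scheduler's internal math may differ slightly).
public class FairShareArithmetic {
    static final int CLUSTER_MEM_MB = 9216;     // yarn.nodemanager.resource.memory-mb
    static final int CLUSTER_VCORES = 18;       // yarn.nodemanager.resource.cpu-vcores
    static final double LIMITED_MAX_PCT = 0.33; // root.limited maxResources

    // Hard cap on root.limited: roughly a third of the node.
    static int limitedMaxMemMb() {
        return (int) (CLUSTER_MEM_MB * LIMITED_MAX_PCT);
    }

    static int limitedMaxVcores() {
        return (int) (CLUSTER_VCORES * LIMITED_MAX_PCT);
    }

    // With equal weights and both queues active, each queue's
    // instantaneous fair share is half the cluster.
    static int fairShareMemMb() {
        return CLUSTER_MEM_MB / 2;
    }

    public static void main(String[] args) {
        System.out.println("root.limited cap: " + limitedMaxMemMb()
            + " MB / " + limitedMaxVcores() + " vcores");
        System.out.println("per-queue fair share: " + fairShareMemMb() + " MB");
    }
}
{code}
With app1 pinned at the ~3 GB cap and app2 using at most its fair share, several gigabytes on the node remain free, which is why the preemption scan concludes the node can already satisfy app3's request.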
Because the node already has enough free capacity for the new container, the next container in the list is never checked and an empty {{PreemptableContainers}} is returned. Since the returned list contains no AM containers, in a multi-node scenario no other nodes are checked either. No container is preempted, and until the usage pattern changes, app3 is unable to obtain its fair share of resources.
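The failure mode can be illustrated with a simplified, self-contained sketch (all names are hypothetical; this is not the actual {{FSPreemptionThread}} code, which scans each node's running containers and stops once free plus to-be-preempted resources cover the request):
{code:java}
import java.util.ArrayList;
import java.util.List;

// Simplified model of the per-node preemption scan described above.
// Hypothetical names; the real logic lives in FSPreemptionThread.
public class PreemptionScanSketch {

    static class Container {
        final int memoryMb;
        final boolean preemptable; // false if the owning app is at or below fair share

        Container(int memoryMb, boolean preemptable) {
            this.memoryMb = memoryMb;
            this.preemptable = preemptable;
        }
    }

    /**
     * Returns the containers to preempt on one node, or null if the
     * request cannot be satisfied there. Mirrors the early exit in the
     * report: the scan stops as soon as available resources cover the
     * request, even if the candidate list is still empty.
     */
    static List<Container> identifyContainersToPreempt(
            int requestedMb, int nodeFreeMb, List<Container> running) {
        List<Container> toPreempt = new ArrayList<>();
        int available = nodeFreeMb;
        for (Container c : running) {
            if (c.preemptable) {
                toPreempt.add(c);
                available += c.memoryMb;
            }
            if (available >= requestedMb) {
                // If the node's idle space alone covers the request, we get
                // here after examining only app2's (non-preemptable)
                // container and return an EMPTY list.
                return toPreempt;
            }
        }
        return available >= requestedMb ? toPreempt : null;
    }
}
{code}
The early-exit check does not distinguish "the request fits thanks to containers chosen for preemption" from "the node simply has idle space that the starved application cannot use", so an empty candidate list is treated as success for that node and the scan never reaches app1's preemptable containers.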