[ 
https://issues.apache.org/jira/browse/YARN-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16157189#comment-16157189
 ] 

Wangda Tan commented on YARN-7149:
----------------------------------

[~jlowe], [~eepayne], [~sunilg], apologies for my late response. I just checked 
the behavior, and it is not the same as what Jason and Eric described: 

I tried writing a unit test: 
1) A super fat node has 1000G of memory. 
2) Submit app1 to the queue as user1 and app2 to the queue as user2. Each of 
them asks for 1000 * 5G containers with locality.

When UL=100, a single node heartbeat can allocate 200 containers to app1 (no 
capacity is left for app2).
When UL=50, a single node heartbeat can allocate 100 containers to app1 and 100 
containers to app2. 

The userLimit calculation formula mentioned by [~eepayne] is not correct; it 
should be: 
{code}
userLimitResource = max{
   ceil(queueResourceUsedByActiveUsers / #activeUsers),
   ceil(queueConfiguredCapacity * userLimit%) 
}
{code}
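
As a rough illustration only (not the CapacityScheduler code), the formula can be written as a small Python function. The function and parameter names are hypothetical, resources are plain integers in MB, and {{ceil}} is assumed here to mean rounding up to a multiple of an allocation step ({{step_mb}}); that rounding granularity is my assumption, not something stated above.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up (avoids float rounding)."""
    return -(-a // b)


def ceil_to(value_mb: int, step_mb: int) -> int:
    """Round value_mb up to the next multiple of step_mb."""
    return ceil_div(value_mb, step_mb) * step_mb


def user_limit_mb(used_by_active_users_mb: int,
                  active_users: int,
                  queue_capacity_mb: int,
                  user_limit_pct: int,
                  step_mb: int = 1024) -> int:
    """Hypothetical sketch of the formula in the comment.

    userLimitResource = max(
        ceil(queueResourceUsedByActiveUsers / #activeUsers),
        ceil(queueConfiguredCapacity * userLimit%))
    """
    # Share of the currently used resource per active user.
    per_user_share = ceil_to(
        ceil_div(used_by_active_users_mb, active_users), step_mb)
    # Share derived from the queue's configured capacity and userLimit%.
    configured_share = ceil_to(
        ceil_div(queue_capacity_mb * user_limit_pct, 100), step_mb)
    return max(per_user_share, configured_share)
```

For example, with a 1000G queue, two active users, and nothing allocated yet, {{user_limit_mb(0, 2, 1000 * 1024, 50)}} evaluates to 512000 MB (500G), because the configured-capacity term dominates.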

Because of the {{ceil}} operation, we get a new UL after each container 
allocation, and because user limit validation is a >= check rather than a 
strict > check, it won't slow down container allocation. 

However, I can see one issue here: instead of using the {{max}} operation in the 
userLimitResource calculation, we should use {{min}}. Otherwise: 
- When we have two active users in the queue and userLimit is set to 100, the 
first user will always be preferred until the queue reaches maxCapacity.
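
The two-user scenario from the unit test can be approximated with a toy simulation of a single node heartbeat. This is only a sketch under stated assumptions, not the scheduler's actual logic: the queue is assumed to own the whole 1000G node, users are offered containers in submission order, a container is granted only if the user's usage plus the container still fits within the limit computed with {{max}}, and the limit is recomputed after every allocation. All names are hypothetical.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up."""
    return -(-a // b)


def simulate_heartbeat(user_limit_pct: int,
                       node_gb: int = 1000,
                       container_gb: int = 5,
                       users=("user1", "user2")) -> dict:
    """Toy model of one node heartbeat; returns containers per user."""
    queue_capacity_gb = node_gb  # assumption: queue capacity == node size
    used = {u: 0 for u in users}
    free = node_gb
    progress = True
    while free >= container_gb and progress:
        progress = False
        for u in users:  # earlier users are offered resources first
            total_used = sum(used.values())
            # max-based formula from the comment, recomputed each time
            limit = max(
                ceil_div(total_used, len(users)),
                ceil_div(queue_capacity_gb * user_limit_pct, 100),
            )
            # assumption: grant only if the user stays within the limit
            if used[u] + container_gb <= limit:
                used[u] += container_gb
                free -= container_gb
                progress = True
                break  # start again from the first user
    return {u: used[u] // container_gb for u in users}


if __name__ == "__main__":
    print("UL=100:", simulate_heartbeat(100))
    print("UL=50: ", simulate_heartbeat(50))
```

Under these assumptions the model gives 200 containers to user1 and none to user2 at UL=100, and 100 containers to each user at UL=50, which is consistent with the numbers from the unit test and shows how the first user takes the whole queue when userLimit is 100.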

> Cross-queue preemption sometimes starves an underserved queue
> -------------------------------------------------------------
>
>                 Key: YARN-7149
>                 URL: https://issues.apache.org/jira/browse/YARN-7149
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.9.0, 3.0.0-alpha3
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>         Attachments: YARN-7149.demo.unit-test.patch
>
>
> In branch 2 and trunk, I am consistently seeing some use cases where 
> cross-queue preemption does not happen when it should. I do not see this in 
> branch-2.8.
> Use Case:
> | | *Size* | *Minimum Container Size* |
> |MyCluster | 20 GB | 0.5 GB |
> | *Queue Name* | *Capacity* | *Absolute Capacity* | *Minimum User Limit 
> Percent (MULP)* | *User Limit Factor (ULF)* |
> |Q1 | 50% = 10 GB | 100% = 20 GB | 10% = 1 GB | 2.0 |
> |Q2 | 50% = 10 GB | 100% = 20 GB | 10% = 1 GB | 2.0 |
> - {{User1}} launches {{App1}} in {{Q1}} and consumes all resources (20 GB)
> - {{User2}} launches {{App2}} in {{Q2}} and requests 10 GB
> - _Note: containers are 0.5 GB._
> - Preemption monitor kills 2 containers (equals 1 GB) from {{App1}} in {{Q1}}.
> - Capacity Scheduler assigns 2 containers (equals 1 GB) to {{App2}} in {{Q2}}.
> - _No more containers are ever preempted, even though {{Q2}} is far 
> underserved_


