[ https://issues.apache.org/jira/browse/YARN-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155619#comment-16155619 ]

Sunil G commented on YARN-7149:
-------------------------------

Thanks [~eepayne] for the detailed analysis, and thanks [~jlowe] for the 
details; it makes sense. I think the optimization made in YARN-5889 had two 
parts: one on the allocation side, to make resources gradually shared among all 
users (Jason has given a very detailed explanation of this), and a second on 
the preemption side. For the preemption calculation, we have to consider all 
users' resources, since some users could be non-active as well. In the example 
given here, I think App1 would have become non-active once all of its resource 
requests were served. [~eepayne], please correct me if I am wrong.

In this case I think {{getTotalPendingResourcesConsideringUserLimit}} is not 
correct:
{code}
        if (!userNameToHeadroom.containsKey(userName)) {
          User user = getUser(userName);
          Resource headroom = Resources.subtract(
              getResourceLimitForActiveUsers(app.getUser(), clusterResources,
                  partition, SchedulingMode.RESPECT_PARTITION_EXCLUSIVITY),
              user.getUsed(partition));
          // Make sure headroom is not negative.
          headroom = Resources.componentwiseMax(headroom, Resources.none());
          userNameToHeadroom.put(userName, headroom);
        }
{code}

Here I think we have to use {{getResourceLimitForAllUsers}} instead of 
{{getResourceLimitForActiveUsers}}. I will wait for Eric to confirm whether 
App1 was active or not.
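
To illustrate the idea, here is a minimal, self-contained sketch (hypothetical class and method names; resources reduced to plain MB longs, not the scheduler's {{Resource}} type) of computing a queue's preemptable pending resources with each user's pending capped by a headroom derived from the limit for *all* users, active or not:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cap each user's pending resources by a headroom
// computed against the user limit for ALL users (active or not).
public class PendingWithUserLimitSketch {

  // Headroom = limit - used, floored at zero (mirrors the
  // componentwiseMax(headroom, none()) guard in the quoted code).
  public static long headroom(long userLimitForAllUsersMb, long userUsedMb) {
    return Math.max(userLimitForAllUsersMb - userUsedMb, 0);
  }

  // perUser maps user name -> {usedMb, pendingMb}. A user's pending counts
  // toward the queue's preemptable total only up to that user's headroom.
  public static long totalPendingConsideringUserLimit(
      Map<String, long[]> perUser, long userLimitForAllUsersMb) {
    long total = 0;
    Map<String, Long> userNameToHeadroom = new HashMap<>();
    for (Map.Entry<String, long[]> e : perUser.entrySet()) {
      long used = e.getValue()[0];
      long pending = e.getValue()[1];
      // Compute headroom once per user, as the real code caches per owner.
      long hr = userNameToHeadroom.computeIfAbsent(
          e.getKey(), u -> headroom(userLimitForAllUsersMb, used));
      total += Math.min(pending, hr);
    }
    return total;
  }

  public static void main(String[] args) {
    // user1 uses 20 GB against a 10 GB all-users limit: headroom 0.
    // user2 uses 1 GB with 9 GB pending: all 9 GB counts.
    Map<String, long[]> perUser = new HashMap<>();
    perUser.put("user1", new long[] {20480, 0});
    perUser.put("user2", new long[] {1024, 9216});
    System.out.println(
        totalPendingConsideringUserLimit(perUser, 10240)); // prints 9216
  }
}
```

The point of the sketch is only the choice of limit: using the all-users limit keeps a non-active user's usage in the calculation, so the pending total does not collapse when an app like App1 deactivates.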

Apart from this,
bq. Do we really need to keep the assignments balanced as users grow to their 
limit?
I think we were trying to achieve a uniform allocation pattern here. I can try 
to share more test results showing how slow this is, and will wait for more 
discussion here in the meantime.

> Cross-queue preemption sometimes starves an underserved queue
> -------------------------------------------------------------
>
>                 Key: YARN-7149
>                 URL: https://issues.apache.org/jira/browse/YARN-7149
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 2.9.0, 3.0.0-alpha3
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>
> In branch-2 and trunk, I am consistently seeing some use cases where 
> cross-queue preemption does not happen when it should. I do not see this in 
> branch-2.8.
> Use Case:
> | | *Size* | *Minimum Container Size* |
> |MyCluster | 20 GB | 0.5 GB |
> | *Queue Name* | *Capacity* | *Absolute Capacity* | *Minimum User Limit 
> Percent (MULP)* | *User Limit Factor (ULF)* |
> |Q1 | 50% = 10 GB | 100% = 20 GB | 10% = 1 GB | 2.0 |
> |Q2 | 50% = 10 GB | 100% = 20 GB | 10% = 1 GB | 2.0 |
> - {{User1}} launches {{App1}} in {{Q1}} and consumes all resources (20 GB)
> - {{User2}} launches {{App2}} in {{Q2}} and requests 10 GB
> - _Note: containers are 0.5 GB._
> - Preemption monitor kills 2 containers (equals 1 GB) from {{App1}} in {{Q1}}.
> - Capacity Scheduler assigns 2 containers (equals 1 GB) to {{App2}} in {{Q2}}.
> - _No more containers are ever preempted, even though {{Q2}} is still far 
> underserved._
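
The arithmetic in the use case above can be sketched as follows (hypothetical names; resources as plain MB longs). With Q1 guaranteed 10 GB but using all 20 GB, and Q2 holding 10 GB of pending requests, the ideal amount to reclaim is 10 GB, not the 1 GB observed:

```java
// Hypothetical sketch of the cross-queue preemption arithmetic: Q2's unmet
// demand should be reclaimed from whatever Q1 holds above its guarantee.
public class PreemptionArithmeticSketch {

  // Preempt no more than Q1's usage over its guarantee, and no more than
  // Q2 actually has pending.
  public static long idealToPreemptMb(long q1UsedMb, long q1GuaranteedMb,
      long q2PendingMb) {
    long q1OverCapacity = Math.max(q1UsedMb - q1GuaranteedMb, 0);
    return Math.min(q1OverCapacity, q2PendingMb);
  }

  public static void main(String[] args) {
    // 20 GB cluster: Q1 guaranteed 10 GB, using 20 GB; Q2 has 10 GB pending.
    System.out.println(idealToPreemptMb(20480, 10240, 10240)); // prints 10240
  }
}
```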



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
