[ 
https://issues.apache.org/jira/browse/YARN-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Payne updated YARN-3769:
-----------------------------
    Attachment: YARN-3769-branch-2.7.006.patch

[~leftnoteasy], thanks for your comments.
{quote}
The problem is that getUserResourceLimit is not always updated by the scheduler. If a 
queue is not traversed by the scheduler OR apps of a queue-user have a long heartbeat 
interval, the user resource limit could be stale.
{quote}
Got it.
{quote}
I found that the 0005 patch for trunk computes the user limit every time, while the 
0005 patch for 2.7 uses getUserResourceLimit.
{quote}
Yes, I was concerned about using the 2.7 version of {{computeUserLimit}}. It 
differs from the branch-2 and trunk versions: it expects a {{required}} 
parameter which, in 2.7, is calculated in {{assignContainers}} based on an 
app's capability requests for a given container priority. In branch-2 and 
trunk, it looks like this {{required}} parameter is simply given the value of 
{{minimumAllocation}}.

So, in {{YARN-3769-branch-2.7.006.patch}}, I pass {{minimumAllocation}} as the 
{{required}} argument to {{computeUserLimit}}.
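
To illustrate why the value passed as {{required}} matters, here is a minimal, self-contained sketch. It is not the Hadoop code and not the patch: the class {{UserLimitSketch}}, its parameters, and the single-dimension (memory in MB) arithmetic are hypothetical, and the formula is only an assumption about the general shape of a user-limit computation in which {{required}} is added to the queue's usage once the queue is at or over capacity.

```java
// Hypothetical, simplified model of a user-limit computation.
// All names and the formula itself are assumptions for illustration only;
// the real computeUserLimit works on Resource objects and takes other inputs.
final class UserLimitSketch {

    // Integer division rounded up; assumes b > 0.
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    // Round value up to the next multiple of step; assumes step > 0.
    static long roundUp(long value, long step) {
        return ceilDiv(value, step) * step;
    }

    static long computeUserLimit(long queueCapacityMb, long usedMb, long requiredMb,
                                 int activeUsers, int userLimitPercent,
                                 double userLimitFactor, long minimumAllocationMb) {
        // Treat the queue as at least large enough for the request.
        long queueCapacity = Math.max(queueCapacityMb, requiredMb);
        // Below capacity: base the limit on the capacity.
        // At or over capacity: base it on current usage plus the request.
        long currentCapacity = usedMb < queueCapacity ? queueCapacity : usedMb + requiredMb;
        // Each active user gets the larger of an equal share and the percent share...
        long equalShare = ceilDiv(currentCapacity, activeUsers);
        long percentShare = currentCapacity * userLimitPercent / 100;
        // ...capped by the queue capacity scaled by the user-limit factor.
        long cap = (long) (queueCapacity * userLimitFactor);
        long limit = Math.min(Math.max(equalShare, percentShare), cap);
        return roundUp(limit, minimumAllocationMb);
    }

    public static void main(String[] args) {
        long minAlloc = 1024;
        // Queue at capacity, two active users, 50% user limit, factor 1.0.
        System.out.println(computeUserLimit(10240, 10240, minAlloc, 2, 50, 1.0, minAlloc)); // 6144
        System.out.println(computeUserLimit(10240, 10240, 4096, 2, 50, 1.0, minAlloc));     // 7168
    }
}
```

In this model, a queue at capacity yields a different limit depending on {{required}} (6144 MB with the minimum allocation, 7168 MB with a 4096 MB request), while a queue below capacity yields the same limit for both. Under these assumptions, passing {{minimumAllocation}} gives a value that does not depend on any particular app's pending request.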

> Preemption occurring unnecessarily because preemption doesn't consider user 
> limit
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-3769
>                 URL: https://issues.apache.org/jira/browse/YARN-3769
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>         Attachments: YARN-3769-branch-2.002.patch, 
> YARN-3769-branch-2.7.002.patch, YARN-3769-branch-2.7.003.patch, 
> YARN-3769-branch-2.7.005.patch, YARN-3769-branch-2.7.006.patch, 
> YARN-3769.001.branch-2.7.patch, YARN-3769.001.branch-2.8.patch, 
> YARN-3769.003.patch, YARN-3769.004.patch, YARN-3769.005.patch
>
>
> We are seeing the preemption monitor preempting containers from queue A and 
> then seeing the capacity scheduler giving them immediately back to queue A. 
> This happens quite often and causes a lot of churn.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
