[ 
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834233#comment-15834233
 ] 

Sunil G commented on YARN-5889:
-------------------------------

Hi [~eepayne]
Thank you for the detailed comments.

bq.do we need the isAnActiveUser checks in assignContainer and releaseContainer?
bq.I removed these checks in my local build and the application is able to use 
all of the queue and cluster.
If we remove the active-user check, then 
{{activeUsersManager.getTotalResUsedByActiveUsers}} will cover all users, and 
hence it works as before. But I agree that the current computation is not 
entirely correct. For example, *user1* was initially active, and whenever a 
container was allocated for *user1*, we added its resource to 
{{AUM#TotalResUsedByActiveUsers}}. Now this user has become inactive because it 
does not have any more outstanding resource requests. So *user1*'s resources 
have to be removed from {{AUM#TotalResUsedByActiveUsers}} at that point. This is 
not happening now. Even if I fix this, there will be some changes in behavior, 
which I explain below.
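
To make the accounting fix for *user1* concrete, here is a minimal sketch. It 
is not the actual {{ActiveUsersManager}} code: the class name, method names and 
the use of a plain {{long}} for resources are all hypothetical, chosen only to 
show that a user's usage should leave the active total when the user becomes 
inactive and rejoin it when the user becomes active again.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, NOT the real ActiveUsersManager API.
class ActiveUsageSketch {
    private final Map<String, Long> usedByUser = new HashMap<>();
    private final Set<String> activeUsers = new HashSet<>();
    private long totalUsedByActiveUsers = 0;

    // A user becomes active when it has outstanding resource requests;
    // its current usage then starts counting towards the active total.
    public void activate(String user) {
        if (activeUsers.add(user)) {
            totalUsedByActiveUsers += usedByUser.getOrDefault(user, 0L);
        }
    }

    // A user becomes inactive when it has no outstanding requests left;
    // its usage is removed from the active total at that point.
    public void deactivate(String user) {
        if (activeUsers.remove(user)) {
            totalUsedByActiveUsers -= usedByUser.getOrDefault(user, 0L);
        }
    }

    // Per-user usage is always tracked; the active total only changes
    // if the user is currently active.
    public void allocate(String user, long amount) {
        usedByUser.merge(user, amount, Long::sum);
        if (activeUsers.contains(user)) {
            totalUsedByActiveUsers += amount;
        }
    }

    public void release(String user, long amount) {
        usedByUser.merge(user, -amount, Long::sum);
        if (activeUsers.contains(user)) {
            totalUsedByActiveUsers -= amount;
        }
    }

    public long getTotalUsedByActiveUsers() {
        return totalUsedByActiveUsers;
    }

    public static void main(String[] args) {
        ActiveUsageSketch s = new ActiveUsageSketch();
        s.activate("user1");
        s.activate("user2");
        s.allocate("user1", 10);
        s.allocate("user2", 5);
        System.out.println(s.getTotalUsedByActiveUsers()); // 15
        s.deactivate("user1"); // user1 has no more outstanding requests
        System.out.println(s.getTotalUsedByActiveUsers()); // 5
    }
}
```

With this bookkeeping, containers still held by an inactive user no longer 
inflate the total that is divided among the active users.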

{noformat}
    // User limit resource is determined by:
    // max(resourceUsedForActiveUsers / #activeUsers, queueCapacity *
    // user-limit-percentage%)
{noformat}


Now let's consider two cases: (1) usedResource < queueCap and (2) usedResource 
> queueCap.

1. {{resourceUsedForActiveUsers / #activeUsers}} will now be a much smaller 
value, since we count only the resources used by active users. In the old case, 
{{total_used/#activeUsers}} would definitely be larger. So, per the above 
equation, UL will be {{queueCapacity * userLimit%}} for a high MULP 
(minimum-user-limit-percent, something like 80~99%). Hence UL will be less than 
queueCapacity. (If MULP is lower, UL will also be lower.)
2. If {{usedResource > queueCap}}, then UL can exceed the queue capacity 
depending on two factors: either #active_users is small and the active users' 
resource usage is more than the queue capacity, or usedResource, which is more 
than queueCap, is multiplied by a high MULP value.
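
The two cases can be checked with some made-up numbers. The sketch below 
implements only the formula from the quoted comment, not the actual scheduler 
code, so it covers the first factor of case 2 but not the second one. The class 
and method names, the plain {{long}} resource units and all of the input values 
are assumptions for illustration.

```java
// Hypothetical sketch of the quoted formula only:
// max(resourceUsedForActiveUsers / #activeUsers,
//     queueCapacity * user-limit-percentage%)
class UserLimitSketch {
    static long userLimit(long usedByActiveUsers, int activeUsers,
                          long queueCapacity, int userLimitPercent) {
        // Guard against division by zero when no user is active.
        long perActiveUserShare = usedByActiveUsers / Math.max(activeUsers, 1);
        long percentShare = queueCapacity * userLimitPercent / 100;
        return Math.max(perActiveUserShare, percentShare);
    }

    public static void main(String[] args) {
        long queueCap = 100; // made-up units

        // Case 1: used < queueCap. Total used is 60, of which 30 belongs
        // to the 2 active users.
        System.out.println(userLimit(30, 2, queueCap, 80)); // 80 (new, high MULP)
        System.out.println(userLimit(60, 2, queueCap, 80)); // 80 (old, high MULP)
        System.out.println(userLimit(30, 2, queueCap, 10)); // 15 (new, low MULP)
        System.out.println(userLimit(60, 2, queueCap, 10)); // 30 (old, low MULP)

        // Case 2: the 2 active users together use 240, above queueCap,
        // so the first term wins and UL exceeds the queue capacity.
        System.out.println(userLimit(240, 2, queueCap, 80)); // 120
    }
}
```

With a high MULP the percentage term dominates in case 1, so counting only 
active users' usage changes nothing. The difference shows up only when MULP is 
low (15 versus 30 here).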

Altogether, the first part of the existing UL computation will matter only if 
#active-users is small or MULP is very low in the cluster. I think that is 
acceptable. Please share your thoughts.

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.0001.patch, 
> YARN-5889.0001.suggested.patchnotes, YARN-5889.0002.patch, 
> YARN-5889.0003.patch, YARN-5889.0004.patch, YARN-5889.0005.patch, 
> YARN-5889.v0.patch, YARN-5889.v1.patch, YARN-5889.v2.patch
>
>
> Currently user-limit is computed during every heartbeat allocation cycle with 
> a write lock. To improve performance, this ticket focuses on moving the 
> user-limit calculation out of the heartbeat allocation flow.


