[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522811#comment-16522811
 ] 

Eric Payne commented on YARN-4606:
----------------------------------

{quote}At the same time, this patch is less "strict" in terms of updates 
(specifically on when? ) compared to approaches discussed in our earlier 
patches.
{quote}
The number of active apps per user used to be calculated on every pass through 
the scheduler loop, which was a performance problem. To avoid this heavy 
calculation, YARN-5889 created the {{UsersManager}}. Instead of recalculating 
on every pass, YARN-5889 recalculates these values only when an event occurs 
that could affect the count, such as a new application, a completed 
application, a new container request, or a completed container. In the latest 
POC patch, {{activeUsersWithOnlyPendingApps}} is part of this flow, so it is 
updated whenever anything happens that could affect its value.
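To make the event-driven flow concrete, here is a minimal sketch of the idea. It is not the {{UsersManager}} code from YARN-5889 or the POC patch: the class name, the event methods, and {{getNumActiveUsersWithOnlyPendingApps}} are hypothetical, and it simplifies "active" to mean "has at least one activated app", ignoring container requests entirely.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; this is not the UsersManager code from YARN-5889.
final class ActiveUserCountsSketch {
  private static final int ACTIVE = 0;   // index: activated apps for a user
  private static final int PENDING = 1;  // index: pending (not yet activated) apps

  private final Map<String, int[]> appsByUser = new HashMap<>();

  // Cached counts, refreshed only by the event handlers below.
  private int activeUsers;
  private int activeUsersWithOnlyPendingApps;

  // Event: a new application is submitted and starts out pending.
  void onAppSubmitted(String user) {
    appsByUser.computeIfAbsent(user, u -> new int[2])[PENDING]++;
    recompute();
  }

  // Event: one of the user's pending applications is activated.
  void onAppActivated(String user) {
    int[] counts = appsByUser.get(user);
    if (counts != null && counts[PENDING] > 0) {
      counts[PENDING]--;
      counts[ACTIVE]++;
      recompute();
    }
  }

  // Event: one of the user's activated applications completes.
  void onAppCompleted(String user) {
    int[] counts = appsByUser.get(user);
    if (counts != null && counts[ACTIVE] > 0) {
      counts[ACTIVE]--;
      recompute();
    }
  }

  // The only place the counts are recalculated.
  private void recompute() {
    int active = 0;
    int pendingOnly = 0;
    for (int[] counts : appsByUser.values()) {
      if (counts[ACTIVE] > 0) {
        active++;
      } else if (counts[PENDING] > 0) {
        pendingOnly++;
      }
    }
    activeUsers = active;
    activeUsersWithOnlyPendingApps = pendingOnly;
  }

  // Cheap reads for the scheduler loop: no recalculation happens here.
  int getNumActiveUsers() { return activeUsers; }
  int getNumActiveUsersWithOnlyPendingApps() { return activeUsersWithOnlyPendingApps; }
}
```

The point of the sketch is only that both counters are refreshed in the same place, so neither can lag behind the other between events.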
{quote}Also, based on our earlier discussions, We need to depend on 
activeUsers.get() only in certain context and sum of activeUsers.get() and 
activeUsersWithOnlyPendingApps.get() in some other places. But POC patch always 
depends on later value. I didn't understand this part.
{quote}
I think you are referencing this comment from above:
{quote}My understanding is that user limit would use activeUsers and things 
like max AM limit per user, we'd use activeUsers + activeUsersOfPendingApps
{quote}
{{LeafQueue#activateApplications}} is the only caller of 
{{UsersManager#getNumActiveUsers}}, which it uses to calculate the 
user-specific AM limit, so it's the one that needs both {{activeUsers}} + 
{{activeUsersWithOnlyPendingApps}}.
{{UsersManager#computeUserLimit}} uses only {{activeUsers}} to calculate the 
headroom and user limit, which is what we decided in the comment above. Is that 
your understanding of these comments?
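As a rough illustration of that split, the sketch below divides a resource evenly among users. This is an assumption made for brevity: the real computations also involve settings such as the minimum user limit percent and user limit factor, and the class, method names, and megabyte figures here are all hypothetical.

```java
// Hypothetical sketch; the even split stands in for the real limit formulas.
final class UserLimitSketch {

  // User limit and headroom: divide only among users with activated apps.
  static long userLimitMb(long queueResourceMb, int activeUsers) {
    return queueResourceMb / Math.max(1, activeUsers);
  }

  // Per-user AM limit: also count users whose apps are all still pending,
  // since those users are competing for AM resources too.
  static long userAmLimitMb(long queueAmResourceMb, int activeUsers,
      int activeUsersWithOnlyPendingApps) {
    int users = activeUsers + activeUsersWithOnlyPendingApps;
    return queueAmResourceMb / Math.max(1, users);
  }

  public static void main(String[] args) {
    // Scenario from the issue description: user1/user2 have active apps,
    // user3/user4 have only pending apps. Queue sizes are made-up numbers.
    long queueMb = 102400;
    long queueAmMb = 10240;
    System.out.println("user limit, counting 4 users: " + userLimitMb(queueMb, 4));
    System.out.println("user limit, counting 2 users: " + userLimitMb(queueMb, 2));
    System.out.println("per-user AM limit: " + userAmLimitMb(queueAmMb, 2, 2));
  }
}
```

With these made-up numbers, counting the two pending-only users in the user limit would halve what user1 and user2 can each allocate, which is the starvation described in this issue.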

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4606
>                 URL: https://issues.apache.org/jira/browse/YARN-4606
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Karam Singh
>            Assignee: Manikandan R
>            Priority: Critical
>         Attachments: YARN-4606.001.patch, YARN-4606.002.patch, 
> YARN-4606.003.patch, YARN-4606.004.patch, YARN-4606.1.poc.patch, 
> YARN-4606.POC.2.patch, YARN-4606.POC.3.patch, YARN-4606.POC.patch
>
>
> Currently, if all applications belong to same user in LeafQueue are pending 
> (caused by max-am-percent, etc.), ActiveUsersManager still considers the user 
> is an active user. This could lead to starvation of active applications, for 
> example:
> - App1(belongs to user1)/app2(belongs to user2) are active, app3(belongs to 
> user3)/app4(belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, there're only two users (user1/user2) are able to allocate new 
> resources. So computed user-limit-resource could be lower than expected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
