[ https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522811#comment-16522811 ]
Eric Payne commented on YARN-4606:
----------------------------------

{quote}At the same time, this patch is less "strict" in terms of updates (specifically on when?) compared to approaches discussed in our earlier patches.{quote}

The number of active apps per user used to be calculated every time through the scheduler loop, which was a performance problem. To avoid this heavy calculation, YARN-5889 created the {{UsersManager}}. Instead of recalculating on every pass through the loop, YARN-5889 recalculates these values only when an event occurs that could affect the count, such as a new application, a completed application, a new container request, or a completed container. In the latest POC patch, {{activeUsersWithOnlyPendingApps}} is part of this flow, so it is updated whenever anything happens that could affect its value.

{quote}Also, based on our earlier discussions, We need to depend on activeUsers.get() only in certain context and sum of activeUsers.get() and activeUsersWithOnlyPendingApps.get() in some other places. But POC patch always depends on later value. I didn't understand this part.{quote}

I think you are referencing this comment from above:

{quote}My understanding is that user limit would use activeUsers and things like max AM limit per user, we'd use activeUsers + activeUsersOfPendingApps{quote}

{{LeafQueue#activateApplications}} is the only caller of {{UsersManager#getNumActiveUsers}}, which it uses to calculate the user-specific AM limit, so it is the one that needs {{activeUsers}} + {{activeUsersWithOnlyPendingApps}}. {{UsersManager#computeUserLimit}} uses only {{activeUsers}} to calculate the headroom and user limit, which is what we decided in the comment above. Is that your understanding of these comments?
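To make the distinction concrete, here is a minimal sketch of the idea, not the actual {{UsersManager}} code. The class {{ActiveUserCounts}} and all of its method names are hypothetical, and "active" is simplified to mean "has at least one activated application". The two counters are recomputed only when an application event arrives, and the two consumers divide by different values:

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical, simplified model of event-driven user counting.
 * This is NOT the real UsersManager; names and semantics are assumptions.
 */
class ActiveUserCounts {
  // Per-user counts: index 0 = activated apps, index 1 = pending apps.
  private final Map<String, int[]> appsByUser = new HashMap<>();
  private int usersWithActivatedApps;
  private int usersWithOnlyPendingApps;

  /** Event: a new application is submitted and starts out pending. */
  void submitApplication(String user) {
    appsByUser.computeIfAbsent(user, u -> new int[2])[1]++;
    recount();
  }

  /** Event: one of the user's pending applications is activated. */
  void activateApplication(String user) {
    int[] counts = appsByUser.get(user);
    if (counts == null || counts[1] == 0) {
      throw new IllegalStateException("no pending app for " + user);
    }
    counts[1]--;
    counts[0]++;
    recount();
  }

  /** Event: one of the user's activated applications completes. */
  void finishApplication(String user) {
    int[] counts = appsByUser.get(user);
    if (counts == null || counts[0] == 0) {
      throw new IllegalStateException("no activated app for " + user);
    }
    counts[0]--;
    if (counts[0] == 0 && counts[1] == 0) {
      appsByUser.remove(user);
    }
    recount();
  }

  // Runs only from the event methods above, never from the scheduler loop.
  private void recount() {
    int activated = 0;
    int pendingOnly = 0;
    for (int[] counts : appsByUser.values()) {
      if (counts[0] > 0) {
        activated++;
      } else if (counts[1] > 0) {
        pendingOnly++;
      }
    }
    usersWithActivatedApps = activated;
    usersWithOnlyPendingApps = pendingOnly;
  }

  int usersWithActivatedApps() { return usersWithActivatedApps; }

  int usersWithOnlyPendingApps() { return usersWithOnlyPendingApps; }

  /** Divisor for user limit and headroom: activated users only. */
  int userLimitDivisor() {
    return Math.max(1, usersWithActivatedApps);
  }

  /** Divisor for the per-user AM limit: activated plus pending-only users. */
  int amLimitDivisor() {
    return Math.max(1, usersWithActivatedApps + usersWithOnlyPendingApps);
  }
}
```

With the scenario from the issue description (user1 and user2 have activated apps, user3 and user4 have only pending apps), this model divides the user limit by 2 and the per-user AM limit by 4.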
> CapacityScheduler: applications could get starved because computation of #activeUsers considers pending apps
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4606
>                 URL: https://issues.apache.org/jira/browse/YARN-4606
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Karam Singh
>            Assignee: Manikandan R
>            Priority: Critical
>         Attachments: YARN-4606.001.patch, YARN-4606.002.patch, YARN-4606.003.patch, YARN-4606.004.patch, YARN-4606.1.poc.patch, YARN-4606.POC.2.patch, YARN-4606.POC.3.patch, YARN-4606.POC.patch
>
>
> Currently, if all applications belong to same user in LeafQueue are pending (caused by max-am-percent, etc.), ActiveUsersManager still considers the user is an active user. This could lead to starvation of active applications, for example:
> - App1(belongs to user1)/app2(belongs to user2) are active, app3(belongs to user3)/app4(belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, there're only two users (user1/user2) are able to allocate new resources. So computed user-limit-resource could be lower than expected.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)