[ https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15709832#comment-15709832 ]
Eric Payne commented on YARN-5889:
----------------------------------

{quote}
bq. It seems like this should be longer than 1 ms. It could be possible that containers are released and created very fast in a big cluster.
{quote}
[~sunilg], I now realize that with this design, the {{preComputedUserLimit}} cache will go out of date very quickly unless the {{ComputeUserLimitAsyncThread}} thread runs in a very tight loop. Even then, {{preComputedUserLimit}} could still be out of date at the moment the scheduler needs to fill a large request.

On the other hand, with this design the user limit resource is calculated far more often than it is today. Currently, it is calculated only during the scheduler loop, and only for apps that are asking for resources. This design, however, calculates it twice every millisecond (once with partition exclusivity and once without). If a cluster is not full and runs mostly apps with long-running containers, the limit is recalculated thousands of times when it doesn't need to be.

Instead, could we add a boolean flag to {{UserToPartitionRecord}}? This flag would be set whenever a container is allocated or released for an app from that user. Then, whenever {{getComputedUserLimit}} is called, if the flag is set, it calls {{computeUserLimit}} and clears the flag.

What do you think?

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch, YARN-5889.v2.patch
>
>
> Currently user-limit is computed during every heartbeat allocation cycle with a write lock. To improve performance, this ticket focuses on moving user-limit calculation out of the heartbeat allocation flow.
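
To make the flag proposal in the comment above concrete, here is a minimal Java sketch of the idea. It is not taken from any YARN-5889 patch: the class {{LazyUserLimit}}, the method {{markDirty}}, and the {{LongSupplier}} standing in for {{computeUserLimit}} are hypothetical names chosen for illustration, and the limit is reduced to a single {{long}} rather than a real resource object.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongSupplier;

// Hypothetical stand-in for a per-user, per-partition record that caches
// the user limit and recomputes it only after an allocation or release.
final class LazyUserLimit {
    // Starts dirty so the first read always computes a value.
    private final AtomicBoolean dirty = new AtomicBoolean(true);
    // Assumed stand-in for the real (expensive) user-limit computation.
    private final LongSupplier computeUserLimit;
    private volatile long cachedLimit;

    LazyUserLimit(LongSupplier computeUserLimit) {
        this.computeUserLimit = computeUserLimit;
    }

    // Would be called whenever a container is allocated or released
    // for an app belonging to this user.
    void markDirty() {
        dirty.set(true);
    }

    // Recomputes only if something changed since the last read.
    // The flag is cleared before computing, so a markDirty() that races
    // with the computation is not lost; it triggers one more recompute.
    // A concurrent reader may still see the old value while a recompute
    // is in flight; real code would need to decide whether that is acceptable.
    long getComputedUserLimit() {
        if (dirty.compareAndSet(true, false)) {
            cachedLimit = computeUserLimit.getAsLong();
        }
        return cachedLimit;
    }
}
```

With this shape, an idle cluster full of long-running containers pays for the computation once and then serves the cached value, while a busy cluster recomputes at most once per read following a change. How this would interact with the queue's existing read/write locks is left open here.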
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org