[ https://issues.apache.org/jira/browse/YARN-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16083045#comment-16083045 ]
Nathan Roberts commented on YARN-6775:
--------------------------------------

Thanks [~leftnoteasy] for the review.

bq. 1) CachedUserLimit.canAssign is not necessary as we can set CachedUserLimit.reservation to UNBOUNDED initially.

I think it is necessary because we need to keep track of the largest reservation for which canAssignToUser() returns false. Anything smaller than the largest we've already calculated we know won't work so we can avoid the call. Therefore we can't start at UNBOUNDED and work our way down.

bq. 2) Directly set cul.reservation = rsrv could be problematic under async scheduling logic since reserved resource of app could be updated while allocating.

Is this something we need to address? Couldn't this be mutating between the various lookups that are already occurring in today's assignContainers()?

bq. 3) Do you think is it necessary to add another Resource to track queue's verified_minimum_violated_reserved_resource similar to user limit?

My thought was we'll quickly run across an app that has no reservation and be able to skip assignToQueue() check from that point forward. The check against Resources.none() should be very fast compared to a resource comparison. I'm open to keeping track of the minimum though if you feel there would be sufficient gain.

Naming suggestions look good. I'll clean those up (although minimum is actually a maximum).

> CapacityScheduler: Improvements to assignContainers()
> -----------------------------------------------------
>
>                 Key: YARN-6775
>                 URL: https://issues.apache.org/jira/browse/YARN-6775
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacityscheduler
>    Affects Versions: 2.8.1, 3.0.0-alpha3
>            Reporter: Nathan Roberts
>            Assignee: Nathan Roberts
>         Attachments: YARN-6775.001.patch
>
>
> There are several things in assignContainers() that are done multiple times
> even though the result cannot change (canAssignToUser, canAssignToQueue). Add
> some local caching to take advantage of this fact.
> Will post patch shortly. Patch includes a simple throughput test that
> demonstrates when we have users at their user-limit, the number of
> NodeUpdateSchedulerEvents we can process can be improved from 13K/sec to
> 50K/sec.
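
For readers following the reply to point 1, here is a minimal sketch of why both a flag and a "largest failed reservation" value are tracked. It is not the code from YARN-6775.001.patch. The class name UserLimitCacheSketch, the use of a plain long instead of Resource, and the LongPredicate standing in for canAssignToUser() are all assumptions made for illustration. The sketch also assumes the check is monotone: if it fails for some reservation, it fails for every smaller one.

```java
import java.util.function.LongPredicate;

/**
 * Hypothetical, simplified stand-in for a per-user cache entry.
 * Reservations are plain longs here (an assumption); the real scheduler
 * works with Resource objects.
 */
class UserLimitCacheSketch {
    // Becomes false once the expensive check has failed at least once.
    private boolean canAssign = true;
    // Largest reservation for which the expensive check is known to fail.
    private long largestFailedReservation = 0;
    // Counts how often the expensive check actually ran.
    private int expensiveCalls = 0;

    /**
     * Returns whether an app with the given reservation may be assigned.
     * expensiveCheck is a placeholder for a canAssignToUser()-style call
     * and is assumed to be monotone in the reservation size.
     */
    boolean check(long reservation, LongPredicate expensiveCheck) {
        // A reservation no larger than a known failure must fail too,
        // so the expensive call can be skipped.
        if (!canAssign && reservation <= largestFailedReservation) {
            return false;
        }
        expensiveCalls++;
        boolean ok = expensiveCheck.test(reservation);
        if (!ok) {
            canAssign = false;
            // Work upward: remember the largest failure seen so far.
            largestFailedReservation =
                Math.max(largestFailedReservation, reservation);
        }
        return ok;
    }

    int expensiveCalls() {
        return expensiveCalls;
    }
}
```

Starting the cached value at an unbounded maximum would make the first comparison skip every app, including ones whose reservation is large enough to pass. That is why the sketch starts with no known failure and raises the bound only when a check actually fails.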