[ https://issues.apache.org/jira/browse/YARN-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706348#comment-13706348 ]
Djellel Eddine Difallah commented on YARN-897:
----------------------------------------------

Omkar, thanks for the feedback.

{quote}any reason for this even after this patch? if we don't see any other issues then why not just use childQueues.remove instead of iterating?{quote}
The tree is already out of order because of the new usedCapacity, so remove() won't work: its comparator-based lookup can no longer find the queue. We have to iterate and add() to fix the order.

{quote}reinsertQueue could be marked synchronized? thoughts? But yeah.. without that too it is thread safe as we are locking it at CapacityScheduler.nodeUpdate(). but still it is better to mark it.{quote}
OK, it sounds reasonable to mark it synchronized.

{quote}LOG.info("Re-sorting queues since queue got completed: " + childQueue.getQueuePath() +
nit. line > 80{quote}
Sure.

{quote}at present we send the container completed event to leaf queue and then keep propagating it till root. why not send the event to root, grab the locks from root->leaf and update it? any thoughts?{quote}
Because the released container is linked to a leaf queue, we have to walk bottom-up to figure out which parent to propagate to. The assignment phase, however, works the way you described.

> CapacityScheduler wrongly sorted queues
> ---------------------------------------
>
>                 Key: YARN-897
>                 URL: https://issues.apache.org/jira/browse/YARN-897
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Djellel Eddine Difallah
>         Attachments: TestBugParentQueue.java, YARN-897-1.patch
>
>
> The childQueues of a ParentQueue are stored in a TreeSet where UsedCapacity
> defines the sort order. This ensures that the queue with the least
> UsedCapacity receives resources next. On container assignment we correctly
> update the order, but we fail to do so on container completion. This corrupts
> the TreeSet structure, and under-capacity queues might starve for resources.
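
For illustration, the TreeSet behaviour described in this issue can be reproduced outside of YARN. The following is only a minimal sketch, not the code from YARN-897-1.patch: DemoQueue, ResortDemo, BY_USED_CAPACITY and reinsert() are hypothetical names chosen for this example, and the capacity values are made up. It shows that once the sort key of an element changes while it sits in the set, remove() may fail to find it, whereas removing it through the iterator and calling add() again restores the order.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.TreeSet;

// Hypothetical stand-in for a child queue; not the real YARN queue class.
final class DemoQueue {
    final String name;
    float usedCapacity; // mutable sort key, as in the issue description

    DemoQueue(String name, float usedCapacity) {
        this.name = name;
        this.usedCapacity = usedCapacity;
    }
}

final class ResortDemo {
    // Least used capacity first; ties broken by name so distinct queues never compare equal.
    static final Comparator<DemoQueue> BY_USED_CAPACITY = (x, y) -> {
        int c = Float.compare(x.usedCapacity, y.usedCapacity);
        return c != 0 ? c : x.name.compareTo(y.name);
    };

    // Remove the queue structurally through the iterator (no comparator lookup),
    // then add() it back so it lands where its current usedCapacity belongs.
    static boolean reinsert(TreeSet<DemoQueue> queues, DemoQueue changed) {
        for (Iterator<DemoQueue> it = queues.iterator(); it.hasNext();) {
            if (it.next() == changed) {
                it.remove();
                return queues.add(changed);
            }
        }
        return false; // the queue was not in the set
    }

    public static void main(String[] args) {
        TreeSet<DemoQueue> queues = new TreeSet<>(BY_USED_CAPACITY);
        DemoQueue a = new DemoQueue("a", 0.1f);
        DemoQueue b = new DemoQueue("b", 0.5f);
        DemoQueue c = new DemoQueue("c", 0.9f);
        queues.add(a);
        queues.add(b);
        queues.add(c);

        // A container completes in c: its key changes while it is still in the set.
        c.usedCapacity = 0.05f;

        System.out.println("remove(c) succeeded: " + queues.remove(c));
        System.out.println("first before re-sort: " + queues.first().name);

        reinsert(queues, c);
        System.out.println("first after re-sort: " + queues.first().name);
    }
}
```

In this sketch, queue c ends up with the least used capacity but is still stored at its old position until reinsert() runs, which corresponds to the starvation scenario in the description. A real scheduler would also need the locking discussed above; that is left out here.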