[ https://issues.apache.org/jira/browse/YARN-8379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511469#comment-16511469 ]
Eric Payne commented on YARN-8379: ---------------------------------- [~Zian Chen], thank you for the work on this issue and for the new patch. I am still working through the patch, but I have the following concerns so far. - Cross-queue preemption does not work when this patch is applied. My test environment simulates a 6-node pseudo YARN cluster. I use the same queue configs with and without this patch. With this patch, no cross-queue preemption happens at all. - Please address the failed unit tests, failed findbugs, and failed shadedclient warnings. I think they are related to this patch. - {{ProportionalCapacityPreemptionPolicy#updateConfigIfNeeded}} This code adds {{FifoCandidatesSelector}} to candidatesSelectionPolicies twice, which will cause it to be called twice since candidatesSelectionPolicies is an {{ArrayList}}. If I understand correctly, the reason for this is so that the first time {{FifoCandidatesSelector#selectCandidates}} is called, it will preempt up to the requesting queue's guarantee, and the second time it will not preempt until the requsting queue is above its guarantee AND the {{allowQueuesBalanceAfterAllQueuesSatisfied}} variable is set. Why can't {{FifoCandidatesSelector}} just be added once and do all the processing it needs to based on whether or not {{allowQueuesBalanceAfterAllQueuesSatisfied}} is set? - {{FifoCandidatesSelector#selectCandidates}} If the skip logic is necessary (depending on answer to my first question), I think the return needs to be moved up above the previous curly brace (}). The way it is now, it returns whether containers is empty or not empty. {code:title=FifoCandidatesSelector#selectCandidates} for (Set<RMContainer> containers : selectedCandidates.values()) { if (!containers.isEmpty()) { if (LOG.isDebugEnabled()) { LOG.debug(...); } } return selectedCandidates; } {code} - {{FifoCandidatesSelector#selectCandidates}} For the debug log statement, I would not use the word "Fairness" because the word "Fair" has a lot of different meanings when it comes to schedulers and policies. To make it more grammatically correct (and to remove the confusion surrounding "fairness"), I would say, "The preemption-to-balance feature is on. Some containers were chosen for preemption by previous selectors. Skipping container selection for FifoCandidatesSelector"); - General. For the same reason as above, I think we can just remove the workd "Fair" or "Fairness" from all method and variable names and the meaning would remain. - {{AbstractPreemptableResourceCalculator#getIdealPctOfGuaranteed}} bq. Should we allow queues continue grow after queues satisfied? This could be misinterpreted to mean that the capacity scheduler doesn't allow a queue to grow over its capacity guarantee. It may be better to modify this to make it clear that we are talking about preemption: "Should resources be preempted from an over-served queue when the requesting queues are all at or over their guarantees?" > Add an option to allow Capacity Scheduler preemption to balance satisfied > queues > -------------------------------------------------------------------------------- > > Key: YARN-8379 > URL: https://issues.apache.org/jira/browse/YARN-8379 > Project: Hadoop YARN > Issue Type: Bug > Reporter: Wangda Tan > Assignee: Zian Chen > Priority: Major > Attachments: YARN-8379.001.patch, YARN-8379.002.patch, > YARN-8379.003.patch > > > Existing capacity scheduler only supports preemption for an underutilized > queue to reach its guaranteed resource. In addition to that, there’s an > requirement to get better balance between queues when all of them reach > guaranteed resource but with different fairness resource. > An example is, 3 queues with capacity, queue_a = 30%, queue_b = 30%, queue_c > = 40%. At time T. queue_a is using 30%, queue_b is using 70%. Existing > scheduler preemption won't happen. But this is unfair to queue_a since > queue_a has the same guaranteed resources. > Before YARN-5864, capacity scheduler do additional preemption to balance > queues. We changed the logic since it could preempt too many containers > between queues when all queues are satisfied. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org