[ https://issues.apache.org/jira/browse/YARN-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268670#comment-17268670 ]
zhuqi commented on YARN-10587:
------------------------------

cc [~leftnoteasy] [~bteke]

When I dug into the testAutoCreateLeafQueueCreation failure around this comment:
{code:java}
// TODO: Wangda: I think this test case is not correct, Sunil could help look
// into details.
{code}
I found the root cause: when absolute mode is enabled for an auto-created leaf queue, we eventually update the capacity-related values for the UI in:
{code:java}
private void deriveCapacityFromAbsoluteConfigurations(String label,
    Resource clusterResource, ResourceCalculator rc) {

  /*
   * In case when queues are configured with absolute resources, it is better
   * to update capacity/max-capacity etc w.r.t absolute resource as well. In
   * case of computation, these values wont be used any more. However for
   * metrics and UI, its better these values are pre-computed here itself.
   */

  // 1. Update capacity as a float based on parent's minResource
  float f = rc.divide(clusterResource,
      queueResourceQuotas.getEffectiveMinResource(label),
      parent.getQueueResourceQuotas().getEffectiveMinResource(label));
  queueCapacities.setCapacity(label, Float.isInfinite(f) ? 0 : f);

  // 2. Update max-capacity as a float based on parent's maxResource
  f = rc.divide(clusterResource,
      queueResourceQuotas.getEffectiveMaxResource(label),
      parent.getQueueResourceQuotas().getEffectiveMaxResource(label));
  queueCapacities.setMaximumCapacity(label, Float.isInfinite(f) ? 0 : f);

  // 3. Update absolute capacity as a float based on parent's minResource and
  // cluster resource.
  queueCapacities.setAbsoluteCapacity(label,
      queueCapacities.getCapacity(label) * parent.getQueueCapacities()
          .getAbsoluteCapacity(label));

  // 4. Update absolute max-capacity as a float based on parent's maxResource
  // and cluster resource.
  queueCapacities.setAbsoluteMaximumCapacity(label,
      queueCapacities.getMaximumCapacity(label) * parent.getQueueCapacities()
          .getAbsoluteMaximumCapacity(label));

  // Re-visit max applications for a queue based on absolute capacity if
  // needed.
  if (this instanceof LeafQueue) {
    LeafQueue leafQueue = (LeafQueue) this;
    CapacitySchedulerConfiguration conf = csContext.getConfiguration();
    int maxApplications = conf.getMaximumApplicationsPerQueue(queuePath);
    if (maxApplications < 0) {
      int maxGlobalPerQueueApps = conf.getGlobalMaximumApplicationsPerQueue();
      if (maxGlobalPerQueueApps > 0) {
        maxApplications = (int) (maxGlobalPerQueueApps * queueCapacities
            .getAbsoluteCapacity(label));
      } else{
        maxApplications = (int) (conf.getMaximumSystemApplications()
            * queueCapacities.getAbsoluteCapacity(label));
      }
    }
    leafQueue.setMaxApplications(maxApplications);

    int maxApplicationsPerUser = Math.min(maxApplications,
        (int) (maxApplications
            * (leafQueue.getUsersManager().getUserLimit() / 100.0f)
            * leafQueue.getUsersManager().getUserLimitFactor()));
    leafQueue.setMaxApplicationsPerUser(maxApplicationsPerUser);
    LOG.info("LeafQueue:" + leafQueue.getQueuePath() + ", maxApplications="
        + maxApplications + ", maxApplicationsPerUser="
        + maxApplicationsPerUser + ", Abs Cap:" + queueCapacities
        .getAbsoluteCapacity(label) + ", Cap: " + queueCapacities
        .getCapacity(label) + ", MaxCap : " + queueCapacities
        .getMaximumCapacity(label));
  }
}
{code}
But queueResourceQuotas.getEffectiveMinResource is not correct when the absolute-mode auto-created leaf queue we add exceeds the available capacity:
{code:java}
@Override
public AutoCreatedLeafQueueConfig getInitialLeafQueueConfiguration(
    AbstractAutoCreatedLeafQueue leafQueue)
    throws SchedulerDynamicEditException {
  ...
  float availableCapacity = managedParentQueue.getQueueCapacities().
      getAbsoluteCapacity(nodeLabel) - parentQueueState.
      getAbsoluteActivatedChildQueueCapacity(nodeLabel) + EPSILON;

  if (availableCapacity >= leafQueueTemplateCapacities
      .getAbsoluteCapacity(nodeLabel)) {
    ...
  } else{
    updateToZeroCapacity(capacities, nodeLabel, leafQueue);
  }
  ...
}
{code}
In updateToZeroCapacity, we should change the following for absolute-mode auto-created leaf queues:
{code:java}
private void updateToZeroCapacity(QueueCapacities capacities,
    String nodeLabel, LeafQueue leafQueue) {
  capacities.setCapacity(nodeLabel, 0.0f);
  capacities.setMaximumCapacity(nodeLabel,
      leafQueueTemplateCapacities.getMaximumCapacity(nodeLabel));
  leafQueue.getQueueResourceQuotas().
      setConfiguredMinResource(nodeLabel, Resource.newInstance(0,0));
}
{code}
Then the ratio in calculateEffectiveResourcesAndCapacity will be correct, because the configured min resource of the absolute-mode auto-created leaf queue is no longer added to the total. The final update of the capacity-related values for the UI will then be correct as well.
{code:java}
private void calculateEffectiveResourcesAndCapacity(String label,
    Resource clusterResource) {
  Resource configuredMinResources = Resource.newInstance(0L, 0);
  for (CSQueue childQueue : getChildQueues()) {
    Resources.addTo(configuredMinResources,
        childQueue.getQueueResourceQuotas().getConfiguredMinResource(label));
  }
}
{code}
Thanks.

> Fix AutoCreateLeafQueueCreation cap related caculation when in absolute mode.
> -----------------------------------------------------------------------------
>
>                 Key: YARN-10587
>                 URL: https://issues.apache.org/jira/browse/YARN-10587
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: zhuqi
>            Assignee: zhuqi
>            Priority: Major
>
> When YARN-10504 was introduced, the logic related to auto-created leaf queues changed.
> The testAutoCreateLeafQueueCreation test now fails; we should fix the error.
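
To make the effect described in the comment above concrete, here is a minimal, dependency-free sketch. It is not Hadoop code: the class `AbsoluteModeSketch`, its methods, the queue names, and the memory figures are all hypothetical, and plain `long` megabyte values stand in for `Resource` objects. It only illustrates the idea that a leaf queue which did not fit should contribute zero to the sum of its siblings' configured minimum resources, and that a ratio guarded with `Float.isInfinite` then comes out as 0 for that leaf.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in YARN.
final class AbsoluteModeSketch {

  // Sums the children's configured minimum memory (MB), loosely mirroring
  // the loop in calculateEffectiveResourcesAndCapacity quoted above.
  static long sumConfiguredMin(Map<String, Long> configuredMinMb) {
    long total = 0L;
    for (long mb : configuredMinMb.values()) {
      total += mb;
    }
    return total;
  }

  // Child capacity as a fraction of the parent's minimum, using the same
  // "infinite becomes 0" guard as deriveCapacityFromAbsoluteConfigurations.
  static float capacityRatio(long childMinMb, long parentMinMb) {
    float f = (float) childMinMb / (float) parentMinMb;
    return Float.isInfinite(f) ? 0f : f;
  }

  public static void main(String[] args) {
    long parentMinMb = 16384L; // assumed parent guarantee

    Map<String, Long> children = new LinkedHashMap<>();
    children.put("leaf1", 8192L);
    children.put("leaf2", 8192L);
    children.put("leaf3", 8192L); // assumed not to fit: parent is already full

    // Without the fix, leaf3 still counts toward the children's total.
    System.out.println("sum before = " + sumConfiguredMin(children));

    // With the proposed change, the leaf that did not fit is zeroed out.
    children.put("leaf3", 0L);
    System.out.println("sum after  = " + sumConfiguredMin(children));
    System.out.println("leaf1 ratio = " + capacityRatio(children.get("leaf1"), parentMinMb));
    System.out.println("leaf3 ratio = " + capacityRatio(children.get("leaf3"), parentMinMb));
  }
}
```

In this toy setup the children's total only matches the parent's assumed 16384 MB guarantee once the third leaf's configured minimum is set to zero, which is the situation the proposed change to updateToZeroCapacity aims for.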