[jira] [Updated] (YARN-11641) Can't update a queue hierarchy in absolute mode when the configured capacities are zero
[ https://issues.apache.org/jira/browse/YARN-11641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-11641:
------------------------------
    Target Version/s: 3.5.0

> Can't update a queue hierarchy in absolute mode when the configured capacities are zero
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-11641
>                 URL: https://issues.apache.org/jira/browse/YARN-11641
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.4.0
>            Reporter: Tamas Domok
>            Assignee: Tamas Domok
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: hierarchy.png
>
>
> h2. Error symptoms
> It is not possible to modify a queue hierarchy in absolute mode when the parent or every child queue of the parent has 0 min resource configured.
> {noformat}
> 2024-01-05 15:38:59,016 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager: Initialized queue: root.a.c
> 2024-01-05 15:38:59,016 ERROR org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices: Exception thrown when modifying configuration.
> java.io.IOException: Failed to re-init queues : Parent=root.a: When absolute minResource is used, we must make sure both parent and child all use absolute minResource
> {noformat}
> h2. Reproduction
> capacity-scheduler.xml
> {code:xml}
> <configuration>
>   <property>
>     <name>yarn.scheduler.capacity.root.queues</name>
>     <value>default,a</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.capacity</name>
>     <value>[memory=40960, vcores=16]</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.default.capacity</name>
>     <value>[memory=1024, vcores=1]</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
>     <value>[memory=1024, vcores=1]</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.a.capacity</name>
>     <value>[memory=0, vcores=0]</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.a.maximum-capacity</name>
>     <value>[memory=39936, vcores=15]</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.a.queues</name>
>     <value>b,c</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.a.b.capacity</name>
>     <value>[memory=0, vcores=0]</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.a.b.maximum-capacity</name>
>     <value>[memory=39936, vcores=15]</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.a.c.capacity</name>
>     <value>[memory=0, vcores=0]</value>
>   </property>
>   <property>
>     <name>yarn.scheduler.capacity.root.a.c.maximum-capacity</name>
>     <value>[memory=39936, vcores=15]</value>
>   </property>
> </configuration>
> {code}
> !hierarchy.png!
> updatequeue.xml
> {code:xml}
> <sched-conf>
>   <update-queue>
>     <queue-name>root.a</queue-name>
>     <params>
>       <entry>
>         <key>capacity</key>
>         <value>[memory=1024,vcores=1]</value>
>       </entry>
>       <entry>
>         <key>maximum-capacity</key>
>         <value>[memory=39936,vcores=15]</value>
>       </entry>
>     </params>
>   </update-queue>
> </sched-conf>
> {code}
> {code}
> $ curl -X PUT -H 'Content-Type: application/xml' -d @updatequeue.xml http://localhost:8088/ws/v1/cluster/scheduler-conf\?user.name\=yarn
> Failed to re-init queues : Parent=root.a: When absolute minResource is used, we must make sure both parent and child all use absolute minResource
> {code}
> h2. Root cause
> setChildQueues is called during reinit, where:
> {code:java}
> void setChildQueues(Collection<CSQueue> childQueues) throws IOException {
>   writeLock.lock();
>   try {
>     boolean isLegacyQueueMode = queueContext.getConfiguration().isLegacyQueueMode();
>     if (isLegacyQueueMode) {
>       QueueCapacityType childrenCapacityType =
>           getCapacityConfigurationTypeForQueues(childQueues);
>       QueueCapacityType parentCapacityType =
>           getCapacityConfigurationTypeForQueues(ImmutableList.of(this));
>       if (childrenCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE
>           || parentCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE) {
>         // We don't allow any mixed absolute + {weight, percentage} between
>         // children and parent
>         if (childrenCapacityType != parentCapacityType && !this.getQueuePath()
>             .equals(CapacitySchedulerConfiguration.ROOT)) {
>           throw new IOException("Parent=" + this.getQueuePath()
>               + ": When absolute minResource is used, we must make sure both "
>               + "parent and child all use absolute minResource");
>         }
> {code}
> The parent or childrenCapacityType will be considered as PERCENTAGE, because getCapacityConfigurationTypeForQueues fails to detect the absolute mode, here:
> {code:java}
> if (!queue.getQueueResourceQuotas().getConfiguredMinResource(nodeLabel)
>     .equals(Resources.none())) {
>   absoluteMinResSet = true;
> {code}
> (It only happens in legacy queue mode.)
> h2. Possible fixes
> Possible fix in AbstractParentQueue.getCapacityConfigurationTypeForQueues using the capacityVector:
> {code:java}
> for (CSQueue queue : queues) {
>   for (String nodeLabel : queueCapacities.getExistingNodeLabels()) {
>     Set<QueueCapacityVector.ResourceUnitCapacityType> definedCapacityTypes =
>
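The root cause quoted above can be illustrated with a small self-contained sketch. It does not use the real YARN classes: `MinResource`, `ZeroMinResourceSketch`, `detect`, and `wouldReject` are hypothetical stand-ins, and the fallback to PERCENTAGE is assumed from the report's statement that the type "will be considered as PERCENTAGE". The sketch only shows why an all-zero absolute capacity is indistinguishable from "no absolute capacity configured" when the check compares against a none/zero resource.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a configured min resource; not the real YARN Resource class. */
final class MinResource {
  /** Plays the role of Resources.none(): an all-zero resource. */
  static final MinResource NONE = new MinResource(0, 0);

  final long memoryMb;
  final int vcores;

  MinResource(long memoryMb, int vcores) {
    this.memoryMb = memoryMb;
    this.vcores = vcores;
  }

  boolean sameAs(MinResource other) {
    return memoryMb == other.memoryMb && vcores == other.vcores;
  }
}

final class ZeroMinResourceSketch {
  enum CapacityType { PERCENTAGE, ABSOLUTE_RESOURCE }

  /**
   * Mirrors the shape of the quoted check: absolute mode is inferred only when
   * some queue has a min resource different from "none". Queues configured as
   * [memory=0, vcores=0] therefore never set the absolute flag.
   */
  static CapacityType detect(List<MinResource> queues) {
    for (MinResource min : queues) {
      if (!min.sameAs(MinResource.NONE)) {
        return CapacityType.ABSOLUTE_RESOURCE;
      }
    }
    return CapacityType.PERCENTAGE; // assumed fallback, as described in the report
  }

  /** Models the parent/children consistency check for a non-root parent. */
  static boolean wouldReject(MinResource parent, List<MinResource> children) {
    CapacityType parentType = detect(Collections.singletonList(parent));
    CapacityType childrenType = detect(children);
    boolean anyAbsolute = parentType == CapacityType.ABSOLUTE_RESOURCE
        || childrenType == CapacityType.ABSOLUTE_RESOURCE;
    return anyAbsolute && parentType != childrenType;
  }

  public static void main(String[] args) {
    MinResource zero = new MinResource(0, 0);
    MinResource updatedParent = new MinResource(1024, 1);
    // root.a updated to [memory=1024,vcores=1]; root.a.b and root.a.c stay at zero.
    System.out.println(wouldReject(updatedParent, Arrays.asList(zero, zero)));
  }
}
```

In this model the updated parent is detected as absolute while its zero-capacity children are not, which is the mismatch that triggers the IOException in the report.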
[jira] [Updated] (YARN-11641) Can't update a queue hierarchy in absolute mode when the configured capacities are zero
[ https://issues.apache.org/jira/browse/YARN-11641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated YARN-11641:
----------------------------------
    Labels: pull-request-available  (was: )

> Can't update a queue hierarchy in absolute mode when the configured capacities are zero
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-11641
>                 URL: https://issues.apache.org/jira/browse/YARN-11641
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.4.0
>            Reporter: Tamas Domok
>            Assignee: Tamas Domok
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: hierarchy.png
[jira] [Updated] (YARN-11641) Can't update a queue hierarchy in absolute mode when the configured capacities are zero
[ https://issues.apache.org/jira/browse/YARN-11641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tamas Domok updated YARN-11641:
-------------------------------
    Attachment: hierarchy.png

> Can't update a queue hierarchy in absolute mode when the configured capacities are zero
> ---------------------------------------------------------------------------------------
>
>                 Key: YARN-11641
>                 URL: https://issues.apache.org/jira/browse/YARN-11641
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.4.0
>            Reporter: Tamas Domok
>            Assignee: Tamas Domok
>            Priority: Major
>         Attachments: hierarchy.png
[jira] [Updated] (YARN-11641) Can't update a queue hierarchy in absolute mode when the configured capacities are zero
[ https://issues.apache.org/jira/browse/YARN-11641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tamas Domok updated YARN-11641:
-------------------------------
    Description:
h2. Error symptoms

It is not possible to modify a queue hierarchy in absolute mode when the parent or every child queue of the parent has 0 min resource configured.
{noformat}
2024-01-05 15:38:59,016 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager: Initialized queue: root.a.c
2024-01-05 15:38:59,016 ERROR org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices: Exception thrown when modifying configuration.
java.io.IOException: Failed to re-init queues : Parent=root.a: When absolute minResource is used, we must make sure both parent and child all use absolute minResource
{noformat}

h2. Reproduction

capacity-scheduler.xml
{code:xml}
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,a</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.capacity</name>
    <value>[memory=40960, vcores=16]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>[memory=1024, vcores=1]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>[memory=1024, vcores=1]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.capacity</name>
    <value>[memory=0, vcores=0]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.maximum-capacity</name>
    <value>[memory=39936, vcores=15]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.queues</name>
    <value>b,c</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.b.capacity</name>
    <value>[memory=0, vcores=0]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.b.maximum-capacity</name>
    <value>[memory=39936, vcores=15]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.c.capacity</name>
    <value>[memory=0, vcores=0]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.c.maximum-capacity</name>
    <value>[memory=39936, vcores=15]</value>
  </property>
</configuration>
{code}
!hierarchy.png!

updatequeue.xml
{code:xml}
<sched-conf>
  <update-queue>
    <queue-name>root.a</queue-name>
    <params>
      <entry>
        <key>capacity</key>
        <value>[memory=1024,vcores=1]</value>
      </entry>
      <entry>
        <key>maximum-capacity</key>
        <value>[memory=39936,vcores=15]</value>
      </entry>
    </params>
  </update-queue>
</sched-conf>
{code}
{code}
$ curl -X PUT -H 'Content-Type: application/xml' -d @updatequeue.xml http://localhost:8088/ws/v1/cluster/scheduler-conf\?user.name\=yarn
Failed to re-init queues : Parent=root.a: When absolute minResource is used, we must make sure both parent and child all use absolute minResource
{code}

h2. Root cause

setChildQueues is called during reinit, where:
{code:java}
void setChildQueues(Collection<CSQueue> childQueues) throws IOException {
  writeLock.lock();
  try {
    boolean isLegacyQueueMode = queueContext.getConfiguration().isLegacyQueueMode();
    if (isLegacyQueueMode) {
      QueueCapacityType childrenCapacityType =
          getCapacityConfigurationTypeForQueues(childQueues);
      QueueCapacityType parentCapacityType =
          getCapacityConfigurationTypeForQueues(ImmutableList.of(this));
      if (childrenCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE
          || parentCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE) {
        // We don't allow any mixed absolute + {weight, percentage} between
        // children and parent
        if (childrenCapacityType != parentCapacityType && !this.getQueuePath()
            .equals(CapacitySchedulerConfiguration.ROOT)) {
          throw new IOException("Parent=" + this.getQueuePath()
              + ": When absolute minResource is used, we must make sure both "
              + "parent and child all use absolute minResource");
        }
{code}
The parent or childrenCapacityType will be considered as PERCENTAGE, because getCapacityConfigurationTypeForQueues fails to detect the absolute mode, here:
{code:java}
if (!queue.getQueueResourceQuotas().getConfiguredMinResource(nodeLabel)
    .equals(Resources.none())) {
  absoluteMinResSet = true;
{code}
(It only happens in legacy queue mode.)

h2. Possible fixes

Possible fix in AbstractParentQueue.getCapacityConfigurationTypeForQueues using the capacityVector:
{code:java}
for (CSQueue queue : queues) {
  for (String nodeLabel : queueCapacities.getExistingNodeLabels()) {
    Set<QueueCapacityVector.ResourceUnitCapacityType> definedCapacityTypes =
        queue.getConfiguredCapacityVector(nodeLabel).getDefinedCapacityTypes();
    if (definedCapacityTypes.size() == 1) {
      QueueCapacityVector.ResourceUnitCapacityType next = definedCapacityTypes.iterator().next();
      if (Objects.requireNonNull(next) == PERCENTAGE) {
        percentageIsSet = true;
        diagMsg.append("{Queue=").append(queue.getQueuePath()).append(", label=").append(nodeLabel)
            .append(" uses percentage mode}. ");
      } else if (next == QueueCapacityVector.ResourceUnitCapacityType.ABSOLUTE) {
        absoluteMinResSet = true;
        diagMsg.append("{Queue=").append(queue.getQueuePath()).append(", label=").append(nodeLabel)
            .append(" uses absolute mode}. ");
      } else if (next == QueueCapacityVector.ResourceUnitCapacityType.WEIGHT) {
        weightIsSet = tru
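The idea behind the possible fix above, deriving the capacity mode from the unit types of the configured capacity vector rather than from the resolved min resource, can be sketched without the YARN classes. Everything here is an assumption for illustration: `CapacityVectorSketch`, `definedTypes`, and `isAbsolute` are hypothetical names, and the parser is a simplified model (a `%` suffix means percentage, a `w` suffix means weight, a bare number means absolute), not QueueCapacityVector's real parsing.

```java
import java.util.EnumSet;
import java.util.Set;

/** Simplified, hypothetical model of unit-type detection on a capacity string. */
final class CapacityVectorSketch {
  enum UnitType { PERCENTAGE, ABSOLUTE, WEIGHT }

  /** Collects the unit types used in a string such as "[memory=0, vcores=0]". */
  static Set<UnitType> definedTypes(String capacity) {
    String s = capacity.trim();
    if (!s.startsWith("[") || !s.endsWith("]")) {
      throw new IllegalArgumentException("expected [name=value, ...]: " + capacity);
    }
    Set<UnitType> types = EnumSet.noneOf(UnitType.class);
    String inner = s.substring(1, s.length() - 1);
    for (String entry : inner.split(",")) {
      String[] kv = entry.trim().split("=", 2);
      if (kv.length != 2) {
        throw new IllegalArgumentException("malformed entry: " + entry);
      }
      String value = kv[1].trim();
      if (value.endsWith("%")) {
        types.add(UnitType.PERCENTAGE);
      } else if (value.endsWith("w")) {
        types.add(UnitType.WEIGHT);
      } else {
        // A bare number counts as absolute, including zero.
        types.add(UnitType.ABSOLUTE);
      }
    }
    return types;
  }

  /** True when every resource in the vector uses the absolute unit type. */
  static boolean isAbsolute(String capacity) {
    Set<UnitType> types = definedTypes(capacity);
    return types.size() == 1 && types.contains(UnitType.ABSOLUTE);
  }

  public static void main(String[] args) {
    // Zero capacities are still recognized as absolute in this model.
    System.out.println(isAbsolute("[memory=0, vcores=0]"));
  }
}
```

Because the decision depends on how the value was written rather than on its magnitude, a queue configured as `[memory=0, vcores=0]` is classified the same way as one configured as `[memory=1024, vcores=1]`.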
[jira] [Updated] (YARN-11641) Can't update a queue hierarchy in absolute mode when the configured capacities are zero
[ https://issues.apache.org/jira/browse/YARN-11641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tamas Domok updated YARN-11641:
-------------------------------
    Description:
h2. Error symptoms

It is not possible to modify a queue hierarchy in absolute mode when the parent or every child queue of the parent has 0 min resource configured.
{noformat}
2024-01-05 15:38:59,016 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager: Initialized queue: root.a.c
2024-01-05 15:38:59,016 ERROR org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices: Exception thrown when modifying configuration.
java.io.IOException: Failed to re-init queues : Parent=root.a: When absolute minResource is used, we must make sure both parent and child all use absolute minResource
{noformat}

h2. Reproduction

capacity-scheduler.xml
{code:xml}
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>default,a</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.capacity</name>
    <value>[memory=40960, vcores=16]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>[memory=1024, vcores=1]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
    <value>[memory=1024, vcores=1]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.capacity</name>
    <value>[memory=0, vcores=0]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.maximum-capacity</name>
    <value>[memory=39936, vcores=15]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.queues</name>
    <value>b,c</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.b.capacity</name>
    <value>[memory=0, vcores=0]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.b.maximum-capacity</name>
    <value>[memory=39936, vcores=15]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.c.capacity</name>
    <value>[memory=0, vcores=0]</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.a.c.maximum-capacity</name>
    <value>[memory=39936, vcores=15]</value>
  </property>
</configuration>
{code}

updatequeue.xml
{code:xml}
<sched-conf>
  <update-queue>
    <queue-name>root.a</queue-name>
    <params>
      <entry>
        <key>capacity</key>
        <value>[memory=1024,vcores=1]</value>
      </entry>
      <entry>
        <key>maximum-capacity</key>
        <value>[memory=39936,vcores=15]</value>
      </entry>
    </params>
  </update-queue>
</sched-conf>
{code}
{code}
$ curl -X PUT -H 'Content-Type: application/xml' -d @updatequeue.xml http://localhost:8088/ws/v1/cluster/scheduler-conf\?user.name\=yarn
Failed to re-init queues : Parent=root.a: When absolute minResource is used, we must make sure both parent and child all use absolute minResource
{code}

h2. Root cause

setChildQueues is called during reinit, where:
{code:java}
void setChildQueues(Collection<CSQueue> childQueues) throws IOException {
  writeLock.lock();
  try {
    boolean isLegacyQueueMode = queueContext.getConfiguration().isLegacyQueueMode();
    if (isLegacyQueueMode) {
      QueueCapacityType childrenCapacityType =
          getCapacityConfigurationTypeForQueues(childQueues);
      QueueCapacityType parentCapacityType =
          getCapacityConfigurationTypeForQueues(ImmutableList.of(this));
      if (childrenCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE
          || parentCapacityType == QueueCapacityType.ABSOLUTE_RESOURCE) {
        // We don't allow any mixed absolute + {weight, percentage} between
        // children and parent
        if (childrenCapacityType != parentCapacityType && !this.getQueuePath()
            .equals(CapacitySchedulerConfiguration.ROOT)) {
          throw new IOException("Parent=" + this.getQueuePath()
              + ": When absolute minResource is used, we must make sure both "
              + "parent and child all use absolute minResource");
        }
{code}
The parent or childrenCapacityType will be considered as PERCENTAGE, because getCapacityConfigurationTypeForQueues fails to detect the absolute mode, here:
{code:java}
if (!queue.getQueueResourceQuotas().getConfiguredMinResource(nodeLabel)
    .equals(Resources.none())) {
  absoluteMinResSet = true;
{code}
(It only happens in legacy queue mode.)

h2. Possible fixes

Possible fix in AbstractParentQueue.getCapacityConfigurationTypeForQueues using the capacityVector:
{code:java}
for (CSQueue queue : queues) {
  for (String nodeLabel : queueCapacities.getExistingNodeLabels()) {
    Set<QueueCapacityVector.ResourceUnitCapacityType> definedCapacityTypes =
        queue.getConfiguredCapacityVector(nodeLabel).getDefinedCapacityTypes();
    if (definedCapacityTypes.size() == 1) {
      QueueCapacityVector.ResourceUnitCapacityType next = definedCapacityTypes.iterator().next();
      if (Objects.requireNonNull(next) == PERCENTAGE) {
        percentageIsSet = true;
        diagMsg.append("{Queue=").append(queue.getQueuePath()).append(", label=").append(nodeLabel)
            .append(" uses percentage mode}. ");
      } else if (next == QueueCapacityVector.ResourceUnitCapacityType.ABSOLUTE) {
        absoluteMinResSet = true;
        diagMsg.append("{Queue=").append(queue.getQueuePath()).append(", label=").append(nodeLabel)
            .append(" uses absolute mode}. ");
      } else if (next == QueueCapacityVector.ResourceUnitCapacityType.WEIGHT) {
        weightIsSet = true;
        diag