[ https://issues.apache.org/jira/browse/YARN-10497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shilun Fan updated YARN-10497:
------------------------------
         Component/s: capacity scheduler
        Hadoop Flags: Reviewed
    Target Version/s: 3.4.0
   Affects Version/s: 3.4.0

> Fix an issue in CapacityScheduler which fails to delete queues
> --------------------------------------------------------------
>
>                 Key: YARN-10497
>                 URL: https://issues.apache.org/jira/browse/YARN-10497
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler
>    Affects Versions: 3.4.0
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>            Priority: Major
>              Labels: capacity-scheduler, capacityscheduler
>             Fix For: 3.4.0
>
>         Attachments: YARN-10497.001.patch, YARN-10497.002.patch,
> YARN-10497.003.patch, YARN-10497.004.patch, YARN-10497.005.patch,
> YARN-10497.006.patch
>
>
> We saw an exception when using the queue mutation APIs:
> {code:java}
> 2020-11-13 16:47:46,327 WARN
> org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices:
> CapacityScheduler configuration validation failed:java.io.IOException: Queue
> root.am2cmQueueSecond not found
> {code}
> It comes from this code (inside MutableCSConfigurationProvider):
> {code:java}
> List<String> siblingQueues = getSiblingQueues(queueToRemove,
>     proposedConf);
> if (!siblingQueues.contains(queueName)) {
>   throw new IOException("Queue " + queueToRemove + " not found");
> }
> {code}
> If you look at the method:
> {code:java}
> private List<String> getSiblingQueues(String queuePath, Configuration conf) {
>   String parentQueue = queuePath.substring(0, queuePath.lastIndexOf('.'));
>   String childQueuesKey = CapacitySchedulerConfiguration.PREFIX +
>       parentQueue + CapacitySchedulerConfiguration.DOT +
>       CapacitySchedulerConfiguration.QUEUES;
>   return new ArrayList<>(conf.getStringCollection(childQueuesKey));
> }
> {code}
> And here's the capacity-scheduler.xml I got:
> {code:java}
> <property><name>yarn.scheduler.capacity.root.queues</name><value>default, q1,
> q2</value></property>
> {code}
> Notice that there are spaces after the commas in "default, q1, q2".
> So conf.getStringCollection returns:
> {code:java}
> default
> <space>q1
> ...
> {code}
> The returned names keep their leading space, so siblingQueues.contains(queueName)
> never matches "q1" and the delete fails with "Queue ... not found".
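
The mismatch described in the issue can be reproduced without a cluster. The following is a minimal sketch in plain Java, not the Hadoop code and not the content of the attached patches: `SiblingQueueLookup`, `splitRaw` and `splitTrimmed` are hypothetical names, and `splitRaw` only imitates a comma split that keeps whitespace, which is the behavior the issue reports for conf.getStringCollection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical stand-in for the sibling-queue lookup; not Hadoop code.
final class SiblingQueueLookup {

    // Splits on commas only, so tokens keep any surrounding whitespace
    // (imitates the untrimmed behavior reported in the issue).
    static List<String> splitRaw(String value) {
        List<String> names = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(value, ",");
        while (tokens.hasMoreTokens()) {
            names.add(tokens.nextToken());
        }
        return names;
    }

    // Splits on commas and trims each token, dropping empty entries.
    static List<String> splitTrimmed(String value) {
        List<String> names = new ArrayList<>();
        for (String token : value.split(",")) {
            String name = token.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Same value as yarn.scheduler.capacity.root.queues in the report.
        String configured = "default, q1, q2";

        // Untrimmed: the list holds " q1", so the lookup for "q1" fails.
        System.out.println(splitRaw(configured).contains("q1"));     // false

        // Trimmed: the list holds "q1", so the lookup succeeds.
        System.out.println(splitTrimmed(configured).contains("q1")); // true
    }
}
```

Hadoop's Configuration class also provides getTrimmedStringCollection, which trims each entry and looks like a natural candidate for this lookup. Whether the attached patches take that route is not stated in this message.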