[ https://issues.apache.org/jira/browse/YARN-6325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924965#comment-15924965 ]
Wangda Tan commented on YARN-6325:
----------------------------------

[~jlowe] thanks for jumping in. I agree that client operations directly on parent queues are limited. However, we do have some APIs such as ApplicationClientProtocol#getQueueInfo, and this issue breaks some features, such as user-queue mapping. Instead of fixing this in trunk only, I would prefer to have the fix in branch-2 as well.

> ParentQueue and LeafQueue with same name can cause queue name based operations to fail
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-6325
>                 URL: https://issues.apache.org/jira/browse/YARN-6325
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>            Reporter: Jonathan Hung
>         Attachments: capacity-scheduler.xml, Screen Shot 2017-03-13 at 2.28.30 PM.png
>
>
> For example, configure capacity scheduler with two leaf queues: {{root.a.a1}} and {{root.b.a}}, with {{yarn.scheduler.capacity.root.queues}} as {{b,a}} (in that order).
> Then add a mapping e.g. {{u:username:a}} to {{capacity-scheduler.xml}} and call {{refreshQueues}}.
> Operation fails with {noformat}refreshQueues: java.io.IOException: Failed to re-init queues : mapping contains invalid or non-leaf queue a
>     at org.apache.hadoop.yarn.ipc.RPCUtil.getRemoteException(RPCUtil.java:38)
>     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.logAndWrapException(AdminService.java:866)
>     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:391)
>     at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshQueues(ResourceManagerAdministrationProtocolPBServiceImpl.java:114)
>     at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:271)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1857)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2653)
> Caused by: java.io.IOException: Failed to re-init queues : mapping contains invalid or non-leaf queue a
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:404)
>     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:396)
>     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:386)
>     ... 10 more
> Caused by: java.io.IOException: mapping contains invalid or non-leaf queue a
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:547)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:571)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitializeQueues(CapacityScheduler.java:595)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:400)
>     ... 12 more
> {noformat}
> Part of the issue is that the {{queues}} map in {{CapacitySchedulerQueueManager}} stores queues by queue name. We could do one of a few things:
> # Disallow ParentQueues and LeafQueues to have the same queue name. (this breaks compatibility)
> # Store queues by queue path instead of queue name. But this might require changes in lots of places, e.g. in this case the queue-mappings would have to map to a queue path instead of a queue name (which also breaks compatibility) and possibly others.
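
The name collision described in the report can be reproduced in miniature. The sketch below is only an illustration and is not the actual {{CapacitySchedulerQueueManager}} code: the {{Queue}} class, the helper methods, and the registration order are all hypothetical. It assumes that a later queue replaces an earlier one registered under the same short name, which is what would let the parent {{root.a}} shadow the leaf {{root.b.a}}.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; none of these types are the real YARN classes.
final class QueueNameCollisionDemo {

    /** Minimal stand-in for a scheduler queue: a full path and a leaf flag. */
    static final class Queue {
        private final String path;
        private final boolean leaf;

        Queue(String path, boolean leaf) {
            this.path = path;
            this.leaf = leaf;
        }

        String getPath() { return path; }

        boolean isLeaf() { return leaf; }

        /** Last path component, e.g. "a" for "root.b.a". */
        String getShortName() {
            return path.substring(path.lastIndexOf('.') + 1);
        }
    }

    /** Queues from the report; the registration order here is an assumption. */
    static List<Queue> exampleQueues() {
        return Arrays.asList(
            new Queue("root", false),
            new Queue("root.b", false),
            new Queue("root.b.a", true),   // leaf with short name "a"
            new Queue("root.a", false),    // parent with short name "a"
            new Queue("root.a.a1", true));
    }

    /** Index keyed by short name: same-named queues overwrite each other. */
    static Map<String, Queue> indexByName(List<Queue> queues) {
        Map<String, Queue> byName = new LinkedHashMap<>();
        for (Queue q : queues) {
            byName.put(q.getShortName(), q);
        }
        return byName;
    }

    /** Index keyed by full path: every queue keeps its own entry. */
    static Map<String, Queue> indexByPath(List<Queue> queues) {
        Map<String, Queue> byPath = new LinkedHashMap<>();
        for (Queue q : queues) {
            byPath.put(q.getPath(), q);
        }
        return byPath;
    }

    /** A mapping target is accepted only if it resolves to a leaf queue. */
    static boolean isValidMappingTarget(Map<String, Queue> index, String key) {
        Queue q = index.get(key);
        return q != null && q.isLeaf();
    }

    public static void main(String[] args) {
        List<Queue> queues = exampleQueues();
        // Lookup by short name resolves "a" to the parent queue, so it is rejected.
        System.out.println(isValidMappingTarget(indexByName(queues), "a"));
        // Lookup by full path still finds the leaf queue.
        System.out.println(isValidMappingTarget(indexByPath(queues), "root.b.a"));
    }
}
```

Keying by path keeps both queues addressable, but as the second option in the report notes, existing mappings such as {{u:username:a}} use the short name, so callers would have to change as well.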