[jira] [Comment Edited] (YARN-10528) maxAMShare should only be accepted for leaf queues, not parent queues

2020-12-15 Thread Siddharth Ahuja (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250153#comment-17250153
 ] 

Siddharth Ahuja edited comment on YARN-10528 at 12/16/20, 7:52 AM:
---

I have made the handling of {{maxAMShare}} consistent with that of the {{reservation}} element in the code.

I performed the following testing on a single-node cluster:

Start with the following Fair Scheduler (FS) allocations XML:

{code}


    
        1.0
        drf
        *
        *
        
            1.0
            drf
        
        
            1.0
            drf
            0.76 <- root.users is a parent queue with maxAMShare set. This should not be possible.
        
        
            1.0
            drf
            
                1.0
                drf
            
        
        
            1.0
            drf
            
                1.0
                drf
            
        
    
    fair
    0.75
    
        
        
            
        
    

{code}

Refresh YARN queues and observe the RM logs:

{code}
% bin/yarn rmadmin -refreshQueues
{code}

{code}
2020-12-16 18:12:29,665 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService: Failed to reload fair scheduler config file - will use existing allocations.
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: The configuration settings for root.users are invalid. A queue element that contains child queue elements or that has the type='parent' attribute cannot also include a maxAMShare element.
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.allocation.AllocationFileQueueParser.loadQueue(AllocationFileQueueParser.java:238)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.allocation.AllocationFileQueueParser.loadQueue(AllocationFileQueueParser.java:221)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.allocation.AllocationFileQueueParser.parse(AllocationFileQueueParser.java:97)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:257)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.lambda$serviceInit$0(AllocationFileLoaderService.java:128)
at java.lang.Thread.run(Thread.java:748)

2020-12-16 18:15:04,056 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Failed to reload allocations file
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: The configuration settings for root.users are invalid. A queue element that contains child queue elements or that has the type='parent' attribute cannot also include a maxAMShare element.
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.allocation.AllocationFileQueueParser.loadQueue(AllocationFileQueueParser.java:238)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.allocation.AllocationFileQueueParser.loadQueue(AllocationFileQueueParser.java:221)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.allocation.AllocationFileQueueParser.parse(AllocationFileQueueParser.java:97)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:257)
at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.reinitialize(FairScheduler.java:1571)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:438)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.refreshQueues(AdminService.java:409)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceManagerAdministrationProtocolPBServiceImpl.refreshQueues(ResourceManagerAdministrationProtocolPBServiceImpl.java:120)
at org.apache.hadoop.yarn.proto.ResourceManagerAdministrationProtocol$ResourceManagerAdministrationProtocolService$2.callBlockingMethod(ResourceManagerAdministrationProtocol.java:293)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:537)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1035)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:963)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2966)
{code}
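
For readers who want to see the rule from the error message in isolation, here is a minimal standalone sketch. It is not the code in {{AllocationFileQueueParser}}; the class name {{MaxAmShareCheck}}, the method {{findInvalidQueues}} and the sample queue names other than root.users are made up for illustration. It only assumes that queues are nested {{queue}} elements with a {{name}} attribute, an optional {{type}} attribute and an optional {{maxAMShare}} child element, and it reports every queue that acts as a parent while also setting {{maxAMShare}}.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Illustrative only: a hypothetical helper, not the AllocationFileQueueParser logic.
class MaxAmShareCheck {

  /** Returns the full names of queues that act as parents yet contain a maxAMShare element. */
  static List<String> findInvalidQueues(String allocationsXml) {
    try {
      Element root = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(allocationsXml.getBytes(StandardCharsets.UTF_8)))
          .getDocumentElement();
      List<String> invalid = new ArrayList<>();
      visit(root, "", invalid);
      return invalid;
    } catch (Exception e) {
      // Wrap checked parser exceptions so callers do not need a throws clause.
      throw new IllegalStateException("Could not parse allocations XML", e);
    }
  }

  /** Walks the direct child queue elements of parent, building dotted queue names. */
  private static void visit(Element parent, String prefix, List<String> invalid) {
    NodeList children = parent.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node node = children.item(i);
      if (!(node instanceof Element) || !"queue".equals(((Element) node).getTagName())) {
        continue;
      }
      Element queue = (Element) node;
      String name = prefix.isEmpty()
          ? queue.getAttribute("name")
          : prefix + "." + queue.getAttribute("name");
      // A queue is a parent if it is tagged type="parent" or has child queue elements.
      boolean isParent = "parent".equals(queue.getAttribute("type")) || hasChild(queue, "queue");
      if (isParent && hasChild(queue, "maxAMShare")) {
        invalid.add(name);
      }
      visit(queue, name, invalid);
    }
  }

  /** True if el has a direct child element with the given tag name. */
  private static boolean hasChild(Element el, String tag) {
    NodeList children = el.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node node = children.item(i);
      if (node instanceof Element && tag.equals(((Element) node).getTagName())) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Hypothetical allocations fragment: root.users is tagged as a parent and sets maxAMShare.
    String xml = "<allocations><queue name=\"root\">"
        + "<queue name=\"default\"><maxAMShare>0.5</maxAMShare></queue>"
        + "<queue name=\"users\" type=\"parent\"><maxAMShare>0.76</maxAMShare></queue>"
        + "</queue></allocations>";
    System.out.println(findInvalidQueues(xml));
  }
}
```

The sketch collects all offending queues instead of throwing on the first one, which makes it easier to experiment with; the RM log above shows that the real loader rejects the file with an {{AllocationConfigurationException}} and keeps the existing allocations.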

Now update the FS XML so that {{maxAMShare}} is no longer set for root.users, but is set for a queue that is a parent only implicitly (it has child queues but is not explicitly tagged with "type=parent"):

{code}


    
        1.0
        drf
        *
        *
        
            1.0
            drf
        
        
            1.0
     
