[ 
https://issues.apache.org/jira/browse/YARN-8951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16667328#comment-16667328
 ] 

Wilfred Spiegelenburg commented on YARN-8951:
---------------------------------------------

You should not be able to pass the last rule without being assigned a queue. 
The default rule is marked as terminal and the default queue should also always 
exist. I even have YARN-7769 open to remove the creation of the default queue 
on startup by the QueueManager.
Unless you have overridden the DEFAULT_QUEUE_NAME in the yarn configuration it 
should always exist.

As [~jlowe] stated, it is almost an invalid configuration, but not in the way 
described. The test code has not correctly initialised the scheduler, so, 
for instance, the QueueManager is never initialised. The default queue 
therefore does not exist, which is what causes your issue.

The init that you call in the test is AbstractService.init(conf). You should 
call FairScheduler.initScheduler(conf) instead to properly init the scheduler. 
If you do not want to do that, at least call QueueManager.initialize(conf) to 
initialise the queue setup correctly in your test.
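
To make the point about initialisation concrete, here is a minimal sketch. 
It does not use the Hadoop classes: ToyQueueManager, ToyDefaultRule and 
PlacementSketch are hypothetical stand-ins that only assume two things taken 
from the discussion, namely that initialising the queue manager is what 
creates the default queue, and that a default rule with create="false" can 
only place an app into a queue that already exists.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a queue manager; NOT the Hadoop QueueManager.
class ToyQueueManager {
    static final String DEFAULT_QUEUE = "root.default";
    private final Set<String> queues = new HashSet<>();

    // Assumption: initialisation is the step that creates the default queue.
    void initialize() {
        queues.add("root");
        queues.add(DEFAULT_QUEUE);
    }

    boolean exists(String name) {
        return queues.contains(name);
    }
}

// Hypothetical stand-in for the "default" placement rule.
class ToyDefaultRule {
    private final boolean create;

    ToyDefaultRule(boolean create) {
        this.create = create;
    }

    // Returns the queue name, or null when the queue is missing
    // and the rule is not allowed to create it.
    String assign(ToyQueueManager queueManager) {
        if (create || queueManager.exists(ToyQueueManager.DEFAULT_QUEUE)) {
            return ToyQueueManager.DEFAULT_QUEUE;
        }
        return null;
    }
}

public class PlacementSketch {
    public static void main(String[] args) {
        ToyDefaultRule rule = new ToyDefaultRule(false); // create="false"

        // Queue manager never initialised: no default queue, so no placement.
        ToyQueueManager uninitialised = new ToyQueueManager();
        System.out.println("without init: " + rule.assign(uninitialised));

        // Queue manager initialised: the default queue exists and is returned.
        ToyQueueManager initialised = new ToyQueueManager();
        initialised.initialize();
        System.out.println("with init: " + rule.assign(initialised));
    }
}
```

In this toy model the rule only returns null when initialize() was skipped, 
which is the same shape as the failure seen in the test.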

> Defining default queue placement rule in allocations file with create="false" 
> throws an NPE
> -------------------------------------------------------------------------------------------
>
>                 Key: YARN-8951
>                 URL: https://issues.apache.org/jira/browse/YARN-8951
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: default-placement-rule-with-create-false.patch
>
>
> If the default queue placement rule is defined with {{create="false"}} and a 
> scheduling request is created for queue {{"root.default"}}, then 
> {{FairScheduler#assignToQueue}} throws an NPE while trying to construct an 
> error message in the catch block of {{IllegalStateException}}, because the 
> code assumes that {{rmApp}} is not null when in fact it is.
> Example of such a config file:
> {code:xml}
> <?xml version="1.0"?>
> <allocations>
>       <queue name="parentq" type="parent">
>               <minResources>1024mb,0vcores</minResources>
>       </queue>
>       <queuePlacementPolicy>
>               <rule name="default" create="false"/>
>       </queuePlacementPolicy>
> </allocations>
> {code}
> This is suspicious, as there are some null checks for {{rmApp}} in the same 
> method.
> Not sure if this is a special case for the tests or if it is reproducible in 
> a cluster; this needs further investigation.
> In any case, it's not good that we try to dereference the {{rmApp}} that is 
> null.
> On the other hand, I'm not sure if the default queue placement rule with 
> {{create="false"}} makes sense at all. Looking at the documentation 
> ([https://hadoop.apache.org/docs/r3.1.0/hadoop-yarn/hadoop-yarn-site/FairScheduler.html]):
> {quote}default: the app is placed into the queue specified in the ‘queue’ 
> attribute of the default rule. *If ‘queue’ attribute is not specified, the 
> app is placed into ‘root.default’ queue.*
> A queuePlacementPolicy element: which contains a list of rule elements that 
> tell the scheduler how to place incoming apps into queues. Rules are applied 
> in the order that they are listed. Rules may take arguments. *All rules 
> accept the “create” argument, which indicates whether the rule can create a 
> new queue. “Create” defaults to true; if set to false and the rule would 
> place the app in a queue that is not configured in the allocations file, we 
> continue on to the next rule.* The last rule must be one that can never issue 
> a continue....
> {quote}
> In this case, the rule omits the queue attribute, so the apps should be 
> placed in the {{root.default}} queue (which is an undefined queue 
> according to the config file), and create is false, meaning that the queue 
> {{root.default}} cannot be created at all.
> *This seems to me to be a case of an invalid queue configuration file.*
> [~jlowe], [~leftnoteasy]: What is your take on this?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

