[ 
https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997290#comment-16997290
 ] 

Wilfred Spiegelenburg commented on YARN-9879:
---------------------------------------------

I have read through the design document and was wondering if we cannot take a 
far simpler approach.

We could simply relax the rule that a leaf queue name must be unique in the 
system, and instead require that a queue is unique based on its full queue 
path. This does not break existing configurations, as a unique leaf queue name 
is also unique when you take the whole path into account. That means nothing 
needs to change for current clusters. Internally the scheduler does have to 
change to make sure that all references use the queue path. This will require 
a lot of changes throughout the scheduler, both where we look up a queue and 
where we store a reference that does not point directly to the leaf queue. 

The only other point we need to handle correctly is the submit side. This must 
be handled in a backward compatible way. We have two cases to handle: just a 
queue name and a queue path. I'll discuss updating the configuration later.

# When an application is submitted with just a queue name (not a path) we 
expect that the name is a unique leaf queue name. If that queue does not exist 
or is not uniquely identifiable we reject the application submission. 
Resolution of the real leaf queue follows the same steps as it does now. In 
the end the queue name is converted to the correct leaf queue, identified by 
its path. For existing configurations nothing changes. Internally we hide all 
the changes.
# When the submit has a queue path (fully qualified or not) we check that the 
queue exists based on that path. If no leaf queue is defined at that path the 
application submission is rejected. 

If the scheduler has a non-unique leaf queue name, submitting to those queues 
can only be done by using their paths. There is nothing that needs to be 
configured to switch this behaviour on or off.
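
The two submit cases above could be sketched roughly as follows. This is only 
an illustration: QueuePathResolver, addLeafQueue and resolve are hypothetical 
names and not existing CapacityScheduler code. It also assumes that a path 
which is not fully qualified is matched as a trailing part of a full path and 
must match exactly one leaf queue, which the proposal above does not spell out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of name/path resolution; not the CapacityScheduler API. */
final class QueuePathResolver {
    // Full paths of all leaf queues, e.g. "root.a.default".
    private final Set<String> leafPaths = new HashSet<>();
    // Short leaf name -> all full paths that end in that name.
    private final Map<String, List<String>> pathsByLeafName = new HashMap<>();

    /** Registers a leaf queue; only the full path has to be unique. */
    void addLeafQueue(String fullPath) {
        if (!leafPaths.add(fullPath)) {
            throw new IllegalArgumentException("Duplicate queue path: " + fullPath);
        }
        String name = fullPath.substring(fullPath.lastIndexOf('.') + 1);
        pathsByLeafName.computeIfAbsent(name, k -> new ArrayList<>()).add(fullPath);
    }

    /** Resolves a submitted queue name or path to one full leaf queue path. */
    String resolve(String queue) {
        List<String> matches = new ArrayList<>();
        if (queue.indexOf('.') < 0) {
            // Case 1: just a name, which must identify exactly one leaf queue.
            matches.addAll(pathsByLeafName.getOrDefault(queue, Collections.emptyList()));
        } else if (leafPaths.contains(queue)) {
            // Case 2a: fully qualified path that exists.
            return queue;
        } else {
            // Case 2b (assumption): partial path matched on a segment boundary.
            for (String path : leafPaths) {
                if (path.endsWith("." + queue)) {
                    matches.add(path);
                }
            }
        }
        if (matches.size() != 1) {
            // Unknown or ambiguous queue: reject the submission.
            throw new IllegalArgumentException(
                "Queue '" + queue + "' matches " + matches.size() + " leaf queues");
        }
        return matches.get(0);
    }
}
```

With root.a.default and root.b.default both configured, resolve("default") is 
rejected in this sketch, while resolve("root.b.default") and 
resolve("a.default") each resolve to a single leaf queue.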

The important part is applying a new configuration. If the configuration adds a 
leaf queue that is not unique, the configuration update is currently rejected. 
With this change we would allow that config to become active. This *could* 
break existing applications when they try to submit, by name, to a leaf queue 
that is no longer unique.
We should at least log and warn clearly in the response of the update. Maybe 
even show it in the UI, or we could ask for a confirmation. The first update 
that adds a non-unique queue to the configuration should always fail, 
complaining loudly. It should then keep warning the user and rejecting the 
update unless a confirmation flag is set to force the update through. After the 
first such update the flag would not be needed anymore.
Reading a config from a file or store to initialise the scheduler should not 
trigger such behaviour. We should still show a warning in the logs to make 
sure it is not lost.
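
A minimal sketch of that update check, under stated assumptions: the class 
QueueConfigUpdateCheck, the Source enum and the forceConfirmed flag are 
hypothetical names for illustration only, and a configuration is reduced to a 
list of full leaf queue paths.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of the config update check; not existing YARN code. */
final class QueueConfigUpdateCheck {
    enum Source { INITIAL_LOAD, RUNTIME_UPDATE }

    static final class Result {
        final boolean accepted;
        final List<String> warnings;

        Result(boolean accepted, List<String> warnings) {
            this.accepted = accepted;
            this.warnings = warnings;
        }
    }

    /** Returns the leaf names that occur under more than one full path. */
    static Set<String> nonUniqueLeafNames(Collection<String> leafPaths) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new TreeSet<>();
        for (String path : leafPaths) {
            String name = path.substring(path.lastIndexOf('.') + 1);
            if (!seen.add(name)) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }

    static Result validate(Collection<String> currentLeafPaths,
                           Collection<String> proposedLeafPaths,
                           Source source,
                           boolean forceConfirmed) {
        // Always warn about non-unique names so the information is not lost.
        List<String> warnings = new ArrayList<>();
        Set<String> proposedDuplicates = nonUniqueLeafNames(proposedLeafPaths);
        for (String name : proposedDuplicates) {
            warnings.add("Leaf queue name '" + name
                + "' is not unique; submit using the full queue path");
        }
        // Loading from a file or store at start-up never rejects, it only warns.
        if (source == Source.INITIAL_LOAD) {
            return new Result(true, warnings);
        }
        // The first update that introduces non-unique names needs confirmation.
        boolean firstIntroduction = nonUniqueLeafNames(currentLeafPaths).isEmpty()
            && !proposedDuplicates.isEmpty();
        return new Result(!firstIntroduction || forceConfirmed, warnings);
    }
}
```

Here a runtime update that first introduces a duplicate leaf name is rejected 
with a warning unless forceConfirmed is set, while later updates and the 
initial load are accepted and only warn.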

What do you think about this approach?

> Allow multiple leaf queues with the same name in CS
> ---------------------------------------------------
>
>                 Key: YARN-9879
>                 URL: https://issues.apache.org/jira/browse/YARN-9879
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Gergely Pollak
>            Assignee: Gergely Pollak
>            Priority: Major
>         Attachments: DesignDoc_v1.pdf
>
>
> Currently the leaf queue's name must be unique regardless of its position in 
> the queue hierarchy. 
> Design doc and first proposal is being made, I'll attach it as soon as it's 
> done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
