[ 
https://issues.apache.org/jira/browse/YARN-10496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264106#comment-17264106
 ] 

Benjamin Teke commented on YARN-10496:
--------------------------------------

[~pbacsko], regarding the max capacity: as of now, YARN-10504 has disabled the 
validation for the absolute and absolute max capacity of a queue. I think we 
should allow some flexibility by introducing either a flag or a special format 
like the one you mentioned. A couple of concerns/questions:
 * Should we allow the max capacity to be lower than the capacity?
 ** In "relative to the cluster" mode this can be straightforward, especially 
with weight mode: I can set up a fairly large queue hierarchy with weights and 
not worry about any single queue eating up a large part of the cluster resources.
 ** In "relative to the parent" mode this allows a setup where the weights 
are effectively disabled and the queues are configured through the max capacity. 
That is not necessarily a problem, but it can lead to hard-to-read configurations.
 * If we keep/reintroduce the capacity < max capacity constraint in weight mode, 
the user might have to calculate the percentages from the weights manually. For 
example, child1 and child2 are the only child queues under a parent, with weights 
3 and 1. In this setup child1 has to have a configured max capacity of 75%, 
while child2 can have anything above 25%. This is OK for a static parent, but 
if/when auto-create templates/wildcard configs are supported, the capacity 
can change greatly based on the number of dynamic queues. If I want to express 
the max capacity of any child as 33% of the parent's resources, I need to 
define at least 3 static queues with the same weight; I can't allow these to be 
auto-created (because 1 queue with weight 1 will have a capacity of 100%, and 2 
queues with weight 1 will have 50% each). This is another reason to let this 
constraint go.
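
To make the arithmetic above concrete, here is a minimal Python sketch of how sibling weights turn into a percentage of the parent. It is only an illustration: the helper name weights_to_capacity and the queue names auto0, auto1, ... are made up for this example and are not part of CapacityScheduler, and it assumes a queue's share is simply its weight divided by the sum of its siblings' weights.

```python
def weights_to_capacity(weights):
    """Convert sibling weights into each queue's share of the parent, in percent.

    Assumption for illustration: a queue's share is its weight divided by the
    sum of its siblings' weights (hypothetical helper, not a YARN API).
    """
    total = sum(weights.values())
    return {name: 100.0 * weight / total for name, weight in weights.items()}


# Static parent from the example: child1 and child2 with weights 3 and 1.
static_shares = weights_to_capacity({"child1": 3, "child2": 1})
print(static_shares)  # {'child1': 75.0, 'child2': 25.0}

# Dynamic parent: every auto-created queue gets weight 1, so the per-queue
# share depends on how many queues currently exist.
dynamic_shares = {}
for count in (1, 2, 3):
    shares = weights_to_capacity({f"auto{i}": 1 for i in range(count)})
    dynamic_shares[count] = round(shares["auto0"], 2)
print(dynamic_shares)  # {1: 100.0, 2: 50.0, 3: 33.33}
```

Only the three-queue case brings each queue's capacity down to about a third of the parent, which is the kind of calculation the user would have to redo by hand whenever the number of dynamic queues changes.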

> [Umbrella] Support Flexible Auto Queue Creation in Capacity Scheduler
> ---------------------------------------------------------------------
>
>                 Key: YARN-10496
>                 URL: https://issues.apache.org/jira/browse/YARN-10496
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: capacity scheduler
>            Reporter: Wangda Tan
>            Priority: Major
>
> CapacityScheduler today doesn’t support auto queue creation that is 
> flexible enough. The current constraints: 
>  * Only leaf queues can be auto-created
>  * A parent can only have either static queues or dynamic ones. This causes 
> multiple constraints. For example:
>  ** It isn’t possible to have a VIP user like Alice with a static queue 
> root.user.alice with 50% capacity while the other user queues (under 
> root.user) are created dynamically and share the remaining 50% of 
> resources.
>  ** This implies that there is no possibility to have both dynamically 
> created and static queues at the same time under root
>  
>  * In comparison, FairScheduler allows the following scenarios, which Capacity 
> Scheduler doesn’t:
>  ** A new queue needs to be created under an existing parent, while the parent 
> already has static queues
>  ** Nested queue mapping policy, like in the following example: 
> |<rule name="nestedUserQueue" create="true">
>         <rule name="primaryGroup" create="true" />
> </rule>|
> Here two levels of queues may need to be created. 
> If an application belongs to user _alice_ (who has the primary_group of 
> _engineering_), the scheduler checks whether _root.engineering_ exists; if it 
> doesn’t, it’ll be created. Then the scheduler checks whether 
> _root.engineering.alice_ exists, and creates it if it doesn't.
>  
> When we try to move users from FairScheduler to CapacityScheduler, we face 
> feature gaps which block users from migrating from FS to CS.
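
The two-level creation described in the quoted nested mapping example can be sketched roughly as follows. This is a hedged illustration in Python, not scheduler code: ensure_queue_path and the plain set of queue names are hypothetical stand-ins, and the sketch only assumes that each missing queue along the dotted path is created parent first.

```python
def ensure_queue_path(existing, path):
    """Create every missing queue along a dotted path, parent before child.

    `existing` is a plain set of full queue names standing in for the queue
    hierarchy (an assumption for this sketch, not a CapacityScheduler
    structure). Returns the names that had to be created, in creation order.
    """
    created = []
    parts = path.split(".")
    for depth in range(1, len(parts) + 1):
        queue = ".".join(parts[:depth])
        if queue not in existing:
            existing.add(queue)
            created.append(queue)
    return created


# User alice with primary group engineering, and only root defined so far.
queues = {"root"}
created = ensure_queue_path(queues, "root.engineering.alice")
print(created)  # ['root.engineering', 'root.engineering.alice']

# A second application from the same user finds both levels already present.
print(ensure_queue_path(queues, "root.engineering.alice"))  # []
```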


