[ 
https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16552885#comment-16552885
 ] 

Antal Bálint Steinbach edited comment on YARN-8468 at 7/23/18 1:47 PM:
-----------------------------------------------------------------------

Hi [~haibochen] !

I only commented "Thanks for the feedback [~wilfreds]", but I also addressed 
his suggestions. Sorry for the confusion; please find my responses inline.
 - a {{FSLeafQueue}} and {{FSParentQueue}} always have a parent, so doing a 
null check on the parent is unneeded. The only queue that does not have a 
parent is the root queue, which you have already special-cased. _(In some 
tests a sub-queue does not have a parent)_
 - {{getMaximumResourceCapability}} must support resource types and not just 
memory and vcores, same as YARN-7556 for this setting (_It returns a Resource, 
so I assume it is OK with resource types_)
 - {{getMaxAllowedAllocation}} from the NodeTracker supports more than just 
memory and vcores, which needs to flow through (_It returns a Resource, so I 
assume it is OK with resource types_)
 - {{FairScheduler}}: Why change the static imports for only part of the 
config values? Either change all or none (none is preferred) (_Fixed_)
 - {{FairSchedulerPage}}: missing toString on the ResourceInfo (_added, but I 
can't see why it is necessary_)
 - Testing must also use resource types, not only the old configuration type, 
for example: "memory-mb=5120, test1=4, vcores=2" _(Test added)_
 - {{TestFairScheduler}}: Testing must also include failure cases for 
sub-queues, not just the root queue: setting the value on the root queue 
should throw and should not be applied (_Fixed_)
 - If this TestQueueMaxContainerAllocationValidator is a new file, make sure 
that you add the license etc. (_license text added for the new files_)

Balint


> Limit container sizes per queue in FairScheduler
> ------------------------------------------------
>
>                 Key: YARN-8468
>                 URL: https://issues.apache.org/jira/browse/YARN-8468
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 3.1.0
>            Reporter: Antal Bálint Steinbach
>            Assignee: Antal Bálint Steinbach
>            Priority: Critical
>              Labels: patch
>         Attachments: YARN-8468.000.patch, YARN-8468.001.patch, 
> YARN-8468.002.patch, YARN-8468.003.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb" 
> to limit the overall size of a container. This applies globally to all 
> containers, cannot be limited by queue, and is not scheduler dependent.
>  
> The goal of this ticket is to allow this value to be set on a per queue basis.
>  
> The use case: User has two pools, one for ad hoc jobs and one for enterprise 
> apps. User wants to limit ad hoc jobs to small containers but allow 
> enterprise apps to request as many resources as needed. 
> yarn.scheduler.maximum-allocation-mb sets the default maximum container size 
> for all queues, and the “maxContainerResources” queue config value sets the 
> maximum per queue.
>  
> Suggested solution:
>  
> All the infrastructure is already in the code. We need to do the following:
>  * add the setting to the queue properties for all queue types (parent and 
> leaf), this will cover dynamically created queues.
>  * if we set it on the root we override the scheduler setting and we should 
> not allow that.
>  * make sure that the queue resource cap cannot be larger than the scheduler 
> max resource cap in the config.
>  * implement getMaximumResourceCapability(String queueName) in the 
> FairScheduler
>  * implement getMaximumResourceCapability() in both FSParentQueue and 
> FSLeafQueue as follows
>  * expose the setting in the queue information in the RM web UI.
>  * expose the setting in the metrics etc for the queue.
>  * write JUnit tests.
>  * update the scheduler documentation.
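
The steps listed under the suggested solution can be sketched roughly as 
follows. This is only an illustration in plain Java, not the patch itself: 
SketchQueue, setMaxContainer, parse and the map-based resource representation 
are hypothetical stand-ins for the real queue and Resource classes. Two 
behaviours are assumptions on my side: a queue without its own setting 
inherits the one from its nearest ancestor, and a value above the scheduler 
maximum is clamped rather than rejected.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a scheduler queue; not the real FSQueue API. */
final class SketchQueue {
    private final String name;
    private final SketchQueue parent;        // null only for the root queue
    private Map<String, Long> maxContainer;  // null means "not set on this queue"

    SketchQueue(String name, SketchQueue parent) {
        this.name = name;
        this.parent = parent;
    }

    /** Parses a string such as "memory-mb=5120, test1=4, vcores=2". */
    static Map<String, Long> parse(String spec) {
        Map<String, Long> resources = new HashMap<>();
        for (String part : spec.split(",")) {
            String[] kv = part.trim().split("=");
            if (kv.length != 2) {
                throw new IllegalArgumentException("Bad resource entry: " + part);
            }
            resources.put(kv[0].trim(), Long.parseLong(kv[1].trim()));
        }
        return resources;
    }

    /** Sets the per-queue limit; the root queue is rejected and left unchanged. */
    void setMaxContainer(Map<String, Long> limit) {
        if (parent == null) {
            throw new IllegalArgumentException(
                "Cannot set a max container size on " + name);
        }
        maxContainer = new HashMap<>(limit);
    }

    /**
     * Effective limit for this queue: the nearest explicit setting on the
     * path to the root (assumed inheritance), clamped by the scheduler
     * maximum; the scheduler maximum itself if nothing is set.
     */
    Map<String, Long> getMaximumResourceCapability(Map<String, Long> schedulerMax) {
        for (SketchQueue q = this; q != null; q = q.parent) {
            if (q.maxContainer != null) {
                return clamp(q.maxContainer, schedulerMax);
            }
        }
        return new HashMap<>(schedulerMax);
    }

    /** Per-resource minimum, so every resource type is covered, not just memory and vcores. */
    private static Map<String, Long> clamp(Map<String, Long> queueLimit,
                                           Map<String, Long> schedulerMax) {
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, Long> e : schedulerMax.entrySet()) {
            Long limit = queueLimit.get(e.getKey());
            long cap = e.getValue();
            result.put(e.getKey(), limit == null ? cap : Math.min(limit, cap));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> schedulerMax = parse("memory-mb=8192, test1=8, vcores=4");
        SketchQueue root = new SketchQueue("root", null);
        SketchQueue adhoc = new SketchQueue("root.adhoc", root);
        adhoc.setMaxContainer(parse("memory-mb=5120, test1=4, vcores=2"));
        System.out.println(adhoc.getMaximumResourceCapability(schedulerMax));
    }
}
```

Keeping the limit as a map from resource name to value is what lets a custom 
type such as "test1" flow through without special handling; the real code 
would use the Resource object instead.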



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
