[ https://issues.apache.org/jira/browse/YARN-8468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564416#comment-16564416 ]
Haibo Chen commented on YARN-8468:
----------------------------------

Thanks [~bsteinbach] for the patch and the detailed response to Wilfred's comments. I have some comments on the latest patch:

1) I'd add to Wilfred's previous comment about this change being a queue-specific override of the scheduler-level configuration. The current patch throws an exception if the queue-specific configuration value is larger than the scheduler-level value. Instead, for any queue, the final value can be either the scheduler-level value, if there is no queue-specific override, or the componentwise minimum of the scheduler-level value and the queue-level override.

2) The scheduler-level configuration is used to normalize resource requests; from then on, the scheduler takes the normalized requests and handles things correctly. I don't see how the scheduler picks up the queue-specific values. We can verify this with a unit test in TestFairScheduler that submits a resource request to a queue whose queue-level max container allocation is smaller than the resource request.

3) This is a queue configuration, not so much a queue metric, so I don't think we need to expose it in FSQueueMetrics.

4) Let's rename 'maxContainerResources' to 'maxContainerAllocation' to be consistent with the naming of the existing scheduler-level configuration property.

Please also make sure lines are not over 80 characters and use 4 spaces as indentation, just to be consistent with the code base.
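To make the suggestion about the componentwise minimum concrete, here is a minimal sketch. It is not code from the patch: the class `QueueMaxAllocationSketch`, the type `Res`, and the method `effectiveMax` are hypothetical stand-ins so that the example runs without Hadoop on the classpath. In the real code base, `org.apache.hadoop.yarn.util.resource.Resources.componentwiseMin` would presumably play the role of the helper below.

```java
import java.util.Optional;

public class QueueMaxAllocationSketch {

    /** Hypothetical stand-in for a YARN resource: memory in MB and vcores. */
    static final class Res {
        final long memoryMb;
        final int vcores;

        Res(long memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        @Override
        public String toString() {
            return "<memory:" + memoryMb + ", vCores:" + vcores + ">";
        }
    }

    /** Takes the smaller value of each resource dimension independently. */
    static Res componentwiseMin(Res a, Res b) {
        return new Res(Math.min(a.memoryMb, b.memoryMb),
            Math.min(a.vcores, b.vcores));
    }

    /**
     * Scheduler-level value when the queue has no override; otherwise the
     * componentwise minimum of the two, so an oversized override is capped
     * instead of being rejected with an exception.
     */
    static Res effectiveMax(Res schedulerMax, Optional<Res> queueOverride) {
        return queueOverride
            .map(q -> componentwiseMin(schedulerMax, q))
            .orElse(schedulerMax);
    }

    public static void main(String[] args) {
        Res schedulerMax = new Res(8192, 4);
        // No override: the scheduler-level value applies unchanged.
        System.out.println(effectiveMax(schedulerMax, Optional.empty()));
        // Override larger in memory, smaller in vcores: each dimension is
        // capped separately, giving 8192 MB and 2 vcores.
        System.out.println(
            effectiveMax(schedulerMax, Optional.of(new Res(16384, 2))));
    }
}
```

With this approach a misconfigured queue still gets a usable limit, which is the behavior proposed in comment 1) in place of throwing.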
> Limit container sizes per queue in FairScheduler
> ------------------------------------------------
>
>                 Key: YARN-8468
>                 URL: https://issues.apache.org/jira/browse/YARN-8468
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: fairscheduler
>    Affects Versions: 3.1.0
>            Reporter: Antal Bálint Steinbach
>            Assignee: Antal Bálint Steinbach
>            Priority: Critical
>         Attachments: YARN-8468.000.patch, YARN-8468.001.patch,
> YARN-8468.002.patch, YARN-8468.003.patch
>
>
> When using any scheduler, you can use "yarn.scheduler.maximum-allocation-mb"
> to limit the overall size of a container. This applies globally to all
> containers, cannot be limited by queue, and is not scheduler dependent.
>
> The goal of this ticket is to allow this value to be set on a per queue basis.
>
> The use case: User has two pools, one for ad hoc jobs and one for enterprise
> apps. User wants to limit ad hoc jobs to small containers but allow
> enterprise apps to request as many resources as needed. Setting
> yarn.scheduler.maximum-allocation-mb sets a default value for the maximum
> container size for all queues, and the maximum resources per queue are set
> with the “maxContainerResources” queue config value.
>
> Suggested solution:
>
> All the infrastructure is already in the code. We need to do the following:
> * add the setting to the queue properties for all queue types (parent and
> leaf), this will cover dynamically created queues.
> * if we set it on the root we override the scheduler setting and we should
> not allow that.
> * make sure that the queue resource cap cannot be larger than the scheduler
> max resource cap in the config.
> * implement getMaximumResourceCapability(String queueName) in the
> FairScheduler
> * implement getMaximumResourceCapability() in both FSParentQueue and
> FSLeafQueue as follows
> * expose the setting in the queue information in the RM web UI.
> * expose the setting in the metrics etc for the queue.
> * write JUnit tests.
> * update the scheduler documentation.
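The use case in the quoted description (a capped ad hoc pool next to an uncapped enterprise pool) can be illustrated with a small, memory-only sketch. Everything here is an assumption made for illustration: the queue names `root.adhoc` and `root.enterprise`, the limits of 8192 MB and 2048 MB, and the methods `maxContainerMb` and `fits` are hypothetical and are not taken from the patch or from the FairScheduler API.

```java
import java.util.HashMap;
import java.util.Map;

public class PerQueueContainerLimitSketch {

    // Hypothetical scheduler-wide limit, standing in for
    // yarn.scheduler.maximum-allocation-mb.
    static final long SCHEDULER_MAX_MB = 8192;

    // Hypothetical per-queue overrides; queues not listed use the default.
    static final Map<String, Long> QUEUE_MAX_MB = new HashMap<>();
    static {
        QUEUE_MAX_MB.put("root.adhoc", 2048L);
    }

    /** Largest container, in MB, that the given queue may request. */
    static long maxContainerMb(String queue) {
        Long override = QUEUE_MAX_MB.get(queue);
        if (override == null) {
            return SCHEDULER_MAX_MB;
        }
        // A queue cap never exceeds the scheduler-level cap.
        return Math.min(SCHEDULER_MAX_MB, override);
    }

    /** True if a request of the given size is within the queue's limit. */
    static boolean fits(String queue, long requestedMb) {
        return requestedMb <= maxContainerMb(queue);
    }

    public static void main(String[] args) {
        // The same 4096 MB request is too large for the ad hoc queue but
        // acceptable for the enterprise queue.
        System.out.println(fits("root.adhoc", 4096));       // false
        System.out.println(fits("root.enterprise", 4096));  // true
    }
}
```

A unit test along these lines, submitting a request larger than the queue-level limit, is also the kind of check that would show whether the scheduler actually picks up the per-queue value.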