[ https://issues.apache.org/jira/browse/YARN-5202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427275#comment-15427275 ]
Ha Son Hai commented on YARN-5202:
----------------------------------

Hi [~nroberts]! Would you mind explaining the parameter "RM_OVERCOMMIT_MEM_MAX_FACTOR" in a bit more detail? I set it to a different value, but the ResourceManager log reports the value as 0. The same happens for the vcore factor. I wonder if it is a bug?

By the way, is RM_OVERCOMMIT_MEM_MAX_FACTOR redundant with RM_OVERCOMMIT_MEM_INCREMENT, given that one is a ratio and the other is in megabytes? If I have a node with 32 GB of RAM and set RM_OVERCOMMIT_MEM_MAX_FACTOR to 2, does that mean I can over-commit up to 2 times the total memory I have (in case utilization is very low), that is, 64 GB?

Sorry if the question is a "basic" or a "stupid" one. I have just started to work with the Hadoop code, so a lot of it is new to me. Thanks a lot for your clarification. I attached your explanation of the parameters below.

+ RM_OVERCOMMIT_MEM_INCREMENT: Specifies the largest memory increment in megabytes when enlarging a node's total resource for overcommit. Once incremented, at least one container must be launched on the node to increase the value further. A value <= 0 will disable memory overcommit.
+ RM_OVERCOMMIT_MEM_MAX_FACTOR: Maximum amount of memory to overcommit as a factor of the total node memory. A value <= 0 disables memory overcommit.

> Dynamic Overcommit of Node Resources - POC
> ------------------------------------------
>
>                 Key: YARN-5202
>                 URL: https://issues.apache.org/jira/browse/YARN-5202
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager, resourcemanager
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Nathan Roberts
>            Assignee: Nathan Roberts
>         Attachments: YARN-5202-branch2.7-uber.patch, YARN-5202.patch
>
> This Jira is to present a proof-of-concept implementation (a collaboration between [~jlowe] and myself) of a dynamic over-commit implementation in YARN.
> The type of over-commit implemented in this jira is similar to, but not as full-featured as, what's being implemented via YARN-1011. YARN-1011 is where we see ourselves heading, but we needed something quick and completely transparent so that we could test it at scale with our varying workloads (mainly MapReduce, Spark, and Tez). Doing so has shed some light on how much additional capacity we can achieve with over-commit approaches, and has fleshed out some of the problems these approaches will face.
>
> Primary design goals:
> - Avoid changing protocols, application frameworks, or core scheduler logic -- simply adjust individual nodes' available resources based on current node utilization and then let the scheduler do what it normally does.
> - Over-commit slowly, pull back aggressively -- if things are looking good and there is demand, slowly add resource. If memory starts to look over-utilized, aggressively reduce the amount of over-commit.
> - Make sure the nodes protect themselves -- i.e. if memory utilization on a node gets too high, preempt something, preferably something from a preemptable queue.
>
> A patch against trunk will be attached shortly. Some notes on the patch:
> - This feature was originally developed against something akin to 2.7. Since the patch is mainly to explain the approach, we didn't do any sort of testing against trunk except for a basic build and basic unit tests.
> - The key pieces of functionality are in {{SchedulerNode}}, {{AbstractYarnScheduler}}, and {{NodeResourceMonitorImpl}}. The remainder of the patch is mainly UI, config, metrics, tests, and some minor code duplication (e.g. to optimize node resource changes we treat an over-commit resource change differently than an updateNodeResource change -- remove_node/add_node is just too expensive for the frequency of over-commit changes).
> - We only over-commit memory at this point.
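To make the arithmetic in the question concrete, here is a rough, purely illustrative Python sketch of how the two memory knobs and the "over-commit slowly, pull back aggressively" goal might fit together. The names, the pressure threshold, and the assumption that the max factor caps the node's advertised memory at factor × physical memory are guesses for illustration, not code or semantics taken from the YARN-5202 patch:

```python
# Illustrative sketch only: the real logic lives in SchedulerNode and
# NodeResourceMonitorImpl in the YARN-5202 patch. All names, values, and
# policy details below are assumptions, not taken from that patch.

PHYSICAL_MB = 32 * 1024   # a 32 GB node, in MB
INCREMENT_MB = 4096       # RM_OVERCOMMIT_MEM_INCREMENT (example value)
MAX_FACTOR = 2.0          # RM_OVERCOMMIT_MEM_MAX_FACTOR (example value)

def adjust_advertised_mb(advertised_mb, used_mb):
    """One monitoring-cycle adjustment of a node's advertised memory.

    Grows slowly (at most one increment per cycle, up to factor * physical)
    and pulls back aggressively (straight to physical) under memory pressure.
    """
    if INCREMENT_MB <= 0 or MAX_FACTOR <= 0:
        return PHYSICAL_MB                  # overcommit disabled
    if used_mb > 0.95 * PHYSICAL_MB:        # assumed pressure threshold
        return PHYSICAL_MB                  # pull back aggressively
    cap_mb = int(PHYSICAL_MB * MAX_FACTOR)  # factor 2 -> 64 GB cap
    return min(advertised_mb + INCREMENT_MB, cap_mb)
```

Under this reading, setting RM_OVERCOMMIT_MEM_MAX_FACTOR to 2 on a 32 GB node would let the advertised total grow in 4 GB steps until it reaches 64 GB (2× physical); whether the factor applies to the total or only to the over-committed portion is exactly the ambiguity the question raises.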
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org