[ https://issues.apache.org/jira/browse/YARN-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619374#comment-13619374 ]
Luke Lu commented on YARN-291:
------------------------------

bq. Changes in NM capacity triggered from outside of the regular scheduling would unbalance existing distribution of allocations potentially triggering preemption. You'd need to handle this specially in the RM/scheduler to handle such scenarios.

The existing mechanism would/should work by simply killing off containers when necessary. The container fault-tolerance mechanism would/should take care of the rest (including preemption). We can do a better job of differentiating the faults induced by preemption, which would be straightforward if we expose a preemption API, when we get around to implementing the preemption feature. If a container suspend/resume API is implemented, we can do that as well.

bq. It depends how you design your AM that handles unmanaged containers. You could request several small resources on peak and then release them as you don't need them.

This requires many features that are currently missing in the RM in order to work properly: finer-grained OS/application resource metrics, application priority, conflict arbitration, preemption, and related security features (mostly authorization-related). This approach also makes it problematic to support the coexistence of different instances/versions of YARN on the same physical cluster.

bq. It is adding a new one, that is a change.

The change doesn't affect existing/future YARN applications. The management protocol allows existing/future cluster schedulers to expose appropriate resource views to (multiple instances/versions of) YARN in a straightforward manner. IMO, the solution is orthogonal to what you have proposed. It allows any existing non-YARN applications to efficiently coexist with YARN applications without having to write a special AM using the "unmanaged resource" API, with no new features "required" in YARN now.
In other words, it is a simple solution to allow YARN to coexist with other schedulers (including other instances/versions of YARN) that already have the features people use/want. I'd be interested in hearing of cases where our approach "breaks" YARN applications in any way.

> Dynamic resource configuration
> ------------------------------
>
>                 Key: YARN-291
>                 URL: https://issues.apache.org/jira/browse/YARN-291
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, scheduler
>            Reporter: Junping Du
>            Assignee: Junping Du
>              Labels: features
>         Attachments: Elastic Resources for YARN-v0.2.pdf,
> YARN-291-AddClientRMProtocolToSetNodeResource-03.patch,
> YARN-291-all-v1.patch, YARN-291-core-HeartBeatAndScheduler-01.patch,
> YARN-291-JMXInterfaceOnNM-02.patch,
> YARN-291-OnlyUpdateWhenResourceChange-01-fix.patch,
> YARN-291-YARNClientCommandline-04.patch
>
>
> The current Hadoop YARN resource management logic assumes per node resource
> is static during the lifetime of the NM process. Allowing run-time
> configuration on per node resource will give us finer granularity of resource
> elasticity. This allows Hadoop workloads to coexist with other workloads on
> the same hardware efficiently, whether or not the environment is virtualized.
> For more background and design details, please refer to HADOOP-9165.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira