[ https://issues.apache.org/jira/browse/YARN-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13619374#comment-13619374 ]

Luke Lu commented on YARN-291:
------------------------------

bq. Changes in NM capacity triggered from outside of the regular scheduling 
would unbalance the existing distribution of allocations, potentially 
triggering preemption. You'd need to handle such scenarios specially in the 
RM/scheduler.

The existing mechanism would/should work by simply killing off containers when 
necessary. The container fault-tolerance mechanism would/should take care of 
the rest (including preemption). We can do a better job of differentiating the 
faults induced by preemption, which would be straightforward if we expose a 
preemption API when we get around to implementing the preemption feature. If a 
container suspend/resume API is implemented, we can use that as well.
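
To make this concrete, here is a minimal sketch of the kill-on-shrink idea. 
The names (NodeCapacityEnforcer, ContainerRecord, ExitStatus) are hypothetical 
stand-ins for discussion, not actual YARN classes, and the preemption-specific 
exit status is an assumption rather than part of the current API:

{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

class NodeCapacityEnforcer {
    enum ExitStatus { FAILED, PREEMPTED }

    static final class ContainerRecord {
        final String id;
        final int memoryMb;
        ExitStatus exitStatus;
        ContainerRecord(String id, int memoryMb) { this.id = id; this.memoryMb = memoryMb; }
    }

    /** Kill just enough containers (newest allocation first) to fit the new capacity. */
    static void enforce(Deque<ContainerRecord> running, int allocatedMb, int newCapacityMb) {
        while (allocatedMb > newCapacityMb && !running.isEmpty()) {
            ContainerRecord victim = running.removeLast();
            victim.exitStatus = ExitStatus.PREEMPTED; // not a plain FAILED exit
            allocatedMb -= victim.memoryMb;
            System.out.println("Preempted " + victim.id + ", freed " + victim.memoryMb + " MB");
        }
    }

    public static void main(String[] args) {
        Deque<ContainerRecord> running = new ArrayDeque<>();
        running.add(new ContainerRecord("container_01", 2048));
        running.add(new ContainerRecord("container_02", 2048));
        running.add(new ContainerRecord("container_03", 1024));
        enforce(running, 5120, 3072); // node capacity shrank from 5 GB to 3 GB
    }
}
{code}

Each kill would then be reported to the AM as a preemption rather than an 
ordinary failure, so AM-side fault tolerance could, for example, avoid counting 
it against task retry limits.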

bq. It depends how you design your AM that handles unmanaged containers. You 
could request several small resources at peak and then release them when you 
don't need them.

This requires many features still missing from the RM in order to work 
properly: finer-grained OS/application resource metrics, application priority, 
conflict arbitration, preemption, and related security features (mostly 
authorization). This approach also makes it problematic to support the 
coexistence of different instances/versions of YARN on the same physical 
cluster.

bq. It is adding a new one, that is a change.

The change doesn't affect existing or future YARN applications. The management 
protocol allows existing and future cluster schedulers to expose appropriate 
resource views to (multiple instances/versions of) YARN in a straightforward 
manner.
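
For illustration, a minimal sketch of what such a management protocol could 
look like; ResourceManagementProtocol, setNodeResource, and the in-memory 
implementation are assumptions for discussion, not the API in the attached 
patches:

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

interface ResourceManagementProtocol {
    /** Set the resource view (memory MB, vcores) that nodeId exposes to this YARN instance. */
    void setNodeResource(String nodeId, int memoryMb, int vcores);
}

class InMemoryResourceManagement implements ResourceManagementProtocol {
    // nodeId -> {memoryMb, vcores} currently exposed to the scheduler
    private final Map<String, int[]> views = new ConcurrentHashMap<>();

    @Override
    public void setNodeResource(String nodeId, int memoryMb, int vcores) {
        views.put(nodeId, new int[] { memoryMb, vcores });
        // A real implementation would notify the scheduler, which would then
        // kill/preempt containers if the node became over-allocated.
        System.out.printf("%s now exposes %d MB / %d vcores%n", nodeId, memoryMb, vcores);
    }

    public static void main(String[] args) {
        ResourceManagementProtocol rm = new InMemoryResourceManagement();
        rm.setNodeResource("host1:45454", 4096, 4); // shrink host1's view at run time
    }
}
{code}

The point is that the resource view is adjusted out of band, and the scheduler 
reacts (e.g., by preempting) only when the new view conflicts with current 
allocations.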

IMO, the solution is orthogonal to what you have proposed. It allows any 
existing non-YARN application to efficiently coexist with YARN applications 
without having to write a special AM using an "unmanaged resource" API, and 
with no new features "required" in YARN now. In other words, it is a simple 
solution to allow YARN to coexist with other schedulers (including other 
instances/versions of YARN) that already have the features people use/want.

I'd be interested in hearing of cases where our approach "breaks" YARN 
applications in any way.

> Dynamic resource configuration
> ------------------------------
>
>                 Key: YARN-291
>                 URL: https://issues.apache.org/jira/browse/YARN-291
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager, scheduler
>            Reporter: Junping Du
>            Assignee: Junping Du
>              Labels: features
>         Attachments: Elastic Resources for YARN-v0.2.pdf, 
> YARN-291-AddClientRMProtocolToSetNodeResource-03.patch, 
> YARN-291-all-v1.patch, YARN-291-core-HeartBeatAndScheduler-01.patch, 
> YARN-291-JMXInterfaceOnNM-02.patch, 
> YARN-291-OnlyUpdateWhenResourceChange-01-fix.patch, 
> YARN-291-YARNClientCommandline-04.patch
>
>
> The current Hadoop YARN resource management logic assumes per-node resource 
> is static during the lifetime of the NM process. Allowing run-time 
> configuration of per-node resources will give us finer-grained resource 
> elasticity. This allows Hadoop workloads to coexist efficiently with other 
> workloads on the same hardware, whether or not the environment is 
> virtualized. For more background and design details, please refer to 
> HADOOP-9165.
