[ 
https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784061#comment-13784061
 ] 

Wangda Tan commented on YARN-1197:
----------------------------------

To [~bikassaha],
Thanks for your comments; see my responses below.

{quote}
Still thinking through the RM-NM interactions. The request for change should 
probably be a new object that is basically a map of (containerId, Resource) 
where Resource is new value for the existing containerId. Not quite sure how we 
would use the new container token for a running container since it's only used 
in start container.
{quote}

Agreed. If we make the request for change independent, we need to update the 
interface of YarnScheduler.allocate to accept it as a parameter.
And as you mentioned below, we can use the new token to update the NM's 
resource monitoring limits for containers.
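
To make the shape of such a change request concrete, here is a minimal sketch 
in Python. It is only a model and not YARN's Java API: the Resource class and 
the allocate function below are hypothetical stand-ins, and node capacity checks 
are left out on purpose.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a YARN Resource (memory and vcores only)."""
    memory_mb: int
    vcores: int


def allocate(current: Dict[str, Resource],
             change_requests: Dict[str, Resource]) -> Dict[str, Resource]:
    """Hypothetical allocate call that also accepts change requests.

    change_requests maps an existing containerId to its new Resource value.
    Requests for unknown containers are ignored. No capacity check is done.
    """
    updated = dict(current)
    for container_id, target in change_requests.items():
        if container_id in updated:
            updated[container_id] = target
    return updated
```

A call such as allocate(current, {"container_01": Resource(4096, 2)}) would 
return a new map in which only container_01 has changed size.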

{quote}
If we wait for RM to sync with NM about the increased resources then it might 
be too slow since this happens on a heartbeat and the heartbeat interval can be 
in the order of seconds. An alternative would be a new NM API to allow AMs to 
increase resources and this would be signed with new container token. But this 
would burden the AMs by requiring them to make that additional call.
{quote}

Agreed, this is much faster than going through RM-NM communication. Yes, 
changing the container size costs both the AM and the NM an extra call, but the 
AM should be disciplined enough not to do this too frequently.
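
A rough sketch of what such an AM-to-NM call could look like, written in 
Python as a model only. Everything here is an assumption made for 
illustration: the shared secret, the token payload layout, and the 
change_container_resource method are invented and do not reflect the real 
container token format or any existing NM API.

```python
import hashlib
import hmac

# Hypothetical secret shared by RM and NM, used only for this sketch.
SECRET = b"rm-nm-shared-secret"


def sign_token(container_id: str, memory_mb: int, key: bytes = SECRET) -> str:
    """RM side: sign the (containerId, new size) pair."""
    payload = f"{container_id}:{memory_mb}".encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


class NodeManager:
    """Toy NM that tracks one enforced memory limit per container."""

    def __init__(self, key: bytes = SECRET):
        self.key = key
        self.limits = {}  # containerId -> enforced memory limit in MB

    def change_container_resource(self, container_id: str,
                                  memory_mb: int, token: str) -> bool:
        """Hypothetical AM-facing call: apply the new limit only if the
        token was signed for exactly this container and size."""
        expected = sign_token(container_id, memory_mb, self.key)
        if not hmac.compare_digest(expected, token):
            return False
        self.limits[container_id] = memory_mb
        return True
```

In this model the AM cannot pick a size on its own: a token signed for 2048 MB 
is rejected if the AM presents it together with a request for 4096 MB.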

{quote}
There could be a race between a new container token coming in with increased 
resources for an acquired container and the old container token being used by 
the NMClient to launch the container (in case the AM decides to launch the 
smaller container while it was waiting for an increase).
{quote}

Hmmm... thanks for the reminder, this is a real problem. I found another issue: 
the AM may "lie" to the RM/NM about resource usage. The AM can
1) allocate a big container and launch it
2) ask to decrease the container, so the RM releases the resources on the 
corresponding node/application
3) not tell the NM about this decrease, so the container can keep using the 
resources that the RM has already released

I don't have a good idea for solving this problem yet. I hope to get more ideas 
from you about this, and I will think it through as well.
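
The three steps above can be modelled with a small Python sketch that keeps 
the RM's accounting and the NM's enforced limits separate. The Node class, its 
method names, and the sizes used are all hypothetical and exist only to show 
how the two views drift apart when the AM never informs the NM.

```python
class Node:
    """Toy node with separate RM accounting and NM enforcement."""

    def __init__(self, capacity_mb: int):
        self.capacity_mb = capacity_mb
        self.rm_view = {}   # what the RM believes each container holds
        self.nm_limit = {}  # what the NM actually enforces

    def launch(self, container_id: str, memory_mb: int) -> None:
        self.rm_view[container_id] = memory_mb
        self.nm_limit[container_id] = memory_mb

    def rm_decrease(self, container_id: str, memory_mb: int) -> None:
        # The RM frees the resources at once; the NM limit is untouched.
        self.rm_view[container_id] = memory_mb

    def rm_free_mb(self) -> int:
        return self.capacity_mb - sum(self.rm_view.values())

    def nm_enforced_mb(self) -> int:
        return sum(self.nm_limit.values())


def overcommit_demo():
    node = Node(capacity_mb=8192)
    node.launch("c1", 6144)        # step 1: big container, launched
    node.rm_decrease("c1", 2048)   # step 2: RM releases 4096 MB
    free_after_decrease = node.rm_free_mb()
    # Step 3: the AM never tells the NM, so c1 keeps its 6144 MB limit
    # while the RM hands the freed space to another container.
    node.launch("c2", free_after_decrease)
    return free_after_decrease, node.nm_enforced_mb(), node.capacity_mb
```

In this model the limits the NM enforces add up to more than the node's 
capacity, which is the overcommit the scenario describes.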



> Support changing resources of an allocated container
> ----------------------------------------------------
>
>                 Key: YARN-1197
>                 URL: https://issues.apache.org/jira/browse/YARN-1197
>             Project: Hadoop YARN
>          Issue Type: Task
>          Components: api, nodemanager, resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Wangda Tan
>         Attachments: yarn-1197.pdf
>
>
> Currently, YARN does not support merging several containers on one node into 
> one big container, which would let us request resources incrementally, merge 
> them into a bigger container, and launch our processes. The user scenario is 
> described in the comments.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
