[ 
https://issues.apache.org/jira/browse/YARN-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725533#comment-14725533
 ] 

MENG DING commented on YARN-1651:
---------------------------------

Hi, [~leftnoteasy], thanks so much for posting the patch.  

I do have one question regarding the patch. Recall that during the design 
discussion, we agreed that as long as an increase has not yet completed for a 
container, we should not process any other increase/decrease requests for the 
same container. It seems that this patch will still process decrease/increase 
requests even while an increase action is ongoing? 

Consider the following sequence of events:
Example 1:
1. AM sends container increase request to RM
2. RM allocates the resource and gives out increase token to AM
3. AM sends decrease request to RM for the same container
4. AM uses the increase token to increase resource on NM
5. NM reports container status back to RM

IIUC, at step 3, this patch will decrease the container size and remove the 
container from the allocation expirer. At step 5, this patch will see that the 
RM container size is smaller than the reported NM container size, and will tell 
the NM to decrease the container resource. The concern I have with this 
approach is that after step 4 the user will think the increase has been applied 
successfully on the NM, but in fact it will not stick, because the NM is told 
to decrease the container right afterwards. 
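To make the concern concrete, here is a toy model of Example 1 as I read the 
patch. It is only a sketch: RmContainer, grantIncrease, decrease and reconcile 
are hypothetical names made up for illustration, not classes or methods from 
the patch or from YARN, and the memory sizes are arbitrary.

```java
// Toy model of Example 1. All names here are hypothetical, not YARN classes.
public class IncreaseThenDecrease {
    /** Hypothetical RM-side record of what the RM believes is allocated. */
    static final class RmContainer {
        int allocatedMb;
        boolean inExpirer; // true while an increase token is outstanding
        RmContainer(int mb) { allocatedMb = mb; }
    }

    /** Step 2: RM grants the increase and starts tracking the token. */
    static void grantIncrease(RmContainer c, int targetMb) {
        c.allocatedMb = targetMb;
        c.inExpirer = true;
    }

    /** Step 3 (assumed patch behavior): the decrease is applied immediately. */
    static void decrease(RmContainer c, int targetMb) {
        c.allocatedMb = targetMb;
        c.inExpirer = false; // container is dropped from the expirer
    }

    /** Step 5: size the RM tells the NM to shrink to, or -1 if no action. */
    static int reconcile(RmContainer c, int nmReportedMb) {
        return c.allocatedMb < nmReportedMb ? c.allocatedMb : -1;
    }

    public static void main(String[] args) {
        RmContainer c = new RmContainer(2048);
        grantIncrease(c, 4096);   // steps 1-2
        decrease(c, 1024);        // step 3
        int nmReportedMb = 4096;  // step 4: AM used the increase token on the NM
        int shrinkTo = reconcile(c, nmReportedMb); // step 5
        System.out.println("NM told to shrink to " + shrinkTo + " MB");
    }
}
```

In this model the AM sees a successful increase to 4096 MB at step 4, and the 
very next status report causes the container to be shrunk to 1024 MB.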

Also, what will happen in the following sequence of events?
Example 2:
1. AM sends container increase request to RM
2. RM allocates the resource and gives out increase token (token1) to AM
3. AM sends a new container increase request for the same container to RM with 
more resource
4. RM allocates the resource and gives out increase token (token2) to AM
5. AM uses token1 (the one with smaller size) to increase resource on NM, but 
not token2

IIUC, when the RM receives the increase report from the NM, it will find that 
the RM container size is larger than the reported NM container size, and do 
nothing about it. Later on, when token2 expires, the entire container will be 
killed according to the current implementation. I think this behavior could be 
quite confusing to the user.

IMHO, at least for the case in example 2, we should delay processing of the 
second increase request until the first increase action is completed.
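One possible shape for this delay, again only as a sketch: the RM keeps at most 
one outstanding increase per container and queues later increase requests until 
the NM confirms the first one. ContainerState, requestIncrease and 
onNmConfirmed are hypothetical names for illustration and do not come from the 
patch or from YARN.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of deferring a second increase. All names here are hypothetical.
public class DeferredIncrease {
    static final class ContainerState {
        int confirmedMb;          // size last confirmed by the NM
        Integer pendingMb = null; // target of the outstanding increase, if any
        final Deque<Integer> deferred = new ArrayDeque<>();
        ContainerState(int mb) { confirmedMb = mb; }
    }

    /** Returns true if a token is issued now, false if the request is queued. */
    static boolean requestIncrease(ContainerState c, int targetMb) {
        if (c.pendingMb != null) {
            c.deferred.addLast(targetMb); // an increase is already in flight
            return false;
        }
        c.pendingMb = targetMb;
        return true;
    }

    /** NM confirmed the pending increase; returns the next target issued, or -1. */
    static int onNmConfirmed(ContainerState c) {
        if (c.pendingMb == null) return -1;
        c.confirmedMb = c.pendingMb;
        c.pendingMb = null;
        Integer next = c.deferred.pollFirst();
        if (next == null) return -1;
        c.pendingMb = next; // only now is the second increase processed
        return next;
    }

    public static void main(String[] args) {
        ContainerState c = new ContainerState(2048);
        System.out.println(requestIncrease(c, 4096)); // first request: token issued
        System.out.println(requestIncrease(c, 8192)); // second request: queued
        System.out.println(onNmConfirmed(c));         // first done, second issued
    }
}
```

With this ordering there is never more than one live increase token per 
container, so the situation in Example 2 where token1 is used and token2 
silently expires cannot arise.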

> CapacityScheduler side changes to support increase/decrease container 
> resource.
> -------------------------------------------------------------------------------
>
>                 Key: YARN-1651
>                 URL: https://issues.apache.org/jira/browse/YARN-1651
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager, scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-1651-1.YARN-1197.patch, 
> YARN-1651-WIP.YARN-1197.patch
>
>




