[ 
https://issues.apache.org/jira/browse/YARN-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802878#comment-13802878
 ] 

Jason Lowe commented on YARN-671:
---------------------------------

Note that simply waiting for containers to complete doesn't address auxiliary 
services like MapReduce's ShuffleHandler that want to serve up data after 
containers have completed.  If we simply wait until containers have completed 
and stop the NM immediately afterwards then we can cause fetch failures leading 
to a relaunch of map tasks that ran on that node.  In that case we would have 
been better off killing the map task immediately, as the application would have 
recovered faster.

Therefore, without more intimate knowledge of what an auxiliary service is 
really doing with respect to the containers that ran on a node, we may have to 
generalize the timeout to wait until all applications that have run at least 
one container on the node have completed.
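
To make the distinction concrete, here is a rough sketch of the two conditions. 
It is illustrative only: NodeDrainTracker and all of its methods are 
hypothetical names, not existing YARN or NodeManager APIs, and the string IDs 
stand in for the real application and container ID types.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical sketch, not real YARN code: tracks when a node being moved
 * into maintenance can be stopped without stranding auxiliary-service data.
 */
class NodeDrainTracker {
    // Containers currently running on this node.
    private final Set<String> runningContainers = new HashSet<>();
    // Applications that ran at least one container here and have not finished.
    private final Set<String> unfinishedApps = new HashSet<>();
    // Point in time after which the node is stopped regardless of state.
    private final long deadlineMillis;

    NodeDrainTracker(long drainStartMillis, long timeoutMillis) {
        this.deadlineMillis = drainStartMillis + timeoutMillis;
    }

    void containerStarted(String appId, String containerId) {
        runningContainers.add(containerId);
        unfinishedApps.add(appId);
    }

    void containerCompleted(String containerId) {
        // The owning application stays in unfinishedApps on purpose: an
        // auxiliary service may still need to serve this container's output.
        runningContainers.remove(containerId);
    }

    void applicationCompleted(String appId) {
        unfinishedApps.remove(appId);
    }

    /** The narrower condition: no containers are running. */
    boolean containersDrained() {
        return runningContainers.isEmpty();
    }

    /** The generalized condition: timeout reached, or every application seen here is done. */
    boolean safeToStop(long nowMillis) {
        if (nowMillis >= deadlineMillis) {
            return true;
        }
        return runningContainers.isEmpty() && unfinishedApps.isEmpty();
    }
}
```

With this sketch, a node whose only map container has finished reports 
containersDrained() as true, but safeToStop() stays false until the owning 
application completes or the timeout expires.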



> Add an interface on the RM to move NMs into a maintenance state
> ---------------------------------------------------------------
>
>                 Key: YARN-671
>                 URL: https://issues.apache.org/jira/browse/YARN-671
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 2.0.4-alpha
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>




--
This message was sent by Atlassian JIRA
(v6.1#6144)
