[ 
https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289286#comment-14289286
 ] 

Jason Lowe commented on YARN-914:
---------------------------------

bq. The first step I was thinking to keep NM running in a low resource mode 
after graceful decommissioned

I think it could be useful to leave the NM process up after the graceful 
decommission completes.  That allows automated decommissioning tools to know 
the process completed by querying the NM directly.  If the NM exits then the 
tool may have difficulty distinguishing between the NM crashing just before 
decommissioning completed vs. successful completion.  The RM will be tracking 
this state as well, so it may not be critical to do it one way or the other if 
the tool is querying the RM rather than the NM directly.
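
To make the ambiguity concrete, here is a rough sketch of how a decommissioning tool might combine the two answers.  Everything in it is hypothetical (the DecommissionCheck class, the NodeStatus and Verdict enums, and the idea that the NM reports its own status); it is not an existing YARN API, just an illustration of why a reachable NM makes the outcome easier to read.

```java
import java.util.Optional;

// Hypothetical helper for an external decommissioning tool; not part of YARN.
final class DecommissionCheck {
    // Assumed node states; DECOMMISSIONING is the state proposed in this discussion.
    enum NodeStatus { RUNNING, DECOMMISSIONING, DECOMMISSIONED }

    enum Verdict { COMPLETED, IN_PROGRESS, NOT_STARTED, POSSIBLE_CRASH }

    // nmStatus is empty when the NM process could not be reached.
    // rmStatus is what the RM currently reports for the same node.
    static Verdict classify(Optional<NodeStatus> nmStatus, NodeStatus rmStatus) {
        if (nmStatus.isPresent()) {
            // The NM is still up, so its own answer is unambiguous.
            return fromStatus(nmStatus.get());
        }
        // The NM is gone: only the RM can tell completion apart from a crash.
        return rmStatus == NodeStatus.DECOMMISSIONED
                ? Verdict.COMPLETED
                : Verdict.POSSIBLE_CRASH;
    }

    private static Verdict fromStatus(NodeStatus status) {
        switch (status) {
            case DECOMMISSIONED:
                return Verdict.COMPLETED;
            case DECOMMISSIONING:
                return Verdict.IN_PROGRESS;
            default:
                return Verdict.NOT_STARTED;
        }
    }
}
```

If the NM stays up, the tool never has to take the second branch, which is the main argument for leaving the process running.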

bq. However, I am not sure if they can handle state migration to new node ahead 
of predictable node lost here, or be stateless more or less make more sense 
here?

I agree with Ming that it would be nice if the graceful decommission process 
could give the AMs a "heads up" about what's going on.  The simplest way to 
accomplish that is to leverage the already existing preemption framework to 
tell the AM that YARN is about to take the resources away.  The 
StrictPreemptionContract portion of the PreemptionMessage can be used to list 
exact resources that YARN will be reclaiming and give the AM a chance to react 
to that before the containers are reclaimed.  It's then up to the AM if it 
wants to do anything special or just let the containers get killed after a 
timeout.
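
As a rough sketch of the AM side, assuming the AM keeps a map from container ID to the task running in it: the container IDs would come from the strict contract (in the real API, something like AllocateResponse.getPreemptionMessage().getStrictContract().getContainers()), but plain strings stand in for the YARN types here so the example is self-contained.  The PreemptionHeadsUp class and tasksToCheckpoint method are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical AM-side helper; strings stand in for ContainerId and task handles.
final class PreemptionHeadsUp {
    // Returns the tasks whose containers are listed in the strict contract,
    // ordered by container ID so the result is deterministic.
    static List<String> tasksToCheckpoint(Set<String> strictContractContainerIds,
                                          Map<String, String> taskByContainerId) {
        List<String> tasks = new ArrayList<>();
        for (String containerId : new TreeSet<>(strictContractContainerIds)) {
            String task = taskByContainerId.get(containerId);
            if (task != null) {
                // This AM owns the container, so it gets a chance to react.
                tasks.add(task);
            }
            // Unknown IDs are ignored; the container may already have finished.
        }
        return tasks;
    }
}
```

An AM that does not care can skip this entirely and let the containers be killed at the timeout.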

bq. These notification may still be necessary, so AM won't add these nodes into 
blacklist if container get killed afterwards. Thoughts?

I thought we could leverage the updated nodes list of the AllocateResponse to 
let AMs know when nodes enter the decommissioning state, or at least when 
decommissioning completes (and the containers are killed).  That said, if the 
AM does add the node to its blacklist, that's not such a bad thing either, 
since the RM should never allocate new containers on a decommissioning node 
anyway.
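
A minimal sketch of what the AM could do with those updates, assuming a DECOMMISSIONING node state is reported alongside the existing ones (that state is the proposal under discussion, not something to rely on today).  The DecommissionAwareTracker class and its NodeState enum are stand-ins invented for this example; a real AM would feed it from AllocateResponse.getUpdatedNodes().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical AM-side bookkeeping; host names stand in for NodeId.
final class DecommissionAwareTracker {
    // Stand-in for the node state; DECOMMISSIONING is assumed, per this discussion.
    enum NodeState { RUNNING, DECOMMISSIONING, DECOMMISSIONED }

    private final Map<String, NodeState> lastKnownState = new HashMap<>();

    // Called once per entry in the updated nodes list of an allocate response.
    void onUpdatedNode(String host, NodeState state) {
        lastKnownState.put(host, state);
    }

    // A container lost on a node that is being (or has been) decommissioned
    // should not count toward blacklisting that node.
    boolean countsAsNodeFailure(String host) {
        NodeState state = lastKnownState.getOrDefault(host, NodeState.RUNNING);
        return state == NodeState.RUNNING;
    }
}
```

Nodes the AM has never heard an update for are treated as running, so ordinary failures are still counted as before.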


> Support graceful decommission of nodemanager
> --------------------------------------------
>
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.), 
> it's desirable to minimize the impact to running applications.
> Currently if an NM is decommissioned, all running containers on the NM need to 
> be rescheduled on other NMs. Furthermore, for finished map tasks, if their 
> map outputs have not been fetched by the reducers of the job, these map tasks 
> will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a 
> node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
