ResourceManager crash on deleted NM node back from the dead

2014-03-03 Thread John Lilley
We had a DN/NM node that went offline for a while and was removed from the cluster via Ambari without decommissioning (because it was offline). When the node came back up, its NM attempted to connect to the RM. Later the RM failed with this exception (dokken is the errant node): 2014-03-03

Re: ResourceManager crash on deleted NM node back from the dead

2014-03-03 Thread Jian He
Hi,
I believe this was recently fixed in https://issues.apache.org/jira/browse/YARN-713 and will be part of the 2.4.0 release.
Jian

On Mon, Mar 3, 2014 at 5:19 AM, John Lilley john.lil...@redpoint.net wrote:
We had a DN/NM node that went offline for a while and been removed from the cluster via