The situation is the same regardless of master/slave/client.

Basically, you have to:
1. Bring up a host with the same FQDN.
2. Install ambari-agent on it.
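
The FQDN requirement can be checked on the replacement host before the agent is installed. This is only a minimal Python sketch, assuming the new host should report exactly the name Ambari already has registered; the function names and the example hostname are hypothetical.

```python
import socket


def normalize(name):
    """Lower-case a hostname and drop surrounding whitespace and a trailing dot."""
    return name.strip().rstrip(".").lower()


def fqdn_matches(expected, actual=None):
    """Return True if this host's FQDN equals the one Ambari has registered.

    `actual` defaults to socket.getfqdn(); pass it explicitly when testing.
    """
    if actual is None:
        actual = socket.getfqdn()
    return normalize(actual) == normalize(expected)


if __name__ == "__main__":
    # Hypothetical FQDN of the lost host; replace with the registered name.
    expected = "nn1.example.com"
    print("FQDN matches" if fqdn_matches(expected) else "FQDN differs")
```

If the check reports a difference, fixing DNS or the host's name first is likely easier than sorting it out after the agent has registered.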

At this point, any components that used to be on that host will report 
heartbeat lost, and the cluster may not be fully operational if the host 
contained masters (especially NameNode).
You may then have to restart services on that host, which will end up 
installing the bits again and regenerating configs.
The hard part is that you may have to run additional commands depending on the 
type of master; think of NameNode, or hosts that contain databases for 
Hive, Oozie, etc.
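
To see which components Ambari still maps to the rebuilt host, the Ambari REST API can list its host components. The sketch below only builds the URL and parses a response body; the server address, cluster name, host name, and the sample response (including its states) are illustrative assumptions, not output from a real cluster.

```python
import json
from urllib.parse import quote


def host_components_url(server, cluster, host):
    """Build the Ambari REST URL listing the components mapped to a host."""
    return (
        f"{server.rstrip('/')}/api/v1/clusters/{quote(cluster)}"
        f"/hosts/{quote(host)}/host_components?fields=HostRoles/state"
    )


def component_states(response_text):
    """Map component name -> state from a host_components response body."""
    items = json.loads(response_text).get("items", [])
    return {
        item["HostRoles"]["component_name"]: item["HostRoles"].get("state")
        for item in items
    }


# Illustrative response only; the two components are the ones named in this thread.
SAMPLE = json.dumps({
    "items": [
        {"HostRoles": {"component_name": "NAMENODE", "state": "UNKNOWN"}},
        {"HostRoles": {"component_name": "SPARK_JOBHISTORYSERVER", "state": "UNKNOWN"}},
    ]
})

if __name__ == "__main__":
    # Hypothetical server, cluster, and host names.
    print(host_components_url("http://ambari.example.com:8080", "mycluster", "nn1.example.com"))
    for name, state in component_states(SAMPLE).items():
        print(name, state)
```

A real request would also need Ambari credentials; the sketch deliberately stops short of sending anything.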

Attempting to move masters may be complicated because it may require the 
original host to be heartbeating and to have the bits installed so that 
Ambari can stop the services it knows about.

Thanks,
Alejandro

From: cs user <acldstk...@gmail.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Thursday, March 3, 2016 at 5:00 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Recovering from a dead master namenode server

Hi All,

I'm trying to understand how to recover from certain failures within Ambari. 
When launching within a cloud environment, it's possible that a host may be 
completely deleted, and you won't have the chance to decommission the node.

For example, in the event that the server hosting the master hdfs namenode was 
lost, would it be possible to spin up another server in its place, built 
completely from scratch and have this replace the old namenode master?

Currently when I attempt to delete a failed host, it warns me that the 
following components need to be moved:

NameNode, Spark History Server

It also then tries to talk me through the process of copying data from the old 
namenode to the new namenode. If the server has been deleted, this would not be 
possible. Would it be possible to copy this data from the secondary namenode 
instead?

Many thanks in advance.

Cheers!

