Hi Rahul,

The SNN does not act as a backup or standby NameNode in the event of failure. The sole purpose of the Secondary NameNode (which is more accurately described as a checkpoint node) is to checkpoint the current state of HDFS:

1) The NN rolls its edits file, so that new changes go to a fresh edits file.
2) The SNN retrieves the fsimage and edits files from the NN.
3) The SNN loads the fsimage into memory.
4) The SNN replays the edits log on top of it to merge the two.
5) The SNN transfers the merged checkpoint back to the NN.
6) The NN uses the checkpoint as its new fsimage file.

It’s true that, technically, you could use the fsimage from the SNN if you completely lost the NN, and yes, as you said, you would “lose” any changes to HDFS made between the last checkpoint and the NN dying. But as mentioned, the SNN is not a backup for the NN.

Regards,
Vijay

From: Rahul Bhattacharjee [mailto:rahul.rec....@gmail.com]
Sent: 03 April 2013 15:40
To: user@hadoop.apache.org
Subject: NameNode failure and recovery!

Hi all,

I was reading about Hadoop and learned that there are two ways to protect against NameNode failures:

1) Write to an NFS mount along with the usual local disk.
-or-
2) Use a secondary NameNode. In case of failure of the NN, the SNN can take charge.

My questions:

1) The SNN is always lagging, so when the SNN becomes primary after an NN failure, the edits that have not yet been merged into the image file would be lost, and the SNN's state would not be consistent with the NN's state before its failure.

2) I have also read that the other purpose of the SNN is to periodically merge the edit logs with the image file. If a setup goes with option #1 (writing to NFS, no SNN), then who does this merging?

Thanks,
Rahul
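
As a rough illustration of the checkpoint cycle described in the reply in this thread, here is a toy Python sketch. It is not Hadoop code: `ToyNameNode`, `apply_edits`, and `checkpoint` are hypothetical names, the namespace is modeled as a plain dict, and each edit is assumed to be an `(op, path)` tuple. It only shows why a checkpoint copy held by the SNN lags behind the live NN.

```python
# Toy model of the fsimage/edits checkpoint cycle. Illustrative only;
# none of these names correspond to real Hadoop classes or APIs.

def apply_edits(fsimage, edits):
    """Replay an edit log on top of a namespace snapshot; return the merged image."""
    merged = dict(fsimage)  # work on a copy, leave the input image untouched
    for op, path in edits:
        if op == "create":
            merged[path] = {}
        elif op == "delete":
            merged.pop(path, None)
        else:
            raise ValueError(f"unknown op: {op}")
    return merged


class ToyNameNode:
    def __init__(self):
        self.fsimage = {}  # namespace as of the last checkpoint
        self.edits = []    # changes logged since the last roll

    def log(self, op, path):
        self.edits.append((op, path))

    def roll_edits(self):
        """Start a fresh edits log and hand back the image plus the rolled edits."""
        rolled, self.edits = self.edits, []
        return dict(self.fsimage), rolled

    def install_checkpoint(self, new_image):
        self.fsimage = new_image


def checkpoint(nn):
    """What the SNN does: fetch image and edits, merge them, send the result back."""
    image, edits = nn.roll_edits()
    merged = apply_edits(image, edits)
    nn.install_checkpoint(merged)
    return dict(merged)  # the SNN keeps its own copy of the checkpoint


nn = ToyNameNode()
nn.log("create", "/a")
nn.log("create", "/b")
snn_copy = checkpoint(nn)

# Changes made after the checkpoint exist only in the NN's edits log.
nn.log("delete", "/a")
nn.log("create", "/c")
live = apply_edits(nn.fsimage, nn.edits)

print(sorted(snn_copy))  # what recovery from the SNN's copy would give
print(sorted(live))      # what the NN actually held before failing
```

In this sketch, recovering from `snn_copy` after losing the NN would miss the last two edits, which is the lag Rahul's first question asks about.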