Re: NameNode failure and recovery!

2013-04-03 Thread Rahul Bhattacharjee
Or are both options used together, i.e. NFS + SNN? On Wed, Apr 3, 2013 at 8:10 PM, Rahul Bhattacharjee rahul.rec@gmail.com wrote: Hi all, I was reading about Hadoop and learned that there are two ways to protect against NameNode failures. 1) To write to an NFS mount along

RE: NameNode failure and recovery!

2013-04-03 Thread Vijay Thakorlal
Hi Rahul, The SNN does not act as a backup / standby NameNode in the event of failure. The sole purpose of the Secondary NameNode (more accurately described as a checkpointing node) is to checkpoint the current state of HDFS: The SNN retrieves the

Re: NameNode failure and recovery!

2013-04-03 Thread Mohammad Tariq
Hello Rahul, It's always better to have both 1 and 2 together. One common misconception is that the SNN is a backup of the NN, which is wrong. The SNN is a helper node to the NN. If the NN fails, the SNN is not going to take over the NN's role. Yes, we can't guarantee that the SNN fsimage replica will

Re: NameNode failure and recovery!

2013-04-03 Thread Mohammad Tariq
@Vijay: We seem to be in 100% sync though :) Warm Regards, Tariq https://mtariq.jux.com/ cloudfront.blogspot.com

Re: NameNode failure and recovery!

2013-04-03 Thread Rahul Bhattacharjee
Thanks to all of you for precise and complete responses. So in case of failure we have to bring another backup system up with the fsimage and edit logs from the NFS filer. SNN stays as is for the new NN. Thanks, Rahul On Wed, Apr 3, 2013 at 8:38 PM, Azuryy Yu azury...@gmail.com wrote: for

Re: NameNode failure and recovery!

2013-04-03 Thread Harsh J
There is a 3rd, most excellent way: Use HDFS's own HA, see http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html :)

Re: NameNode failure and recovery!

2013-04-03 Thread shashwat shriparv
If you are not in a position to go for HA, just keep your checkpoint period shorter so that recent data is recoverable from the SNN. You also always have the option of running hadoop namenode -recover; try this on a test cluster and get familiar with it. Also take a backup of the image on some solid state storage. ∞ Shashwat

Re: NameNode failure and recovery!

2013-04-03 Thread Rahul Bhattacharjee
That's also doable, but even a reduced checkpoint period would still lose some amount of the edit log, and how short the checkpoint interval should be has to be evaluated. I think the good way to go, in case HA is not doable, is SNN plus secondary storage on NFS. Thanks, Rahul On Thu, Apr 4, 2013 at
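
The trade-off discussed in this thread can be illustrated with a small toy model. This is not Hadoop code: the function name, the edit rate, and the failure time are hypothetical, chosen only for illustration. It assumes a checkpoint completes instantly at every multiple of the checkpoint period and counts how many edits made since the last checkpoint would be missing if the NameNode were rebuilt from the SNN's checkpoint alone. A second copy of the edit log on NFS, if written alongside the local one, would not have this gap.

```python
def edits_lost_from_checkpoint(edit_times, checkpoint_period, failure_time):
    """Count edits made after the last checkpoint and up to the failure.

    Toy assumption: a checkpoint completes instantly at every multiple of
    checkpoint_period (all times are in seconds).
    """
    last_checkpoint = (failure_time // checkpoint_period) * checkpoint_period
    return sum(1 for t in edit_times if last_checkpoint < t <= failure_time)


# Hypothetical workload: one edit every 10 seconds, NameNode lost at t = 5000 s.
edit_times = list(range(5, 6000, 10))
failure_time = 5000

for period in (3600, 600, 60):
    lost = edits_lost_from_checkpoint(edit_times, period, failure_time)
    print(f"checkpoint every {period:>4} s: {lost} edits missing from the SNN image")
```

In this model a shorter period shrinks the loss (140, 20, and 2 edits for the three periods above) but does not remove it, which is why the thread settles on combining the SNN with an NFS copy when HA is not an option.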