Hi Jan,

Don't confuse this with the backupnode/checkpoint nodes.
The new HA architecture is mainly targeted at building HA around two Namenode states:

1) Active Namenode
2) Standby Namenode

When you start the NNs, both start in standby mode by default. You can then switch one NN to the active state either by issuing HA admin commands or by configuring the ZKFC (automatic failover) process (not officially released yet). Each NN then starts the services required for its state. This is almost like a new implementation of the StandbyNode checkpointing process. The active NN writes edits to its local dirs and to the shared NN dirs. The standby node keeps tailing the edits from the shared NN dirs.

Coming to the shared storage part, there are currently 3 options:

1) NFS filers (you may need to buy external devices).
2) BookKeeper (a subproject of the open source ZooKeeper project). It was mainly inspired by the NN. It is a high-performance write-ahead logging system, and it can also scale dynamically to more nodes depending on usage. The integration with BookKeeper is already available, and we are running some clusters with it. See HDFS-3399.
3) A quorum-based approach, which is under development. It mainly aims to build the shared storage nodes inside HDFS itself, so that it can use the proven RPC protocols for unified security mechanisms and the proven edits storage layers. See HDFS-3077.

I hope this gives a better idea of the current state of HA in the community.

Regards,
Uma

________________________________________
From: Jan Van Besien [ja...@ngdata.com]
Sent: Thursday, August 16, 2012 1:41 PM
To: user@hadoop.apache.org
Subject: checkpointnode backupnode hdfs HA

I am a bit confused about the different options for namenode high availability (or something along those lines) in CDH4 (hadoop-2.0.0). I understand that the secondary namenode is deprecated, and that there are two options to replace it: checkpoint or backup namenodes.
Both are well explained in the documentation, but the confusion begins when reading about "HDFS High Availability", for example here:
http://hadoop.apache.org/common/docs/r2.0.0-alpha/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html

Is the topic "HDFS High Availability" as described there (using shared storage) related to checkpoint/backup nodes? If so, in what way?

If I read about backup nodes, they also seem to be aimed at high availability. From what I understood, the current implementation doesn't provide (warm) fail-over yet, but this is planned. So starting to replace secondary namenodes now with backup namenodes sounds like a future-proof idea?

thanks,
Jan
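
As a rough illustration of the active/standby scheme described in the reply, here is a toy Python model. It is only a sketch: the class and method names (SharedEditsDir, NameNode, tail_edits, mkdir) are made up for this example and are not Hadoop APIs, and the "shared dir" is just an in-memory list standing in for NFS, BookKeeper, or a quorum of storage nodes.

```python
class SharedEditsDir:
    """Hypothetical stand-in for the shared edits storage."""

    def __init__(self):
        self.edits = []  # ordered log of namespace operations


class NameNode:
    """Toy namenode: starts in standby, can be switched to active."""

    def __init__(self, name, shared):
        self.name = name
        self.shared = shared
        self.state = "standby"   # both NNs start in standby by default
        self.local_edits = []    # edits this NN wrote while active
        self.namespace = set()   # in-memory view of the file system
        self.applied = 0         # how many shared edits have been applied

    def tail_edits(self):
        """Apply any shared edits this node has not seen yet."""
        for path in self.shared.edits[self.applied:]:
            self.namespace.add(path)
        self.applied = len(self.shared.edits)

    def transition_to_active(self):
        self.tail_edits()        # catch up before taking over
        self.state = "active"

    def transition_to_standby(self):
        self.state = "standby"

    def mkdir(self, path):
        """Only the active NN accepts writes; it logs locally and to shared."""
        if self.state != "active":
            raise RuntimeError(f"{self.name} is in standby state")
        self.local_edits.append(path)
        self.shared.edits.append(path)
        self.namespace.add(path)
        self.applied = len(self.shared.edits)


shared = SharedEditsDir()
nn1 = NameNode("nn1", shared)
nn2 = NameNode("nn2", shared)

nn1.transition_to_active()   # manual switch, like an HA admin command
nn1.mkdir("/a")
nn2.tail_edits()             # standby follows the shared edits

# Manual failover: nn1 steps down, nn2 takes over.
nn1.transition_to_standby()
nn2.transition_to_active()
nn2.mkdir("/b")
nn1.tail_edits()
```

After the failover both nodes hold the same namespace, because the new standby picks up the edits the new active wrote to the shared log. The real system additionally has to deal with fencing and durability of the shared log, which this sketch leaves out.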