On Thu, Feb 23, 2012 at 12:41 AM, Jeremy Hansen <jer...@skidrow.la> wrote:
> Thanks. Could you clarify what BackupNode does?
>
> -jeremy

Namenode currently keeps the entire file system namespace in memory. It logs write operations (create, delete file, etc.) to a journal file called the editlog. This journal needs to be merged with the file system image periodically to keep the journal file from growing too large. This is called checkpointing. Checkpointing also reduces startup time, since the Namenode need not load a large editlog file.

Prior to release 0.21, another node called SecondaryNamenode was used for checkpointing. It periodically gets the file system image and edits, loads them into memory, and writes a checkpoint image. This image is then shipped to the Namenode.

In 0.21, BackupNode was introduced. Unlike SecondaryNamenode, it gets edits streamed from the Namenode. It periodically writes the checkpoint image and ships it back to the Namenode. The goal was for this to become the Standby node, towards Namenode HA. Konstantin and a few others are pursuing this.

I have not seen any deployments of BackupNode in production. I would love to hear if anyone has deployed it in production and how stable it is.

Regards,
Suresh
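
As a footnote to the checkpointing description above, here is a minimal toy sketch of the idea: replay the journaled operations onto a copy of the image, then start a fresh journal. It is not Hadoop code. The names `apply_edit`, `checkpoint`, `fsimage`, and `editlog` are hypothetical, and the namespace is assumed to be just a set of paths.

```python
# Toy model of checkpointing. All names here are hypothetical, not Hadoop APIs.

def apply_edit(image, edit):
    """Apply one journaled write operation to a namespace image (a set of paths)."""
    op, path = edit
    if op == "create":
        image.add(path)
    elif op == "delete":
        image.discard(path)
    else:
        raise ValueError(f"unknown operation: {op}")


def checkpoint(image, editlog):
    """Replay the edit log onto a copy of the image; return (new_image, empty_log)."""
    new_image = set(image)  # leave the previous image untouched
    for edit in editlog:
        apply_edit(new_image, edit)
    return new_image, []  # the journal starts over after a checkpoint


fsimage = {"/a", "/b"}
editlog = [("create", "/c"), ("delete", "/a"), ("create", "/d")]
new_image, new_log = checkpoint(fsimage, editlog)
print(sorted(new_image))  # ['/b', '/c', '/d']
```

In this toy model, a restart after the checkpoint only has to load `new_image` and an empty journal, which is the startup-time benefit mentioned above.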