HDFS Design Documentation is outdated
-------------------------------------
Key: HDFS-1612
URL: https://issues.apache.org/jira/browse/HDFS-1612
Project: Hadoop HDFS
Issue Type: Bug
Components: documentation
Affects Versions: 0.21.0, 0.20.2
Environment:
    http://hadoop.apache.org/hdfs/docs/current/hdfs_design.html#The+Persistence+of+File+System+Metadata
    http://hadoop.apache.org/common/docs/r0.20.2/hdfs_design.html#The+Persistence+of+File+System+Metadata
Reporter: Joe Crobak
Priority: Minor

I was trying to discover details about the Secondary NameNode, and came across the description below in the HDFS design doc.

{quote}
The NameNode keeps an image of the entire file system namespace and file Blockmap in memory. This key metadata item is designed to be compact, such that a NameNode with 4 GB of RAM is plenty to support a huge number of files and directories. When the NameNode starts up, it reads the FsImage and EditLog from disk, applies all the transactions from the EditLog to the in-memory representation of the FsImage, and flushes out this new version into a new FsImage on disk. It can then truncate the old EditLog because its transactions have been applied to the persistent FsImage. This process is called a checkpoint. *In the current implementation, a checkpoint only occurs when the NameNode starts up. Work is in progress to support periodic checkpointing in the near future.*
{quote}

(emphasis mine).

Note that this directly conflicts with information in the HDFS user guide:

    http://hadoop.apache.org/common/docs/r0.20.2/hdfs_user_guide.html#Secondary+NameNode
    http://hadoop.apache.org/hdfs/docs/current/hdfs_user_guide.html#Checkpoint+Node

I haven't done a thorough audit of that doc -- I only noticed the above inaccuracy.
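
For readers unfamiliar with the term, the checkpoint procedure described in the quoted passage (load the FsImage, replay the EditLog onto it, write a new FsImage, truncate the EditLog) can be illustrated with a toy sketch. This is not HDFS code: plain Python dicts and lists stand in for the on-disk files, and the names `Namespace`, `checkpoint`, and the transaction tuples (`"create"`, `"add_block"`, `"delete"`) are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Toy in-memory namespace: maps a file path to its list of block IDs."""
    files: dict = field(default_factory=dict)

    def apply(self, txn):
        """Apply one logged transaction (a hypothetical tuple format)."""
        op, path, *args = txn
        if op == "create":
            self.files[path] = []
        elif op == "add_block":
            self.files[path].append(args[0])
        elif op == "delete":
            self.files.pop(path, None)
        else:
            raise ValueError(f"unknown op: {op}")


def checkpoint(fsimage, edit_log):
    """Replay edit_log onto a copy of fsimage.

    Returns (new_fsimage, truncated_edit_log). The inputs are left
    unmodified, standing in for the old files on disk.
    """
    # Load the persisted image into memory.
    ns = Namespace({path: list(blocks) for path, blocks in fsimage.items()})
    # Apply every logged transaction, in order.
    for txn in edit_log:
        ns.apply(txn)
    # "Flush" the merged state as the new image; the log can now be emptied
    # because all of its transactions are reflected in the new image.
    new_fsimage = {path: list(blocks) for path, blocks in ns.files.items()}
    return new_fsimage, []


fsimage = {"/a": [1]}
edit_log = [("create", "/b"), ("add_block", "/b", 2), ("delete", "/a")]
new_image, new_log = checkpoint(fsimage, edit_log)
print(new_image)  # {'/b': [2]}
print(new_log)    # []
```

The sketch only covers the merge step itself; when and where that merge runs (at NameNode startup only, or periodically) is the point on which the design doc and the user guide disagree.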