[ https://issues.apache.org/jira/browse/HDFS-315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer resolved HDFS-315.
-----------------------------------
    Resolution: Fixed

I'm going to resolve this since FSImage and buddies have been versioned for a very long time now. :D

> Allow simplified versioning for namenode and datanode metadata.
> ---------------------------------------------------------------
>
>                 Key: HDFS-315
>                 URL: https://issues.apache.org/jira/browse/HDFS-315
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>         Environment: All
>            Reporter: Milind Bhandarkar
>            Assignee: Sameer Paranjpye
>         Attachments: hadoop-224.patch
>
>
> Currently namenode has two types of metadata: the FSImage and FSEdits.
> FSImage contains information about Inodes, and FSEdits contains a list of
> operations that were not saved to FSImage. Datanode currently does not have
> any metadata, but will have it some day.
> The file formats used for storing these metadata will evolve over time. It is
> important for the file-system to be backward compatible. That is, the
> metadata readers need to be able to identify which version of the file-format
> we are using, and need to be able to read the information therein. As we add
> information to these metadata, the complexity of the reader increases
> dramatically.
> I propose a versioning scheme with a major and minor version number, where a
> different reader class is associated with each major number, and that class
> interprets the minor number internally. The readers essentially form a chain
> starting with the latest version. Each version-reader looks at the file and,
> if it does not recognize the version number, passes it to the version reader
> next to it by calling the parse method, returning the results of the parse
> method up the chain (in the case of the namenode, the parse result is an
> array of Inodes).
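
For readers of the archive, the reader chain proposed above might look roughly like the sketch below. This is only an illustration under stated assumptions, not the code in hadoop-224.patch or anything that exists in Hadoop: the class names (VersionedReader, ReaderV1, ReaderV2, MetadataReaders), the on-disk layouts, and the use of a list of path strings in place of the array of Inodes are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: one reader per major version, each knowing only its predecessor.
abstract class VersionedReader {
    private final int major;
    private final VersionedReader previous; // null for the oldest reader in the chain

    VersionedReader(int major, VersionedReader previous) {
        this.major = major;
        this.previous = previous;
    }

    // Parse here if this reader owns the major version, otherwise hand off to the
    // previous reader and return its result up the chain.
    final List<String> parse(int major, int minor, DataInputStream in) throws IOException {
        if (major == this.major) {
            return read(minor, in);
        }
        if (previous == null) {
            throw new IOException("Unsupported metadata version " + major + "." + minor);
        }
        return previous.parse(major, minor, in);
    }

    // Each reader interprets the minor version of its own major version internally.
    protected abstract List<String> read(int minor, DataInputStream in) throws IOException;
}

class ReaderV1 extends VersionedReader {
    ReaderV1() { super(1, null); }

    @Override
    protected List<String> read(int minor, DataInputStream in) throws IOException {
        int count = in.readInt();
        List<String> inodes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            inodes.add(in.readUTF()); // assumed 1.x layout: path only
        }
        return inodes;
    }
}

class ReaderV2 extends VersionedReader {
    ReaderV2() { super(2, new ReaderV1()); } // knows only the immediately previous reader

    @Override
    protected List<String> read(int minor, DataInputStream in) throws IOException {
        int count = in.readInt();
        List<String> inodes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String path = in.readUTF();
            in.readShort();                // assumed 2.x field: replication (skipped here)
            if (minor >= 1) in.readLong(); // assumed field added in 2.1: modification time
            inodes.add(path);
        }
        return inodes;
    }
}

class MetadataReaders {
    // Reads the version header, then enters the chain at the latest reader.
    static List<String> load(byte[] file) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        int major = in.readInt();
        int minor = in.readInt();
        return new ReaderV2().parse(major, minor, in);
    }

    // Fixture that builds a file in any of the assumed layouts, for trying the chain out.
    static byte[] sample(int major, int minor, String... paths) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(major);
        out.writeInt(minor);
        out.writeInt(paths.length);
        for (String path : paths) {
            out.writeUTF(path);
            if (major >= 2) out.writeShort(3);
            if (major >= 2 && minor >= 1) out.writeLong(0L);
        }
        out.flush();
        return bytes.toByteArray();
    }
}
```

Calling MetadataReaders.load(MetadataReaders.sample(1, 0, "/a", "/b")) enters the chain at ReaderV2, which does not recognize major version 1 and delegates to ReaderV1. Adding a hypothetical ReaderV3 would only require it to name ReaderV2 as its predecessor.
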
> This scheme has the advantage that every time a new major version is added,
> the new reader only needs to know about the reader for its immediately
> previous version, and every reader needs to know only which major version
> numbers it can read.
> The writer is not versioned in this way, because metadata is always written
> in the most current version format.
> One more change that is needed for simplified versioning is that the
> "struct-surping" of dfs.Block needs to be removed. Block's contents will
> change in later versions, and older versions should still be able to
> readFields properly. This is more general than Block of course, and in
> general only basic datatypes should be used as Writables in DFS metadata.
> For edits, the reader should return an array of <opcode, ArrayWritable>
> pairs. This will also remove the limitation of two operands for every
> opcode, and will be more extensible.
> Even with this new versioning scheme, the last Reader in the reader-chain
> would recognize the current format, thus maintaining full backward
> compatibility.
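
As an illustration of the <opcode, ArrayWritable> idea for edits, here is a minimal sketch in plain Java. It is an assumption-laden stand-in rather than Hadoop code: EditRecord and EditsCodec are hypothetical names, a list of strings takes the place of ArrayWritable so the example compiles without Hadoop on the classpath, and the record layout (opcode byte, operand count, operands) is invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pair of an opcode and a variable-length operand list.
final class EditRecord {
    final byte opcode;
    final List<String> operands; // stand-in for ArrayWritable

    EditRecord(byte opcode, List<String> operands) {
        this.opcode = opcode;
        this.operands = new ArrayList<>(operands);
    }
}

final class EditsCodec {
    // Assumed layout: record count, then per record: opcode, operand count, operands.
    static byte[] write(List<EditRecord> records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(records.size());
        for (EditRecord record : records) {
            out.writeByte(record.opcode);
            out.writeInt(record.operands.size());
            for (String operand : record.operands) {
                out.writeUTF(operand);
            }
        }
        out.flush();
        return bytes.toByteArray();
    }

    // Returns every edit as an <opcode, operands> pair; the operand count comes
    // from the data, so no opcode is limited to two operands.
    static List<EditRecord> read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int count = in.readInt();
        List<EditRecord> records = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            byte opcode = in.readByte();
            int operandCount = in.readInt();
            List<String> operands = new ArrayList<>();
            for (int j = 0; j < operandCount; j++) {
                operands.add(in.readUTF());
            }
            records.add(new EditRecord(opcode, operands));
        }
        return records;
    }
}
```

Because each record carries its own operand count, a later opcode that needs three or more operands would not require a change to the reader loop in this sketch.
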