[ https://issues.apache.org/jira/browse/HDFS-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12919079#action_12919079 ]
Doug Cutting commented on HDFS-1435:
------------------------------------

Hairong, Avro's file format has little overhead. It supports compression. However, it assumes that a file is composed of a sequence of entries that all share the same schema.

The fsimage has various sections. The header information could be added as Avro file metadata. The files and directories, datanodes, and files under construction are currently written as separate blocks. Instead, the schema for every item might be something like a union of [File, Directory, Symlink, DataNode, FileUnderConstruction].

> Provide an option to store fsimage compressed
> ---------------------------------------------
>
>                 Key: HDFS-1435
>                 URL: https://issues.apache.org/jira/browse/HDFS-1435
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.22.0
>
>
> Our HDFS has fsimage as big as 20G bytes. It consumes a lot of network
> bandwidth when secondary NN uploads a new fsimage to primary NN.
> If we could store fsimage compressed, the problem could be greatly alleviated.
> I plan to provide a new configuration hdfs.image.compressed with a default
> value of false. If it is set to be true, fsimage is stored as compressed.
> The fsimage will have a new layout with a new field "compressed" in its
> header, indicating if the namespace is stored compressed or not.
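For illustration only, the union suggested in the comment above could be written out as an Avro schema roughly as follows. This is a minimal sketch, not anything proposed in the issue: the five record names come from the comment, but every field name, the `record` helper, and the metadata keys are hypothetical placeholders. It uses only Python's standard `json` module to build and print the schema, and does not call any Avro library.

```python
import json


def record(name, fields):
    """Build an Avro record schema from (field_name, avro_type) pairs."""
    return {
        "type": "record",
        "name": name,
        "fields": [{"name": f, "type": t} for f, t in fields],
    }


# An Avro union is a JSON array of schemas. Each entry in the data file
# would then be exactly one of these branches.
# All field names below are hypothetical; the real fsimage layout differs.
ITEM_SCHEMA = [
    record("File", [
        ("path", "string"),
        ("replication", "int"),
        ("blockIds", {"type": "array", "items": "long"}),
    ]),
    record("Directory", [("path", "string")]),
    record("Symlink", [("path", "string"), ("target", "string")]),
    record("DataNode", [("storageId", "string")]),
    record("FileUnderConstruction", [
        ("path", "string"),
        ("clientName", "string"),
    ]),
]

# Header information could travel as file-level metadata, which Avro
# container files store as a map from string keys to byte values.
# These keys and values are placeholders, not real fsimage fields.
FILE_METADATA = {
    "fsimage.layoutVersion": b"<layout version>",
    "fsimage.namespaceId": b"<namespace id>",
}


def branch_names(union):
    """Return the record names that make up a union schema."""
    return [branch["name"] for branch in union]


if __name__ == "__main__":
    print(json.dumps(ITEM_SCHEMA, indent=2))
```

With a schema of this shape, a single Avro data file can hold all item kinds in one sequence, since every entry conforms to the same (union) schema, and compression can be left to the container format's codec setting.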