[ https://issues.apache.org/jira/browse/HDFS-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738533#action_12738533 ]
Konstantin Shvachko commented on HDFS-512:
------------------------------------------

Most of the maps Nicholas listed above are {{HashMap}}s. They rely on the {{Block.hash()}} method, which the patch does not modify and which has never used the generation stamp in calculating a block's hash.
I found only 4 collections that use {{TreeSet<Block>}} or a {{TreeMap}} with {{Block}} as the key. Here they are:
# UnderReplicatedBlocks
# BlockManager.excessReplicateMap
# CorruptReplicasMap
# DatanodeDescriptor.invalidateBlocks

None of them needs to know about the generation stamp. I think it is safe to make the change. We should commit it to the append branch.

Additional comments:
- {{getReplicaInfo()}} adds generation stamp checking. I don't think this is necessary.
- The comment {{// ... ignore generation stamp!!!}} is misleading and should be removed.
- {{ReplicaInfo.setGenStamp(), getGenStamp()}} should rather be called {{setGenerationStamp(), getGenerationStamp()}}.
- Why does {{ReplicaInfo}} need a genStamp field? Don't we always have it in {{Block}}? If we do, could you please add a comment clarifying what this field actually is?

> Set block id as the key to Block
> --------------------------------
>
>                 Key: HDFS-512
>                 URL: https://issues.apache.org/jira/browse/HDFS-512
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockKey.patch
>
>
> Currently the key to Block is block id + generation stamp. I would propose to
> change it to be only block id. This is based on the following properties of
> the dfs cluster:
> 1. On each datanode only one replica of block exists. Therefore there is only
> one generation of a block.
> 2. NameNode has only one entry for a block in its blocks map.
> With this change, search for a block/replica's meta information is easier
> since most of the time we know a block's id but may not know its generation
> stamp.
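
The distinction between hashed and sorted collections discussed in this comment can be illustrated with a minimal Java sketch. It assumes a hypothetical {{SketchBlock}} class (a stand-in, not the real HDFS {{Block}}) whose {{hashCode()}}, {{equals()}} and {{compareTo()}} all look at the block id only, as the issue proposes. Under that assumption, a lookup that knows the id but not the generation stamp should succeed in both a {{HashMap}} and a {{TreeSet}}.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical stand-in for a block keyed by id only; NOT the real HDFS Block class.
final class SketchBlock implements Comparable<SketchBlock> {
    final long blockId;
    final long generationStamp; // carried along, but deliberately not part of the key

    SketchBlock(long blockId, long generationStamp) {
        this.blockId = blockId;
        this.generationStamp = generationStamp;
    }

    // Hash-based collections (HashMap, HashSet) consult hashCode() and equals().
    @Override
    public int hashCode() {
        return Long.hashCode(blockId);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SketchBlock && ((SketchBlock) o).blockId == blockId;
    }

    // Sorted collections (TreeSet, TreeMap) consult compareTo() instead.
    @Override
    public int compareTo(SketchBlock other) {
        return Long.compare(blockId, other.blockId);
    }
}

class BlockKeyDemo {
    public static void main(String[] args) {
        SketchBlock stored = new SketchBlock(42L, 1001L);
        // Same id, generation stamp unknown (0 is an arbitrary placeholder).
        SketchBlock probe = new SketchBlock(42L, 0L);

        Map<SketchBlock, String> hashed = new HashMap<>();
        hashed.put(stored, "replica meta");

        TreeSet<SketchBlock> sorted = new TreeSet<>();
        sorted.add(stored);

        System.out.println(hashed.containsKey(probe)); // true
        System.out.println(sorted.contains(probe));    // true
    }
}
```

If {{compareTo()}} also compared generation stamps while {{hashCode()}} and {{equals()}} did not, the {{TreeSet}} lookup above would fail even though the {{HashMap}} lookup succeeds, which is why only the sorted collections needed a closer look.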