[ https://issues.apache.org/jira/browse/HDFS-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737604#action_12737604 ]
Hairong Kuang commented on HDFS-512:
------------------------------------

> Block is an existing base class and it is there for a very long time. We
> cannot simply view it in one way yesterday and view it in another way today.

I agree that Block is a long-existing base class. But I disagree that we cannot make any change to it. Making the generation stamp part of its key was introduced by the append project. As I think more about the new append design, I feel that this was a design flaw, and it has caused many problems, in that we could not handle multiple generation stamps. That's why I want to make the proposed change.

As I said in the description of this jira, this change is based on the following facts:
(1) On each datanode only one replica of a block exists. Therefore there is only one generation of a block.
(2) NameNode has only one entry for a block in its blocks map.

So in all the maps that you mentioned, there should be only one entry per block id. In either NN or DN, there is only one blockInfo or replicaInfo entry per block. I do not think changing the key of Block should cause any problem for these data structures in dfs. If there are two generations of a block in those data structures, this is an error. Whether or not the key contains the generation stamp, dfs should handle this error.

I agree this may break some external applications. We could put this change in the release note.

> Set block id as the key to Block
> --------------------------------
>
>                 Key: HDFS-512
>                 URL: https://issues.apache.org/jira/browse/HDFS-512
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 0.21.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.21.0
>
>         Attachments: blockKey.patch
>
>
> Currently the key to Block is block id + generation stamp. I would propose to
> change it to be only block id. This is based on the following properties of
> the dfs cluster:
> 1. On each datanode only one replica of block exists.
> Therefore there is only one generation of a block.
> 2. NameNode has only one entry for a block in its blocks map.
> With this change, search for a block/replica's meta information is easier
> since most of the time we know a block's id but may not know its generation
> stamp.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
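
For readers following the thread, the proposal quoted above can be illustrated with a minimal sketch. It is not the attached blockKey.patch and not the actual HDFS Block class: `BlockKeySketch`, its nested `Block`, and `findById` are hypothetical names introduced here, and the sketch only assumes that "key" means what `equals` and `hashCode` compare.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the real HDFS Block class.
class BlockKeySketch {

    static final class Block {
        private final long blockId;
        private final long generationStamp;

        Block(long blockId, long generationStamp) {
            this.blockId = blockId;
            this.generationStamp = generationStamp;
        }

        long getBlockId() { return blockId; }

        long getGenerationStamp() { return generationStamp; }

        // Key is the block id only: the generation stamp is deliberately ignored.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Block)) return false;
            return blockId == ((Block) o).blockId;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(blockId);
        }
    }

    // Finds the stored block when only the id is known; the generation stamp
    // in the probe object is a placeholder and does not affect the lookup.
    static Block findById(Map<Block, Block> blocksMap, long blockId) {
        return blocksMap.get(new Block(blockId, 0L));
    }

    public static void main(String[] args) {
        Map<Block, Block> blocksMap = new HashMap<>();
        Block stored = new Block(42L, 1001L);
        blocksMap.put(stored, stored);

        // Lookup succeeds without knowing the generation stamp.
        System.out.println(findById(blocksMap, 42L).getGenerationStamp());

        // A second generation of the same id maps to the same entry,
        // so the map never holds two generations of one block.
        Block newer = new Block(42L, 1002L);
        blocksMap.put(newer, newer);
        System.out.println(blocksMap.size());
    }
}
```

With an id-only key, a map cannot hold two generations of the same block at once, so a caller that needs to detect a generation mismatch would have to compare the stamps explicitly after the lookup.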