[ https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935500#comment-13935500 ]
Arpit Agarwal commented on HDFS-6094:
-------------------------------------

Jing, I think it is a good idea to learn about storages from the IBR. One issue with doing so is that the storage type and state are not known while processing the IBR. We can assume some defaults, but this can lead to bugs since the type and state can be used to make replication decisions.

I think we need to enhance the incremental report protocol to send the storage type and state along with the storage ID. Then we can safely create a new storage entry. For protocol compatibility we can assume defaults if the type and state are not provided.

I am going to code up the patch. Thanks for the ideas!

> The same block can be counted twice towards safe mode threshold
> ---------------------------------------------------------------
>
>                 Key: HDFS-6094
>                 URL: https://issues.apache.org/jira/browse/HDFS-6094
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.4.0
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-6904.01.patch, TestHASafeMode-output.txt
>
>
> {{BlockManager#addStoredBlock}} can cause the same block to be counted twice
> towards the safe mode threshold. We see this manifest via
> {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More
> details to follow in a comment.
> Exception details:
> {code}
>   Time elapsed: 12.874 sec  <<< FAILURE!
> java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of live datanodes 3 has reached the minimum number 0. Safe mode will be turned off automatically in 28 seconds.'
>     at org.junit.Assert.fail(Assert.java:93)
>     at org.junit.Assert.assertTrue(Assert.java:43)
>     at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
>     at org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
> {code}
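
The protocol change proposed in the comment above might look roughly like the sketch below. It is not the actual patch. It only illustrates the idea of an incremental report that optionally carries the storage type and state next to the storage ID, with the receiver falling back to defaults when an older sender omits them. Every name here (`IbrStorageSketch`, `StorageReport`, `StorageInfo`, `StorageRegistry`, `learnFrom`), the enum values, and the chosen defaults (`DISK`, `NORMAL`) are assumptions made for illustration and are not taken from the HDFS code base.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; these types are stand-ins, not the real HDFS classes.
final class IbrStorageSketch {
    enum StorageType { DISK, SSD }
    enum StorageState { NORMAL, READ_ONLY }

    // Assumed defaults, used when a report does not carry the new fields.
    static final StorageType DEFAULT_TYPE = StorageType.DISK;
    static final StorageState DEFAULT_STATE = StorageState.NORMAL;

    /** Per-storage header of an incremental report (block list omitted). */
    static final class StorageReport {
        final String storageId;
        final Optional<StorageType> type;   // empty when the sender omits it
        final Optional<StorageState> state; // empty when the sender omits it

        StorageReport(String storageId, StorageType type, StorageState state) {
            this.storageId = storageId;
            this.type = Optional.ofNullable(type);
            this.state = Optional.ofNullable(state);
        }
    }

    /** What the receiving side remembers about one storage. */
    static final class StorageInfo {
        final StorageType type;
        final StorageState state;

        StorageInfo(StorageType type, StorageState state) {
            this.type = type;
            this.state = state;
        }
    }

    /** Storage entries keyed by storage ID. */
    static final class StorageRegistry {
        private final Map<String, StorageInfo> storages = new HashMap<>();

        /** Returns the known entry, or creates one from the report's fields. */
        StorageInfo learnFrom(StorageReport report) {
            return storages.computeIfAbsent(report.storageId, id ->
                new StorageInfo(report.type.orElse(DEFAULT_TYPE),
                                report.state.orElse(DEFAULT_STATE)));
        }

        int size() {
            return storages.size();
        }
    }

    public static void main(String[] args) {
        StorageRegistry registry = new StorageRegistry();
        // A sender that includes the new fields.
        StorageInfo fromNew = registry.learnFrom(
            new StorageReport("DS-1", StorageType.SSD, StorageState.READ_ONLY));
        // An older sender that only knows the storage ID.
        StorageInfo fromOld = registry.learnFrom(
            new StorageReport("DS-2", null, null));
        System.out.println(fromNew.type + "/" + fromNew.state);
        System.out.println(fromOld.type + "/" + fromOld.state);
    }
}
```

In this sketch an existing entry is never overwritten by a later report, so defaults assumed for an older sender persist until some other path corrects them. Whether that is acceptable depends on how the type and state feed into replication decisions, which is the concern raised in the comment.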