[ 
https://issues.apache.org/jira/browse/HDFS-6094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935500#comment-13935500
 ] 

Arpit Agarwal commented on HDFS-6094:
-------------------------------------

Jing, I think it is a good idea to learn about storages from the IBR.

One issue with doing so is that the storage type and state are not known while 
processing the IBR. We could assume defaults, but that can lead to bugs because 
the type and state can be used to make replication decisions. I think we need 
to enhance the incremental block report protocol to send the storage type and 
state along with the storage ID. Then we can safely create a new storage entry. 
For protocol compatibility we can fall back to defaults when the type and state 
are not provided. I am going to code up the patch.
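
A minimal sketch of the idea above, not the actual patch. Every name here 
(IbrStorageSketch, ReportedStorage, StorageEntry, getOrCreateStorage, and the 
two enums) is hypothetical and is not the real HDFS API. The DISK and NORMAL 
defaults are also assumptions. It shows a storage entry being created from the 
type and state carried in the report, with defaults used only when an older 
sender omits them:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; these are not the real HDFS classes.
class IbrStorageSketch {
    // Stand-ins for the storage type and state (assumed values).
    enum StorageType { DISK, SSD }
    enum StorageState { NORMAL, READ_ONLY }

    // Assumed defaults, used only when the sender omits the fields.
    static final StorageType DEFAULT_TYPE = StorageType.DISK;
    static final StorageState DEFAULT_STATE = StorageState.NORMAL;

    // Storage descriptor as it might arrive in an incremental block report.
    // type and state are null when an older DataNode does not send them.
    static final class ReportedStorage {
        final String storageId;
        final StorageType type;
        final StorageState state;

        ReportedStorage(String storageId, StorageType type, StorageState state) {
            this.storageId = storageId;
            this.type = type;
            this.state = state;
        }
    }

    // Entry tracked on the receiving side for each known storage.
    static final class StorageEntry {
        final String storageId;
        final StorageType type;
        final StorageState state;

        StorageEntry(String storageId, StorageType type, StorageState state) {
            this.storageId = storageId;
            this.type = type;
            this.state = state;
        }
    }

    private final Map<String, StorageEntry> storages = new HashMap<>();

    // Return the existing entry for this storage ID, or create one from the
    // reported type and state, falling back to defaults for missing fields.
    StorageEntry getOrCreateStorage(ReportedStorage reported) {
        return storages.computeIfAbsent(reported.storageId, id -> new StorageEntry(
                id,
                reported.type != null ? reported.type : DEFAULT_TYPE,
                reported.state != null ? reported.state : DEFAULT_STATE));
    }

    public static void main(String[] args) {
        IbrStorageSketch sketch = new IbrStorageSketch();
        StorageEntry e = sketch.getOrCreateStorage(
                new ReportedStorage("DS-1", StorageType.SSD, StorageState.READ_ONLY));
        System.out.println(e.storageId + " " + e.type + " " + e.state);
    }
}
```

In this sketch an existing entry is never overwritten by a later report, so the 
defaults can only affect storages first seen through a report that lacks the 
new fields.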

Thanks for the ideas!

> The same block can be counted twice towards safe mode threshold
> ---------------------------------------------------------------
>
>                 Key: HDFS-6094
>                 URL: https://issues.apache.org/jira/browse/HDFS-6094
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.4.0
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-6904.01.patch, TestHASafeMode-output.txt
>
>
> {{BlockManager#addStoredBlock}} can cause the same block to be counted twice 
> towards the safe mode threshold. We see this manifest via 
> {{TestHASafeMode#testBlocksAddedWhileStandbyIsDown}} failures on Ubuntu. More 
> details to follow in a comment.
> Exception details:
> {code}
>   Time elapsed: 12.874 sec  <<< FAILURE!
> java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported 
> blocks 7 has reached the threshold 0.9990 of total blocks 6. The number of 
> live datanodes 3 has reached the minimum number 0. Safe mode will be turned 
> off automatically in 28 seconds.'
>         at org.junit.Assert.fail(Assert.java:93)
>         at org.junit.Assert.assertTrue(Assert.java:43)
>         at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
>         at 
> org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testBlocksAddedWhileStandbyIsDown(TestHASafeMode.java:660)
> {code}



