[jira] [Commented] (HDFS-11609) Some blocks can be permanently lost if nodes are decommissioned while dead

Hudson (JIRA) Mon, 01 May 2017 12:43:37 -0700

    [ https://issues.apache.org/jira/browse/HDFS-11609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991386#comment-15991386 ]


Hudson commented on HDFS-11609:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11661 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11661/])
HDFS-11609. Some blocks can be permanently lost if nodes are (kihwal: rev 
07b98e7830c2214340cb7f434df674057e89df94)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/LowRedundancyBlocks.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDecommissioningStatus.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java


> Some blocks can be permanently lost if nodes are decommissioned while dead
> --------------------------------------------------------------------------
>
>                 Key: HDFS-11609
>                 URL: https://issues.apache.org/jira/browse/HDFS-11609
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.7.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Blocker
>             Fix For: 2.7.4, 3.0.0-alpha3, 2.8.1
>
>         Attachments: HDFS-11609.branch-2.patch, HDFS-11609.trunk.patch, 
> HDFS-11609_v2.branch-2.patch, HDFS-11609_v2.trunk.patch, 
> HDFS-11609_v3.branch-2.7.patch, HDFS-11609_v3.branch-2.patch, 
> HDFS-11609_v3.trunk.patch
>
>
> When all the nodes containing a replica of a block are decommissioned while 
> they are dead, they get decommissioned right away even if there are missing 
> blocks. This behavior was introduced by HDFS-7374.
> The problem starts when those decommissioned nodes are brought back online. 
> The namenode no longer shows missing blocks, which creates a false sense of 
> cluster health. When the decommissioned nodes are removed and reformatted, 
> the block data is permanently lost. The namenode will report missing blocks 
> after the heartbeat recheck interval (e.g. 10 minutes) from the moment the 
> last node is taken down.
> There are multiple issues in the code. Because some of them behave 
> differently in testing than in production, it took a while to reproduce the 
> problem in a unit test. I will present an analysis and a proposal soon.
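
For readers wondering where the "e.g. 10 minutes" figure in the description comes from, here is a minimal sketch of how long the namenode waits before declaring a datanode dead. It is not part of the patch. It assumes the expiry formula commonly attributed to the namenode's DatanodeManager (twice the recheck interval plus ten heartbeat intervals) and the usual defaults for dfs.namenode.heartbeat.recheck-interval (300000 ms) and dfs.heartbeat.interval (3 s). The class and method names are hypothetical.

```java
import java.time.Duration;

// Illustrative only: class and method names are hypothetical, not HDFS API.
public class HeartbeatExpiry {

    // Assumed formula: 2 * recheck interval + 10 * heartbeat interval.
    // recheckIntervalMs is in milliseconds, heartbeatIntervalSec in seconds.
    static long expireIntervalMs(long recheckIntervalMs, long heartbeatIntervalSec) {
        return 2 * recheckIntervalMs + 10 * 1000 * heartbeatIntervalSec;
    }

    public static void main(String[] args) {
        // Assumed defaults: 300000 ms recheck interval, 3 s heartbeat interval.
        long ms = expireIntervalMs(300_000L, 3L);
        System.out.println(Duration.ofMillis(ms)); // prints PT10M30S
    }
}
```

Under these assumptions, missing blocks would only show up roughly ten and a half minutes after the last node holding a replica goes down, which matches the delay the description mentions.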



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
