[ https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955775#comment-14955775 ]

Hudson commented on HDFS-8676:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-trunk #2428 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2428/])
HDFS-8676. Delayed rolling upgrade finalization can cause heartbeat (kihwal: rev 5b43db47a313decccdcca8f45c5708aab46396df)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java


> Delayed rolling upgrade finalization can cause heartbeat expiration and write 
> failures
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-8676
>                 URL: https://issues.apache.org/jira/browse/HDFS-8676
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Walter Su
>            Priority: Critical
>             Fix For: 3.0.0, 2.7.2
>
>         Attachments: HDFS-8676.01.patch, HDFS-8676.02.patch
>
>
> In large, busy clusters with a high deletion rate, many blocks can pile up in
> the datanode trash directories until an upgrade is finalized. When the upgrade
> is finally finalized, the trash is deleted synchronously in the service actor
> thread's context. This blocks the heartbeat and can cause heartbeat expiration.
> We have seen a namenode lose hundreds of nodes after a delayed upgrade
> finalization. The deletion of trash directories should be made asynchronous.
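
The description above proposes moving trash deletion off the service actor thread so that heartbeats are not blocked. A minimal sketch of that idea follows. It is not the code from the attached patches: the class `AsyncTrashCleaner`, the method `clearTrashAsync`, and the thread name are hypothetical names chosen for illustration, and only standard JDK calls are used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; not taken from the HDFS-8676 patch. */
public class AsyncTrashCleaner {
    // One daemon worker thread, so deletion never runs on the caller's
    // (for example, the heartbeat) thread and never blocks JVM shutdown.
    private final ExecutorService executor =
        Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "async-trash-cleaner");
            t.setDaemon(true);
            return t;
        });

    /** Queues the trash directories for deletion and returns immediately. */
    public Future<?> clearTrashAsync(List<Path> trashDirs) {
        return executor.submit(() -> {
            for (Path dir : trashDirs) {
                try {
                    deleteRecursively(dir);
                } catch (IOException e) {
                    // Keep going: one bad directory should not stop the rest.
                    System.err.println("Failed to delete " + dir + ": " + e);
                }
            }
        });
    }

    /** Deletes a directory tree, children before parents. */
    private static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse lexicographic order visits entries before their parent.
            paths = walk.sorted(Comparator.reverseOrder())
                        .collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    /** Stops accepting new work; already queued deletions still run. */
    public void shutdown() {
        executor.shutdown();
    }
}
```

In this sketch the caller gets a `Future` back right away and can carry on sending heartbeats; it would call `get()` only if it needs to wait for the deletion to finish, which a heartbeat thread normally would not.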



