[ 
https://issues.apache.org/jira/browse/HDFS-15634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17844437#comment-17844437
 ] 

Adam Binford edited comment on HDFS-15634 at 5/7/24 8:47 PM:
-------------------------------------------------------------

Chiming in on a 4-year-old issue: we've hit similar problems where returning a 
node after decommissioning it causes the active namenode to fail over because it 
holds the write lock too long while processing the redundant blocks. In our case 
we have around 1.2 million blocks on a single data node across three drives. I 
assume `DatanodeStorageInfo` represents a single drive? If so, the write lock 
is only released after processing all blocks on a single drive.
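
To illustrate the locking concern, here is a minimal sketch of processing blocks 
in bounded batches so the write lock is released between batches rather than 
once per drive. This is not Hadoop code: `BatchedBlockProcessor`, `batchSize`, 
and `processBlock` are hypothetical names, and the lock is a plain 
`java.util.concurrent` `ReentrantReadWriteLock` standing in for the namesystem 
lock. It also ignores the fact that state may change while the lock is released 
between batches, which a real fix would have to handle.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; none of these names come from HDFS.
final class BatchedBlockProcessor {
    // Fair lock so waiting readers/writers get a turn between batches.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    private final int batchSize;
    private int lockAcquisitions = 0;
    private long processed = 0;

    BatchedBlockProcessor(int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
    }

    // Processes every block ID, holding the write lock for at most
    // batchSize blocks at a time.
    void processAll(List<Long> blockIds) {
        int i = 0;
        while (i < blockIds.size()) {
            lock.writeLock().lock();
            try {
                lockAcquisitions++;
                int end = Math.min(i + batchSize, blockIds.size());
                for (; i < end; i++) {
                    processBlock(blockIds.get(i));
                }
            } finally {
                // Released after each batch, not after the whole drive.
                lock.writeLock().unlock();
            }
        }
    }

    // Placeholder for the per-block work (assumed, not the real logic).
    private void processBlock(long blockId) {
        processed++;
    }

    int getLockAcquisitions() {
        return lockAcquisitions;
    }

    long getProcessed() {
        return processed;
    }
}
```

With a batch size of 1000, a storage holding 2500 blocks would take the lock 
three times instead of once, which bounds how long any single hold can last.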


was (Author: kimahriman):
Chiming in on a 4-year-old issue: we've hit similar problems where returning a 
node after decommissioning it causes the active namenode to fail over because it 
holds the write lock too long while processing the redundant blocks. In our case 
we have around 120k blocks on a single data node across three drives. I assume 
`DatanodeStorageInfo` represents a single drive? If so, the write lock is only 
released after processing all blocks on a single drive.

> Invalidate block on decommissioning DataNode after replication
> --------------------------------------------------------------
>
>                 Key: HDFS-15634
>                 URL: https://issues.apache.org/jira/browse/HDFS-15634
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: write lock.png
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Right now, when a DataNode starts decommissioning, the NameNode marks it as 
> decommissioning and its blocks are replicated to other DataNodes; the node 
> is then marked as decommissioned. These blocks are not touched afterwards, 
> since they are not counted as live replicas.
> Proposal: Invalidate these blocks once they are replicated and there are 
> enough live replicas in the cluster.
> Reason: A recent shutdown of decommissioned DataNodes to finish the flow 
> caused a NameNode latency spike, since the NameNode needs to remove all of 
> the blocks from its memory and this step requires holding the write lock. 
> If we had gradually invalidated these blocks, the deletion would be much 
> easier and faster.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
