[
https://issues.apache.org/jira/browse/HDFS-17516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chenyu Zheng resolved HDFS-17516.
---------------------------------
Resolution: Duplicate
> Erasure Coding: Some reconstruction blocks and metrics are inaccurate when
> decommissioning a DN that contains many EC blocks.
> --------------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-17516
> URL: https://issues.apache.org/jira/browse/HDFS-17516
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Chenyu Zheng
> Assignee: Chenyu Zheng
> Priority: Major
> Attachments: 截屏2024-05-09 下午3.59.22.png, 截屏2024-05-09 下午3.59.44.png
>
>
> When decommissioning a DN that contains many EC blocks, the DN is marked as
> busy by scheduleReconstruction, and ErasureCodingWork::addTaskToDatanode
> then does not add any block to ecBlocksToBeReplicated.
> Although no DNA_TRANSFER BlockCommand is generated for such a block,
> pendingReconstruction and neededReconstruction are still updated, so the
> BlockManager mistakenly believes that the block is being copied.
> The periodic increases in the metrics
> `fs_namesystem_num_timed_out_pending_reconstructions` and
> `fs_namesystem_under_replicated_blocks` confirm this. In fact, many blocks
> are never copied. These blocks are re-added to neededReconstruction only
> after they time out.
> !截屏2024-05-09 下午3.59.44.png|width=470,height=160!
> !截屏2024-05-09 下午3.59.22.png|width=465,height=160!
>
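The accounting mismatch described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the class ReconstructionAccountingSketch, its fields, and the guardPending flag are hypothetical names invented for illustration and are not Hadoop classes or APIs. The guarded branch shows one possible way to keep the bookkeeping consistent, not necessarily how the duplicate issue addresses it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model only: none of these types or fields are Hadoop classes.
class ReconstructionAccountingSketch {
    final Queue<String> needed = new ArrayDeque<>();     // stands in for neededReconstruction
    final Map<String, Long> pending = new HashMap<>();   // stands in for pendingReconstruction
    final List<String> commandsSent = new ArrayList<>(); // stands in for DNA_TRANSFER commands
    int timedOut = 0;                                    // stands in for the timed-out metric

    // Schedules one block. With guardPending=false the block is moved to
    // pending even when the busy source produced no command, which mirrors
    // the behaviour reported in this issue.
    void schedule(boolean sourceBusy, boolean guardPending, long nowMs) {
        String block = needed.poll();
        if (block == null) {
            return;
        }
        boolean commandGenerated = !sourceBusy;
        if (commandGenerated) {
            commandsSent.add(block);
        }
        if (commandGenerated || !guardPending) {
            pending.put(block, nowMs);
        } else {
            needed.add(block); // hypothetical guard: the block still needs work
        }
    }

    // Moves expired pending blocks back to needed and counts each timeout.
    void expire(long nowMs, long timeoutMs) {
        Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() >= timeoutMs) {
                it.remove();
                needed.add(entry.getKey());
                timedOut++;
            }
        }
    }

    public static void main(String[] args) {
        ReconstructionAccountingSketch s = new ReconstructionAccountingSketch();
        s.needed.add("blk_1");
        s.schedule(true, false, 0L); // busy source, no guard
        s.expire(1000L, 1000L);      // the block times out without ever being copied
        System.out.println("commands=" + s.commandsSent.size() + " timedOut=" + s.timedOut);
    }
}
```

In this model, the unguarded path raises the timeout counter although no command was ever sent, which matches the metric pattern reported above. With guardPending set to true, the block stays in the needed queue and no timeout is counted.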
--
This message was sent by Atlassian Jira
(v8.20.10#820010)