[jira] [Created] (HDFS-17542) EC: Optimize the EC block reconstruction.
Chenyu Zheng created HDFS-17542:
---
Summary: EC: Optimize the EC block reconstruction.
Key: HDFS-17542
URL: https://issues.apache.org/jira/browse/HDFS-17542
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng

The current reconstruction process for EC blocks is based on that of the original contiguous blocks. It is mainly implemented through the work constructed by computeReconstructionWorkForBlocks, and can be roughly divided into three steps:

* scheduleReconstruction
* chooseTargets
* validateReconstructionWork

For ordinary contiguous blocks:

* (1) scheduleReconstruction: select srcNodes as the source of the block copy according to the status of each replica of the block.
* (2) chooseTargets: select the targets of the copy.
* (3) validateReconstructionWork: add the copy command to the srcNode; the srcNode receives the command through the heartbeat and executes the block copy from source to target.

For EC blocks, (1) and (2) are nearly the same. However, in (3), either block copying or block reconstruction may occur, or no work may be generated at all, for example when some storages are busy. If no work is generated, this leads to the problem described in HDFS-17516: even though no block copy or block reconstruction was generated, pendingReconstruction and neededReconstruction are still updated until the block times out, which wastes the scheduling opportunity.

In order to stay compatible with the original contiguous blocks and decide the specific action in (3), the unnecessary liveBlockIndices, liveBusyBlockIndices, and excludeReconstructedIndices were introduced. Many known bugs are related to this code; they can be avoided.

Improvements:

* Move the decision of whether to copy or reconstruct a block from (3) to (1). This change also makes it easier to implement the explicit specification of the reconstruction block index mentioned in HDFS-16874, and removes the need to pass liveBlockIndices and liveBusyBlockIndices.
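The proposed improvement can be sketched as a small, self-contained model: decide between COPY, RECONSTRUCT, and "schedule nothing" up front in step (1), instead of deferring the choice to validateReconstructionWork. The class and method names below (EcScheduler, decide, and its parameters) are purely illustrative, not Hadoop APIs.

```java
import java.util.Set;

// Hypothetical, simplified model of deciding the EC action at schedule time.
public class EcScheduler {
    public enum Action { COPY, RECONSTRUCT, NONE }

    /**
     * For an RS(data, parity) block group with totalUnits internal blocks:
     * if every missing internal block still lives on a decommissioning node,
     * a plain copy suffices; if some index has no live source at all, the
     * block must be decoded (reconstructed). If all candidate sources are
     * busy, schedule nothing rather than leave a phantom entry in
     * pendingReconstruction (the HDFS-17516 symptom).
     */
    public static Action decide(Set<Integer> liveIndices,
                                Set<Integer> decommissioningIndices,
                                int totalUnits,
                                boolean allSourcesBusy) {
        if (allSourcesBusy) {
            return Action.NONE;  // no work generated, and none recorded either
        }
        if (liveIndices.size() >= totalUnits) {
            return Action.NONE;  // fully healthy block group
        }
        for (int i = 0; i < totalUnits; i++) {
            if (!liveIndices.contains(i) && !decommissioningIndices.contains(i)) {
                return Action.RECONSTRUCT;  // index i has no source anywhere
            }
        }
        return Action.COPY;  // every missing index has a decommissioning source
    }
}
```

With the action fixed in step (1), step (3) only has to validate and enqueue it, so the live/busy index bookkeeping no longer needs to be threaded through the whole pipeline.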
--
This message was sent by Atlassian Jira
(v8.20.10#820010)

To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-17521) Erasure Coding: Fix calculation errors caused by special index order
Chenyu Zheng created HDFS-17521:
---
Summary: Erasure Coding: Fix calculation errors caused by special index order
Key: HDFS-17521
URL: https://issues.apache.org/jira/browse/HDFS-17521
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng
[jira] [Created] (HDFS-17516) Erasure Coding: Some reconstruction blocks and metrics are inaccurate when decommissioning a DN which contains many EC blocks.
Chenyu Zheng created HDFS-17516:
---
Summary: Erasure Coding: Some reconstruction blocks and metrics are inaccurate when decommissioning a DN which contains many EC blocks.
Key: HDFS-17516
URL: https://issues.apache.org/jira/browse/HDFS-17516
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng
Attachments: 截屏2024-05-09 下午3.59.22.png, 截屏2024-05-09 下午3.59.44.png

When decommissioning a DN which contains many EC blocks, the DN will be marked as busy by scheduleReconstruction, so ErasureCodingWork::addTaskToDatanode will not add any block to ecBlocksToBeReplicated. Although no DNA_TRANSFER BlockCommand is generated for such a block, pendingReconstruction and neededReconstruction are still updated, and the BlockManager mistakenly believes that the block is being copied. The periodic increases of the metrics `fs_namesystem_num_timed_out_pending_reconstructions` and `fs_namesystem_under_replicated_blocks` also prove this: in fact, many blocks are never actually copied. These blocks are only re-added to neededReconstruction after they time out.

!截屏2024-05-09 下午3.59.44.png!!截屏2024-05-09 下午3.59.22.png!
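The mismatch can be illustrated with a minimal simulation (not Hadoop code; all names here are hypothetical): a block whose only source is busy produces no transfer command, yet is still recorded in pendingReconstruction, so its only way out is the timeout scan, which bumps the timed-out metric and re-queues it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model of the buggy flow: pending state is updated even when no
// DNA_TRANSFER work was generated for the block.
public class PendingReconstructionSim {
    private final Map<Long, Long> pending = new HashMap<>(); // blockId -> deadline
    private final Deque<Long> neededReconstruction = new ArrayDeque<>();
    public int timedOutCount = 0; // ~ num_timed_out_pending_reconstructions

    /** Schedules a block; a busy source means no transfer command is sent. */
    public void schedule(long blockId, boolean sourceBusy, long now, long timeoutMs) {
        boolean workGenerated = !sourceBusy;
        pending.put(blockId, now + timeoutMs); // bug: recorded unconditionally
        if (workGenerated) {
            pending.remove(blockId); // stand-in for the copy actually completing
        }
    }

    /** Periodic scan: expired entries go back to neededReconstruction. */
    public void checkTimeouts(long now) {
        for (Iterator<Map.Entry<Long, Long>> it = pending.entrySet().iterator(); it.hasNext();) {
            Map.Entry<Long, Long> e = it.next();
            if (e.getValue() <= now) {
                neededReconstruction.add(e.getKey());
                timedOutCount++;
                it.remove();
            }
        }
    }

    public int neededSize() { return neededReconstruction.size(); }
}
```

Scheduling several blocks against a busy source and then running the timeout scan shows every one of them counted as timed out and re-queued, matching the metric growth in the screenshots.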
[jira] [Created] (HDFS-17515) Erasure Coding: ErasureCodingWork is not effectively limited during a block reconstruction cycle.
Chenyu Zheng created HDFS-17515:
---
Summary: Erasure Coding: ErasureCodingWork is not effectively limited during a block reconstruction cycle.
Key: HDFS-17515
URL: https://issues.apache.org/jira/browse/HDFS-17515
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng

In a block reconstruction cycle, ErasureCodingWork is not effectively limited. I added some debug logging that fires whenever the size of ecBlocksToBeReplicated is an integer multiple of 100:

{code:java}
2024-05-09 10:46:06,986 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 100 blocks
2024-05-09 10:46:06,987 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 200 blocks
...
2024-05-09 10:46:06,992 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 2000 blocks
2024-05-09 10:46:06,992 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 2100 blocks
{code}

During one block reconstruction cycle, ecBlocksToBeReplicated grows from 0 to 2100, which is much larger than replicationStreamsHardLimit. This is unfair and creates a greater tendency to copy EC blocks. For non-EC blocks this is not a problem: pendingReplicationWithoutTargets increases when work is scheduled, and once pendingReplicationWithoutTargets is too big, no more work is scheduled for that node.
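The kind of per-node cap the non-EC path effectively gets from pendingReplicationWithoutTargets could look like the sketch below. This is a hypothetical illustration, not Hadoop code: DatanodeWorkQueue and tryAddEcReplicationTask are made-up names, and the real fix would live in the scheduling path around ErasureCodingWork.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical per-node queue that refuses EC replication tasks once the
// node has reached its hard limit, instead of growing without bound.
public class DatanodeWorkQueue {
    private final Deque<Long> ecBlocksToBeReplicated = new ArrayDeque<>();
    private final int replicationStreamsHardLimit;

    public DatanodeWorkQueue(int replicationStreamsHardLimit) {
        this.replicationStreamsHardLimit = replicationStreamsHardLimit;
    }

    /** Returns false once the per-node limit is hit; the scheduler can then
     *  pick another source node or defer the block to a later cycle. */
    public boolean tryAddEcReplicationTask(long blockId) {
        if (ecBlocksToBeReplicated.size() >= replicationStreamsHardLimit) {
            return false; // node saturated within this reconstruction cycle
        }
        ecBlocksToBeReplicated.add(blockId);
        return true;
    }

    public int queued() { return ecBlocksToBeReplicated.size(); }
}
```

With such a check, a single node could no longer accumulate 2100 queued EC copies in one cycle while contiguous-block work is throttled to the hard limit.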