[jira] [Created] (HDFS-17542) EC: Optimize the EC block reconstruction.

2024-06-06 Thread Chenyu Zheng (Jira)
Chenyu Zheng created HDFS-17542:
---

 Summary: EC: Optimize the EC block reconstruction.
 Key: HDFS-17542
 URL: https://issues.apache.org/jira/browse/HDFS-17542
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng


The current reconstruction process for EC blocks is built on the one for 
ordinary contiguous blocks. It is implemented mainly through the work 
constructed by computeReconstructionWorkForBlocks, and can be roughly divided 
into three steps:
 * scheduleReconstruction
 * chooseTargets
 * validateReconstructionWork

For ordinary contiguous blocks:

* (1) scheduleReconstruction

Select srcNodes as the source of the copy block according to the status of each 
replica of the block. 

* (2) chooseTargets

Select the target of the copy.

* (3) validateReconstructionWork

Add the copy command to srcNode. srcNode receives the command through a 
heartbeat and executes the block copy from src to target.

For EC blocks:
(1) and (2) are nearly the same. However, in (3), block copying or block 
reconstruction may occur, or no work may be generated at all, for example when 
some storages are busy. If no work is generated, it leads to the problem 
described in HDFS-17516: even though no block copying or block reconstruction 
is generated, pendingReconstruction and neededReconstruction are still updated, 
and the block waits until it times out, which wastes the scheduling opportunity.
To stay compatible with the original contiguous-block path and to decide the 
specific action in (3), the otherwise unnecessary liveBlockIndices, 
liveBusyBlockIndices, and excludeReconstructedIndices were introduced. Many 
known bugs are related to this code, and they can be avoided.

Improvements:
* Move the work of deciding whether to copy or reconstruct blocks from (3) to 
(1).

Such an improvement is more conducive to implementing the explicit 
specification of the reconstruction block index mentioned in HDFS-16874, and it 
removes the need to pass liveBlockIndices and liveBusyBlockIndices.
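
A minimal sketch of the proposed direction, deciding between copy and reconstruction already in step (1). This is an illustrative model only: ReplicaState, Action, EcScheduleSketch, and the decision rules are assumptions made for this example and are not taken from the HDFS code base or from the eventual patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; these are not HDFS classes.
enum ReplicaState { LIVE, LIVE_BUSY, DECOMMISSIONING, DECOMMISSIONING_BUSY, MISSING }

enum Action { COPY, RECONSTRUCT, NONE }

final class EcScheduleSketch {

  // Decide the action during scheduling (step 1), from replica states alone.
  // dataUnits is the number of sources needed to decode (6 for an RS-6-3 policy).
  static Action decide(List<ReplicaState> states, int dataUnits) {
    int usableSources = 0;
    boolean anyMissing = false;
    boolean anyCopyable = false;
    for (ReplicaState s : states) {
      switch (s) {
        case LIVE:
          usableSources++;
          break;
        case DECOMMISSIONING:
          usableSources++;
          anyCopyable = true;
          break;
        case MISSING:
          anyMissing = true;
          break;
        default:
          // Busy replicas cannot serve as a source in this model.
          break;
      }
    }
    if (anyMissing) {
      // Reconstruction needs at least dataUnits readable internal blocks.
      return usableSources >= dataUnits ? Action.RECONSTRUCT : Action.NONE;
    }
    // Nothing is missing: copy only if a non-busy decommissioning source exists.
    return anyCopyable ? Action.COPY : Action.NONE;
  }

  // Build a state list from per-state counts, to keep the examples short.
  static List<ReplicaState> states(int live, int liveBusy, int decom,
                                   int decomBusy, int missing) {
    List<ReplicaState> out = new ArrayList<>();
    for (int i = 0; i < live; i++) out.add(ReplicaState.LIVE);
    for (int i = 0; i < liveBusy; i++) out.add(ReplicaState.LIVE_BUSY);
    for (int i = 0; i < decom; i++) out.add(ReplicaState.DECOMMISSIONING);
    for (int i = 0; i < decomBusy; i++) out.add(ReplicaState.DECOMMISSIONING_BUSY);
    for (int i = 0; i < missing; i++) out.add(ReplicaState.MISSING);
    return out;
  }

  public static void main(String[] args) {
    System.out.println(decide(states(8, 0, 0, 0, 1), 6)); // one internal block lost
    System.out.println(decide(states(8, 0, 1, 0, 0), 6)); // one on a decommissioning DN
    System.out.println(decide(states(8, 0, 0, 1, 0), 6)); // that DN is busy
  }
}
```

With the decision made up front, a caller could skip the block entirely when the result is NONE, so no per-index lists would need to travel to step (3).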



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-17521) Erasure Coding: Fix calculation errors caused by special index order

2024-05-11 Thread Chenyu Zheng (Jira)
Chenyu Zheng created HDFS-17521:
---

 Summary: Erasure Coding: Fix calculation errors caused by special 
index order
 Key: HDFS-17521
 URL: https://issues.apache.org/jira/browse/HDFS-17521
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng









[jira] [Created] (HDFS-17516) Erasure Coding: Some reconstruction blocks and metrics are inaccuracy when decommission DN which contains many EC blocks.

2024-05-09 Thread Chenyu Zheng (Jira)
Chenyu Zheng created HDFS-17516:
---

 Summary: Erasure Coding: Some reconstruction blocks and metrics 
are inaccuracy when decommission DN  which contains many EC blocks.
 Key: HDFS-17516
 URL: https://issues.apache.org/jira/browse/HDFS-17516
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng
 Attachments: 截屏2024-05-09 下午3.59.22.png, 截屏2024-05-09 下午3.59.44.png

When decommissioning a DN that contains many EC blocks, the DN is marked as 
busy by scheduleReconstruction, and ErasureCodingWork::addTaskToDatanode then 
does not add any block to ecBlocksToBeReplicated. 
Although no DNA_TRANSFER BlockCommand is generated for this block, 
pendingReconstruction and neededReconstruction are still updated, and 
BlockManager mistakenly believes that the block is being copied.
The periodic increases of the metrics 
`fs_namesystem_num_timed_out_pending_reconstructions` and 
`fs_namesystem_under_replicated_blocks` confirm this. In fact, many blocks 
are not actually copied. These blocks sit in pendingReconstruction until they 
time out and are then re-added to neededReconstruction.
(Screenshots: 截屏2024-05-09 下午3.59.44.png, 截屏2024-05-09 下午3.59.22.png)
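
The accounting problem described above can be reproduced in a toy model. The class below is a sketch under stated assumptions: PendingAccountingSketch and its methods are hypothetical stand-ins, the two sets only borrow the names neededReconstruction and pendingReconstruction from the report, and the guarded variant is one possible remedy, not necessarily the fix adopted for this issue.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model for illustration only; not the HDFS BlockManager.
final class PendingAccountingSketch {
  final Set<Long> neededReconstruction = new HashSet<>();
  final Set<Long> pendingReconstruction = new HashSet<>();
  int commandsSent = 0;

  // Stand-in for the validate step: returns true only if a command was queued.
  boolean addTaskToDatanode(boolean sourceBusy) {
    if (sourceBusy) {
      return false; // a busy source produces no transfer command
    }
    commandsSent++;
    return true;
  }

  // Behaviour described in the report: bookkeeping moves even with no command.
  void scheduleUnguarded(long blockId, boolean sourceBusy) {
    addTaskToDatanode(sourceBusy);
    neededReconstruction.remove(blockId);
    pendingReconstruction.add(blockId);
  }

  // Hypothetical guard: only update bookkeeping when work was really generated.
  void scheduleGuarded(long blockId, boolean sourceBusy) {
    if (addTaskToDatanode(sourceBusy)) {
      neededReconstruction.remove(blockId);
      pendingReconstruction.add(blockId);
    }
  }

  public static void main(String[] args) {
    PendingAccountingSketch unguarded = new PendingAccountingSketch();
    unguarded.neededReconstruction.add(1L);
    unguarded.scheduleUnguarded(1L, true);
    System.out.println("unguarded pending=" + unguarded.pendingReconstruction
        + " commands=" + unguarded.commandsSent);

    PendingAccountingSketch guarded = new PendingAccountingSketch();
    guarded.neededReconstruction.add(1L);
    guarded.scheduleGuarded(1L, true);
    System.out.println("guarded needed=" + guarded.neededReconstruction
        + " commands=" + guarded.commandsSent);
  }
}
```

In the unguarded run the block ends up pending although no command was sent, which is the state that can only resolve through a timeout.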
 






[jira] [Created] (HDFS-17515) Erasure Coding: ErasureCodingWork is not effectively limited during a block reconstruction cycle.

2024-05-09 Thread Chenyu Zheng (Jira)
Chenyu Zheng created HDFS-17515:
---

 Summary: Erasure Coding: ErasureCodingWork is not effectively 
limited during a block reconstruction cycle.
 Key: HDFS-17515
 URL: https://issues.apache.org/jira/browse/HDFS-17515
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng


In a block reconstruction cycle, ErasureCodingWork is not effectively limited. 
I added a debug log that fires whenever ecBlocksToBeReplicated reaches an 
integer multiple of 100.

 
{code:java}
2024-05-09 10:46:06,986 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 100 blocks
2024-05-09 10:46:06,987 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 200 blocks
...
2024-05-09 10:46:06,992 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 2000 blocks
2024-05-09 10:46:06,992 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManagerZCY: ecBlocksToBeReplicated for IP:PORT already have 2100 blocks
{code}
 

During a single block reconstruction cycle, ecBlocksToBeReplicated increases 
from 0 to 2100, which is much larger than replicationStreamsHardLimit. This is 
unfair and biases the scheduler toward copying EC blocks.

For non-EC blocks this is not a problem: pendingReplicationWithoutTargets 
increases when work is scheduled, and once it is too large, no more work is 
scheduled for this node.
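
The difference between the two paths can be sketched as follows. This is a simplified reading of the description above, not the HDFS implementation: NodeLimitSketch, Node, and both methods are hypothetical, and the limit of 4 and the 2100 candidates are example values chosen to mirror the log.

```java
// Toy model for illustration only; not the HDFS DatanodeDescriptor.
final class NodeLimitSketch {

  static final class Node {
    int pendingReplicationWithoutTargets; // name borrowed from the report
    int ecBlocksToBeReplicated;           // name borrowed from the report
  }

  // Contiguous path: the counter rises at schedule time, so the limit
  // takes effect inside a single reconstruction cycle.
  static int scheduleContiguous(Node n, int candidates, int hardLimit) {
    int scheduled = 0;
    for (int i = 0; i < candidates; i++) {
      if (n.pendingReplicationWithoutTargets >= hardLimit) {
        break; // node is considered busy, stop scheduling on it
      }
      n.pendingReplicationWithoutTargets++;
      scheduled++;
    }
    return scheduled;
  }

  // EC path as described: the checked counter never moves while scheduling,
  // so every candidate in the cycle is queued on the same node.
  static int scheduleEcUnlimited(Node n, int candidates, int hardLimit) {
    int scheduled = 0;
    for (int i = 0; i < candidates; i++) {
      if (n.pendingReplicationWithoutTargets >= hardLimit) {
        break; // never triggers here because the counter is not incremented
      }
      n.ecBlocksToBeReplicated++;
      scheduled++;
    }
    return scheduled;
  }

  public static void main(String[] args) {
    int hardLimit = 4;     // example value for replicationStreamsHardLimit
    int candidates = 2100; // example value matching the debug log
    System.out.println(scheduleContiguous(new Node(), candidates, hardLimit));
    System.out.println(scheduleEcUnlimited(new Node(), candidates, hardLimit));
  }
}
```

Counting EC work at schedule time, in the same way as the contiguous path does, would be one way to make the limit effective within a cycle.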

 


