Chenyu Zheng created HDFS-17542:
-----------------------------------

             Summary: EC: Optimize the EC block reconstruction.
                 Key: HDFS-17542
                 URL: https://issues.apache.org/jira/browse/HDFS-17542
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Chenyu Zheng
            Assignee: Chenyu Zheng


The current reconstruction process for EC blocks is built on the one for 
ordinary contiguous blocks. It is mainly implemented through the work 
constructed by computeReconstructionWorkForBlocks, and can be roughly divided 
into three phases:
 * scheduleReconstruction
 * chooseTargets
 * validateReconstructionWork

For ordinary contiguous blocks:

* (1) scheduleReconstruction

Select srcNodes as the source of the block copy according to the status of each 
replica of the block.

* (2) chooseTargets

Select the target of the copy.

* (3) validateReconstructionWork

Add the copy command to srcNode. srcNode receives the command through its 
heartbeat and copies the block from src to target.
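
The three phases for a contiguous block can be sketched roughly as follows. 
This is a simplified, hypothetical model: the class, the boolean arrays, and 
the command string are assumptions made for illustration and do not mirror the 
real BlockManager code. Only the three phase names come from the description 
above.

```java
// Hypothetical, simplified model of the three phases; not the real BlockManager code.
final class ContiguousCopySketch {
    /** Phase (1): pick the first node with a healthy replica as the copy source, or -1. */
    static int scheduleReconstruction(boolean[] replicaHealthy) {
        for (int node = 0; node < replicaHealthy.length; node++) {
            if (replicaHealthy[node]) {
                return node;
            }
        }
        return -1;
    }

    /** Phase (2): pick the first node that holds no replica yet, or -1. */
    static int chooseTarget(boolean[] holdsReplica) {
        for (int node = 0; node < holdsReplica.length; node++) {
            if (!holdsReplica[node]) {
                return node;
            }
        }
        return -1;
    }

    /** Phase (3): build the command the source would pick up on its next heartbeat. */
    static String validateReconstructionWork(int src, int target) {
        if (src < 0 || target < 0) {
            return null; // no usable source or target, so nothing is queued
        }
        return "COPY from node " + src + " to node " + target;
    }
}
```

In this model the action is always a copy, so phase (3) only has to check that 
a source and a target exist.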

For EC blocks:
(1) and (2) are nearly the same. However, in (3), block copying or block 
reconstruction may occur, or no work may be generated at all, for example when 
some storages are busy. If no work is generated, it leads to the problem 
described in HDFS-17516. Even though no block copy or block reconstruction is 
generated, pendingReconstruction and neededReconstruction are still updated 
until the block times out, which wastes the scheduling opportunity.
To stay compatible with the original contiguous-block path and to decide the 
specific action in (3), the otherwise unnecessary liveBlockIndices, 
liveBusyBlockIndices, and excludeReconstructedIndices were introduced. Many 
known bugs are related to this code. These can be avoided.
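
A minimal sketch of why deciding in (3) can waste a scheduling slot. Everything 
here is a hypothetical simplification: the Action enum, the two boolean inputs, 
and the counters are assumptions that stand in for the real decision logic and 
for pendingReconstruction.

```java
// Hypothetical, simplified model; names do not mirror the real BlockManager code.
final class LateDecisionSketch {
    enum Action { COPY, RECONSTRUCT, NONE }

    int pending;       // stand-in for the pendingReconstruction bookkeeping
    int commandsSent;  // commands actually queued for a DataNode

    /** The action is only known here, after scheduling and target selection are done. */
    static Action decideInValidate(boolean allInternalBlocksPresent, boolean sourceBusy) {
        if (sourceBusy) {
            return Action.NONE; // no command can be generated right now
        }
        return allInternalBlocksPresent ? Action.COPY : Action.RECONSTRUCT;
    }

    /** Phase (3) in this model: bookkeeping is updated even when no command is queued. */
    void validateReconstructionWork(boolean allInternalBlocksPresent, boolean sourceBusy) {
        Action action = decideInValidate(allInternalBlocksPresent, sourceBusy);
        if (action != Action.NONE) {
            commandsSent++;
        }
        pending++; // the block is tracked as pending regardless of the action
    }
}
```

With a busy source, this model ends up with pending == 1 and commandsSent == 0: 
the block sits in the pending state until it times out although nothing was 
sent.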

Improvements:
* Move the work of deciding whether to copy or reconstruct blocks from (3) to 
(1).

This change makes it easier to implement the explicit specification of the 
reconstruction block index mentioned in HDFS-16874, and removes the need to 
pass liveBlockIndices and liveBusyBlockIndices.
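
One way to picture the proposed change is a scheduling step that already 
returns the action and the target block index. This is only a sketch under 
stated assumptions: the Work class, the live/busy/leaving arrays, and the rule 
that a present block on a draining storage triggers a copy are all 
hypothetical, not the actual patch.

```java
// Hypothetical, simplified model; it does not mirror the real BlockManager classes.
final class EarlyDecisionSketch {
    enum Action { COPY, RECONSTRUCT }

    /** Work item that already knows its action and the internal block index it targets. */
    static final class Work {
        final Action action;
        final int blockIndex;

        Work(Action action, int blockIndex) {
            this.action = action;
            this.blockIndex = blockIndex;
        }
    }

    /**
     * Phase (1) in this model: decide the action up front.
     * live[i]    : internal block i has a readable copy
     * busy[i]    : the storage holding internal block i is busy
     * leaving[i] : the storage holding internal block i is being drained (assumed copy trigger)
     * Returns null when no work can be generated, so the caller can skip the block
     * before any pending/needed bookkeeping is touched.
     */
    static Work schedule(boolean[] live, boolean[] busy, boolean[] leaving, int dataUnits) {
        int usableSources = 0;
        for (int i = 0; i < live.length; i++) {
            if (live[i] && !busy[i]) {
                usableSources++;
            }
        }
        for (int i = 0; i < live.length; i++) {
            if (!live[i]) {
                // A missing internal block needs at least dataUnits usable sources.
                return usableSources >= dataUnits ? new Work(Action.RECONSTRUCT, i) : null;
            }
        }
        for (int i = 0; i < live.length; i++) {
            if (leaving[i]) {
                // A present block on a draining storage is copied, unless that storage is busy.
                return busy[i] ? null : new Work(Action.COPY, i);
            }
        }
        return null; // nothing to repair
    }
}
```

Because the work item carries its own blockIndex, later phases in this model do 
not need lists of live or busy indices to work out what to do.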



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

