[jira] [Created] (HDFS-17094) EC: Fix bug in block recovery when there are stale datanodes

Shuyan Zhang (Jira) Tue, 18 Jul 2023 03:25:15 -0700

Shuyan Zhang created HDFS-17094:
-----------------------------------

             Summary: EC: Fix bug in block recovery when there are stale 
datanodes
                 Key: HDFS-17094
                 URL: https://issues.apache.org/jira/browse/HDFS-17094
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Shuyan Zhang



When a block recovery occurs, `RecoveryTaskStriped` in the datanode expects 
`rBlock.getLocations()` and `rBlock.getBlockIndices()` to be in one-to-one 
correspondence. However, if some locations are in the stale state when the 
NameNode handles the heartbeat, this correspondence is broken: 
`recoveryLocations` contains no stale locations, but the block indices array is 
still complete (i.e. it contains the indices of all the locations). This causes 
`BlockRecoveryWorker.RecoveryTaskStriped#recover` to generate a wrong internal 
block ID, so the corresponding datanode cannot find the replica and the 
recovery process fails. This bug needs to be fixed.
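
The mismatch described above can be illustrated with a small standalone 
sketch. It is a simplified model, not the actual HDFS code: the class 
`StaleIndexDemo`, its methods, the datanode names, and the block group ID are 
all hypothetical. It also assumes that an internal block ID is derived as the 
block group ID plus the block index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the problem; these are not HDFS classes.
class StaleIndexDemo {

    // Assumption: internal block ID = block group ID + index within the group.
    static long internalBlockId(long blockGroupId, int blockIndex) {
        return blockGroupId + blockIndex;
    }

    // Buggy pairing: stale locations are dropped, but the indices array is
    // left complete, so position k of the filtered locations is paired with
    // position k of the unfiltered indices.
    static List<String> pairBuggy(long groupId, String[] locations,
                                  boolean[] stale, byte[] indices) {
        List<String> out = new ArrayList<>();
        int pos = 0; // position in the filtered location list
        for (int i = 0; i < locations.length; i++) {
            if (stale[i]) {
                continue; // location removed, index not removed
            }
            out.add(locations[i] + "->" + internalBlockId(groupId, indices[pos]));
            pos++;
        }
        return out;
    }

    // Consistent pairing: each index is dropped together with its stale
    // location, so the one-to-one correspondence is preserved.
    static List<String> pairFiltered(long groupId, String[] locations,
                                     boolean[] stale, byte[] indices) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < locations.length; i++) {
            if (stale[i]) {
                continue; // drop location and its index together
            }
            out.add(locations[i] + "->" + internalBlockId(groupId, indices[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        long groupId = 1000L;                     // made-up block group ID
        String[] locations = {"dn0", "dn1", "dn2"};
        boolean[] stale = {false, true, false};   // dn1 is stale
        byte[] indices = {0, 1, 2};

        // dn2 holds index 2, but the buggy pairing asks it for index 1.
        System.out.println(pairBuggy(groupId, locations, stale, indices));
        System.out.println(pairFiltered(groupId, locations, stale, indices));
    }
}
```

In this model the buggy pairing asks `dn2` for internal block 1001 although it 
holds 1002, which mirrors the "cannot find the replica" failure. Filtering the 
indices together with the locations is only one possible way to restore the 
correspondence; the issue itself does not say which fix will be chosen.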



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
