Shuyan Zhang created HDFS-17094:
-----------------------------------
Summary: EC: Fix bug in block recovery when there are stale
datanodes
Key: HDFS-17094
URL: https://issues.apache.org/jira/browse/HDFS-17094
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Shuyan Zhang
When block recovery occurs, `RecoveryTaskStriped` in the datanode expects
`rBlock.getLocations()` and `rBlock.getBlockIndices()` to be in one-to-one
correspondence. However, if any locations are in the stale state when the
NameNode handles the heartbeat, this correspondence is broken. Specifically,
`recoveryLocations` contains no stale locations, but the block indices array is
still complete (i.e. it contains the indices of all locations, stale ones
included). This causes `BlockRecoveryWorker.RecoveryTaskStriped#recover` to
generate a wrong internal block ID, and the corresponding datanode cannot find
the replica, so the recovery process fails. This bug needs to be fixed.
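
The mismatch can be illustrated with a small standalone sketch. It is not
Hadoop code: the class `StaleLocationSketch`, its methods, the datanode names
and the block group ID `1600` are all hypothetical, and it assumes that an
internal block ID is the block group ID plus the block's index in the group.
It only shows how pairing a filtered location list with an unfiltered index
array by position shifts the indices, and how filtering both together keeps
them aligned.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hadoop.
class StaleLocationSketch {

    // Assumption: internal block ID = block group ID + index within the group.
    static long internalBlockId(long blockGroupId, int indexInGroup) {
        return blockGroupId + indexInGroup;
    }

    // Pairs locations.get(i) with indices[i], as a consumer would if it
    // assumed the two collections were in one-to-one correspondence.
    static long[] idsPairedByPosition(long blockGroupId,
                                      List<String> locations,
                                      byte[] indices) {
        long[] ids = new long[locations.size()];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = internalBlockId(blockGroupId, indices[i]);
        }
        return ids;
    }

    // Drops the index of every stale location so that the result lines up
    // with a location list from which stale entries were removed.
    static byte[] filterIndices(byte[] indices, boolean[] stale) {
        byte[] kept = new byte[indices.length];
        int n = 0;
        for (int i = 0; i < indices.length; i++) {
            if (!stale[i]) {
                kept[n++] = indices[i];
            }
        }
        return Arrays.copyOf(kept, n);
    }

    public static void main(String[] args) {
        long blockGroupId = 1600L;                 // hypothetical value
        byte[] allIndices = {0, 1, 2};             // indices for dn0, dn1, dn2
        boolean[] stale = {false, true, false};    // dn1 is stale
        List<String> recoveryLocations = List.of("dn0", "dn2");

        // Unfiltered indices: dn2 is paired with index 1 instead of 2.
        long[] wrong = idsPairedByPosition(blockGroupId, recoveryLocations, allIndices);
        System.out.println(Arrays.toString(wrong)); // [1600, 1601]

        // Indices filtered together with the locations: dn2 gets index 2.
        long[] right = idsPairedByPosition(blockGroupId, recoveryLocations,
                filterIndices(allIndices, stale));
        System.out.println(Arrays.toString(right)); // [1600, 1602]
    }
}
```

In the unfiltered case the ID computed for `dn2` belongs to the internal block
held by the stale `dn1`, which matches the symptom described above: the
datanode is asked for a replica it does not have.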