chihsuan opened a new pull request, #10540:
URL: https://github.com/apache/ozone/pull/10540
## What changes were proposed in this pull request?
When a replication or EC reconstruction command fails on a datanode (transient
network issue, busy datanode, etc.), SCM is never told. The pending "ADD"
operation stays in `ContainerReplicaPendingOps` and continues to count against
the inflight replication accounting until the command's deadline expires
(`hdds.scm.replication.event.timeout`, 12 minutes by default).
This has two effects:
1. The cluster-wide inflight count (`ReplicationManager#getInflightReplicationCount`,
   gated by `UnderReplicatedProcessor`) fills up with stale entries, so SCM stops
   scheduling new replication even though the datanodes are idle.
2. The specific under-replicated container is not re-scheduled, because the
   health check still sees a pending ADD for that replica.
During decommission this is especially painful: thousands of commands are issued,
and a small failure rate quickly leaks enough stale entries to stall progress for
up to 12 minutes at a time.
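
The stall described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `InflightStallSketch`, `trySchedule`, and the limit of 2 are hypothetical names and values chosen for illustration, not Ozone classes or defaults. It shows how entries that are reclaimed only at their deadline keep the inflight gate closed even after the underlying commands have failed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model: an inflight limit whose entries are reclaimed only by deadline expiry. */
final class InflightStallSketch {
  private final int inflightLimit;     // hypothetical limit, not an Ozone default
  private final long timeoutMillis;    // stands in for the command deadline
  // Deadlines of pending ADD ops, oldest first (time is assumed to move forward).
  private final Deque<Long> deadlines = new ArrayDeque<>();

  InflightStallSketch(int inflightLimit, long timeoutMillis) {
    this.inflightLimit = inflightLimit;
    this.timeoutMillis = timeoutMillis;
  }

  /** Drops expired entries, then schedules one command if the limit allows it. */
  boolean trySchedule(long nowMillis) {
    while (!deadlines.isEmpty() && deadlines.peekFirst() <= nowMillis) {
      deadlines.removeFirst();
    }
    if (deadlines.size() >= inflightLimit) {
      return false; // blocked, possibly by ops that already failed on the datanode
    }
    deadlines.addLast(nowMillis + timeoutMillis);
    return true;
  }

  int inflightCount() {
    return deadlines.size();
  }

  public static void main(String[] args) {
    long minute = 60_000L;
    InflightStallSketch scm = new InflightStallSketch(2, 12 * minute);
    System.out.println(scm.trySchedule(0));            // true
    System.out.println(scm.trySchedule(0));            // true, limit reached
    // Both commands fail on the datanode right away, but SCM is never told.
    System.out.println(scm.trySchedule(1 * minute));   // false: stale entries still count
    System.out.println(scm.trySchedule(12 * minute));  // true: entries expired at the deadline
  }
}
```

In this model nothing can be scheduled between minute 0 and minute 12, although no work is actually in progress.
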
This PR makes SCM clear a failed replication/reconstruction op proactively, by
re-introducing the command-status feedback path that was removed in HDDS-1368,
**without any Protobuf/wire change** (the `CommandStatus` message already carries
`FAILED`, `cmdId`, and `type`):
- **Datanode** now reports `EXECUTED`/`FAILED` status for `replicateContainerCommand`
  and `reconstructECContainersCommand`, mirroring how `deleteBlocksCommand` already
  reports. `StateContext#addCmdStatus` registers a PENDING entry for these commands,
  `AbstractReplicationTask#getCommandId()` exposes the backing command id, and
  `ReplicationSupervisor.TaskRunner` marks the status when the task finishes. Tasks
  with no backing SCM command (e.g. reconcile) are unaffected.
- **SCM** routes failed statuses to the pending-op store: `CommandStatusReportHandler`
  fires a new `REPLICATION_STATUS` event for failed replicate/reconstruct commands,
  and `StorageContainerManager` wires it to a new
  `ContainerReplicaPendingOps#onReplicationCommandFailed(cmdId)`, which looks the
  command up via a new `cmdId -> ContainerID` index and removes the matching ADD op
  (decrementing the inflight counter and freeing the scheduled size), so both effects
  above are resolved immediately instead of after the timeout.
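
The status lifecycle in the two bullets above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `CommandStatusSketch`, `Entry`, `runTask`, and `failedReplicationCmdIds` are hypothetical names, and the sketch assumes a task signals failure by throwing, which may differ from how the real tasks report their outcome.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the command-status lifecycle; all names are illustrative. */
final class CommandStatusSketch {
  enum Status { PENDING, EXECUTED, FAILED }
  enum Type { REPLICATE, RECONSTRUCT_EC, OTHER }

  /** Simplified stand-in for one status entry in a datanode report. */
  static final class Entry {
    final long cmdId;
    final Type type;
    Status status = Status.PENDING;

    Entry(long cmdId, Type type) {
      this.cmdId = cmdId;
      this.type = type;
    }
  }

  // Datanode side: cmdId -> status entry, in arrival order.
  private final Map<Long, Entry> statuses = new LinkedHashMap<>();

  /** Datanode: register a PENDING entry when the command arrives. */
  void addCmdStatus(long cmdId, Type type) {
    statuses.put(cmdId, new Entry(cmdId, type));
  }

  /** Datanode: run a task and record its outcome; cmdId is null when no SCM command backs the task. */
  void runTask(Long cmdId, Runnable task) {
    Status outcome;
    try {
      task.run();
      outcome = Status.EXECUTED;
    } catch (RuntimeException e) {
      outcome = Status.FAILED; // assumption: failure surfaces as an exception
    }
    Entry entry = cmdId == null ? null : statuses.get(cmdId);
    if (entry != null) {
      entry.status = outcome;
    }
  }

  Iterable<Entry> report() {
    return statuses.values();
  }

  Status statusOf(long cmdId) {
    return statuses.get(cmdId).status;
  }

  /** SCM side: select the command ids that should trigger the failure event. */
  static List<Long> failedReplicationCmdIds(Iterable<Entry> report) {
    List<Long> failed = new ArrayList<>();
    for (Entry e : report) {
      boolean replication = e.type == Type.REPLICATE || e.type == Type.RECONSTRUCT_EC;
      if (replication && e.status == Status.FAILED) {
        failed.add(e.cmdId);
      }
    }
    return failed;
  }

  public static void main(String[] args) {
    CommandStatusSketch dn = new CommandStatusSketch();
    dn.addCmdStatus(1L, Type.REPLICATE);
    dn.addCmdStatus(2L, Type.RECONSTRUCT_EC);
    dn.runTask(1L, () -> { });
    dn.runTask(2L, () -> { throw new IllegalStateException("simulated failure"); });
    dn.runTask(null, () -> { }); // e.g. a reconcile task: no status is recorded
    System.out.println(failedReplicationCmdIds(dn.report())); // [2]
  }
}
```

Only failed replicate/reconstruct entries are forwarded; successful or unrelated commands leave the pending-op store untouched.
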
Compatibility degrades gracefully: an old datanode against a new SCM simply never
sends the failure report and falls back to the existing 12-minute timeout; a new
datanode against an old SCM has its replication status ignored as before.
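
To make the fallback concrete, here is a minimal sketch of a pending-op store with both removal paths: the new failure report and the existing timeout. `PendingAddOpsSketch`, `scheduleAdd`, and `removeExpired` are hypothetical names, and the data layout is an assumption made for illustration; only `onReplicationCommandFailed(cmdId)` and the `cmdId -> ContainerID` index are named in this description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy pending-op store: ops leave via a failure report or via deadline expiry. */
final class PendingAddOpsSketch {
  /** Simplified stand-in for one pending ADD op. */
  static final class AddOp {
    final long cmdId;
    final long sizeBytes;
    final long deadlineMillis;

    AddOp(long cmdId, long sizeBytes, long deadlineMillis) {
      this.cmdId = cmdId;
      this.sizeBytes = sizeBytes;
      this.deadlineMillis = deadlineMillis;
    }
  }

  private final Map<Long, List<AddOp>> opsByContainer = new HashMap<>();
  private final Map<Long, Long> containerByCmdId = new HashMap<>(); // cmdId -> container id
  private int inflightCount;
  private long scheduledBytes;

  void scheduleAdd(long containerId, long cmdId, long sizeBytes, long deadlineMillis) {
    opsByContainer.computeIfAbsent(containerId, k -> new ArrayList<>())
        .add(new AddOp(cmdId, sizeBytes, deadlineMillis));
    containerByCmdId.put(cmdId, containerId);
    inflightCount++;
    scheduledBytes += sizeBytes;
  }

  /** New path: a datanode reported FAILED. An unknown command id is a no-op. */
  boolean onReplicationCommandFailed(long cmdId) {
    Long containerId = containerByCmdId.get(cmdId);
    if (containerId == null) {
      return false;
    }
    return removeMatching(containerId, op -> op.cmdId == cmdId) > 0;
  }

  /** Existing fallback: reclaim ops whose deadline has passed. */
  int removeExpired(long nowMillis) {
    int removed = 0;
    // Iterate over a copy of the keys because removeMatching may drop empty containers.
    for (Long containerId : new ArrayList<>(opsByContainer.keySet())) {
      removed += removeMatching(containerId, op -> op.deadlineMillis <= nowMillis);
    }
    return removed;
  }

  private int removeMatching(long containerId, Predicate<AddOp> match) {
    List<AddOp> ops = opsByContainer.get(containerId);
    if (ops == null) {
      return 0;
    }
    int removed = 0;
    for (Iterator<AddOp> it = ops.iterator(); it.hasNext();) {
      AddOp op = it.next();
      if (match.test(op)) {
        it.remove();
        containerByCmdId.remove(op.cmdId);
        inflightCount--;
        scheduledBytes -= op.sizeBytes;
        removed++;
      }
    }
    if (ops.isEmpty()) {
      opsByContainer.remove(containerId);
    }
    return removed;
  }

  boolean hasPendingAdd(long containerId) { return opsByContainer.containsKey(containerId); }
  int inflightCount() { return inflightCount; }
  long scheduledBytes() { return scheduledBytes; }

  public static void main(String[] args) {
    long timeout = 12 * 60_000L;
    PendingAddOpsSketch ops = new PendingAddOpsSketch();
    ops.scheduleAdd(100L, 1L, 5_000L, timeout); // new datanode: will report the failure
    ops.scheduleAdd(200L, 2L, 7_000L, timeout); // old datanode: never reports
    ops.onReplicationCommandFailed(1L);
    System.out.println(ops.inflightCount() + " " + ops.hasPendingAdd(100L)); // 1 false
    ops.removeExpired(timeout);
    System.out.println(ops.inflightCount() + " " + ops.scheduledBytes());    // 0 0
  }
}
```

With an old datanode only `removeExpired` ever fires, which is the pre-existing behaviour; with a new datanode the op is released as soon as the failure report arrives.
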
Follow-ups (intentionally out of scope here):
- A `MiniOzoneCluster` decommission integration test that induces replication
  failures and asserts quota recovery.
- Reporting status on the `TaskRunner` early-return paths (deadline passed / not in
  service / stale term) so PENDING entries are reclaimed sooner; this matches the
  existing `deleteBlocksCommand` behaviour and is tracked separately.
## What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-15327
## How was this patch tested?
New and updated unit tests:
- `TestContainerReplicaPendingOps`: a failed command removes the matching ADD op and
  decrements the inflight counter; an unknown command id is a no-op.
- `TestCommandStatusReportHandler`: a FAILED replication status fires
  `REPLICATION_STATUS`.
- `TestStateContext`: replicate/reconstruct commands register a PENDING status.
- `TestReplicationSupervisor`: a finished task reports `EXECUTED` on success and
  `FAILED` on failure.
Local CI-aligned checks all pass: `checkstyle.sh`, `rat.sh`, `author.sh`.
Generated-by: Claude Code (Claude Opus 4.8)