Copilot commented on code in PR #10440:
URL: https://github.com/apache/ozone/pull/10440#discussion_r3420608141
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/MisReplicationHandler.java:
##########
@@ -157,13 +157,21 @@ public int processAndSendCommands(
.getTargetDatanodes(containerPlacement, requiredNodes,
excludedAndUsedNodes.getUsedNodes(),
excludedAndUsedNodes.getExcludedNodes(), currentContainerSize,
- container);
+ container, replicationManager.getNodeManager());
List<DatanodeDetails> availableSources = sources.stream()
.map(ContainerReplica::getDatanodeDetails)
.collect(Collectors.toList());
- int count = sendReplicateCommands(container, replicasToBeReplicated,
- availableSources, targetDatanodes);
+ int count;
+ try {
+ count = sendReplicateCommands(container, replicasToBeReplicated,
+ availableSources, targetDatanodes);
+ } catch (CommandTargetOverloadedException | NotLeaderException e) {
+ // No commands were dispatched; roll back all recorded
PendingContainerTracker slots.
+ ReplicationManagerUtil.rollbackTargets(
+ targetDatanodes, container, replicationManager.getNodeManager());
+ throw e;
+ }
Review Comment:
The rollback in this catch block assumes that no replication commands were
dispatched, but the current sendReplicateCommands implementations can throw
after sending some commands (eg RatisMisReplicationHandler loops over targets
and can throw mid-loop; ECMisReplicationHandler may send some commands and then
throw at the end if any target was overloaded). Rolling back *all* target
reservations here can under-account pending space for commands that were
successfully sent, causing the PendingContainerTracker to think there is more
free space than there really is.
A safer approach is to roll back only the targets for which a command was
not sent (eg catch per-target inside sendReplicateCommands and rollback just
the failing target(s), or change sendReplicateCommands to report which targets
were successfully dispatched so the caller can rollback the remainder).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]