ivandika3 commented on code in PR #3229:
URL: https://github.com/apache/ozone/pull/3229#discussion_r1901568117


##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockDataStreamOutput.java:
##########
@@ -406,12 +410,26 @@ private void executePutBlock(boolean close,
       byteBufferList = null;
     }
     waitFuturesComplete();
+    final BlockData blockData = containerBlockData.build();
     if (close) {
-      dataStreamCloseReply = out.closeAsync();
+      final ContainerCommandRequestProto putBlockRequest
+          = ContainerProtocolCalls.getPutBlockRequest(
+              xceiverClient.getPipeline(), blockData, true, token);
+      dataStreamCloseReply = executePutBlockClose(putBlockRequest,
+          PUT_BLOCK_REQUEST_LENGTH_MAX, out);
+      dataStreamCloseReply.whenComplete((reply, e) -> {
+        if (e != null || reply == null || !reply.isSuccess()) {
+          LOG.warn("Failed executePutBlockClose, reply=" + reply, e);
+          try {
+            executePutBlock(true, false);
+          } catch (IOException ex) {
+            throw new CompletionException(ex);
+          }
+        }
+      });

Review Comment:
   > two PutBlockRequest are created by getPutBlockRequest using the same 
parameters
   
   This is true. However, it seems the BCSID will be set on the DN side 
(`ContainerStateMachine`) based on the log index. This BCSID will be part of 
the PutBlock response.
   
   <img width="628" alt="image" 
src="https://github.com/user-attachments/assets/a9343ee6-29b3-4b10-bc9a-0e7f1beecd1a";
 />
   
   The possible issue is that it seems that `putBlockAsync` is the only one 
that will update the `BlockID`'s `blockCommitSequenceId` in the 
`BlockDataStreamOutput` that will be used as the `BlockID`'s 
`blockCommitSequenceId` in `OmKeyLocationInfo'. So if `putBlockAsync` is 
executed in the DN before the `executePutBlock`, the  `blockCommitSequenceId` 
in the committed key in OM will be smaller than the one in the DN replicas.
   
   ## Example
   
   ### Normal case 
   When `putBlockAsync` is processed in DN after the `executePutBlock`
   1.  `executePutBlock` (`ContainerStateMachine#link`) -> BCSID: 100 
   2. `putBlockAsync` (`ContainerStateMachine#applyTransaction`) -> BCSID: 101
   
   The BCSID in Datanode in 101 and committed blockCommitSequenceId in OM is 
101. When client calls `getBlock`, the block will be returned successfully 
since the blockCommitSequenceId = BCSID.
   
   ### Problematic case
   When `putBlockAsync` is executed in DN before the `executPutBlock`
   1.  `putBlockAsync` (`ContainerStateMachine#applyTransaction`) -> BCSID: 100
   2. `executePutBlock` (`ContainerStateMachine#link`) -> BCSID: 101
   
   The BCSID in Datanode is 101, but the committed blockCommitSequenceId in OM 
is 100. Therefore, there might be a difference between the 
blockCommitSequenceId in OM and DN. 
   
   However, I'm not sure if the reordering might happen in the first place.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org
For additional commands, e-mail: issues-h...@ozone.apache.org

Reply via email to