ivandika3 commented on code in PR #3229:
URL: https://github.com/apache/ozone/pull/3229#discussion_r1901587545
##########
hadoop-hdds/client/src/main/java/org/apache/hadoop/hdds/scm/storage/BlockDataStreamOutput.java:
##########
@@ -406,12 +410,26 @@ private void executePutBlock(boolean close,
byteBufferList = null;
}
waitFuturesComplete();
+ final BlockData blockData = containerBlockData.build();
if (close) {
- dataStreamCloseReply = out.closeAsync();
+ final ContainerCommandRequestProto putBlockRequest
+ = ContainerProtocolCalls.getPutBlockRequest(
+ xceiverClient.getPipeline(), blockData, true, token);
+ dataStreamCloseReply = executePutBlockClose(putBlockRequest,
+ PUT_BLOCK_REQUEST_LENGTH_MAX, out);
+ dataStreamCloseReply.whenComplete((reply, e) -> {
+ if (e != null || reply == null || !reply.isSuccess()) {
+ LOG.warn("Failed executePutBlockClose, reply=" + reply, e);
+ try {
+ executePutBlock(true, false);
+ } catch (IOException ex) {
+ throw new CompletionException(ex);
+ }
+ }
+ });
Review Comment:
Context: We encountered intermittent BCSID_MISMATCH and unexpected read size
errors during reads, where the `blockCommitSequenceId` recorded for the OM key
differs from that of the DN blocks. All 3 DN replicas hold exactly the same
data, so this is not a case of a single replica lagging behind because of a
slow DN. The block length recorded in OM also differs from the length on the
DNs.
What's odd is that earlier reads succeeded; the BCSID_MISMATCH errors only
started after a while. We're still not sure what triggered them.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]