[ https://issues.apache.org/jira/browse/HADOOP-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004214#comment-18004214 ]
ASF GitHub Bot commented on HADOOP-19604: ----------------------------------------- bhattmanish98 commented on code in PR #7777: URL: https://github.com/apache/hadoop/pull/7777#discussion_r2195482892 ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobBlock.java: ########## @@ -42,23 +48,40 @@ public class AbfsBlobBlock extends AbfsBlock { * @param offset Used to generate blockId based on offset. * @throws IOException exception is thrown. */ - AbfsBlobBlock(AbfsOutputStream outputStream, long offset) throws IOException { + AbfsBlobBlock(AbfsOutputStream outputStream, long offset, int blockIdLength, long blockIndex) throws IOException { super(outputStream, offset); - this.blockId = generateBlockId(offset); + this.blockIndex = blockIndex; + String streamId = outputStream.getStreamID(); + UUID streamIdGuid = UUID.nameUUIDFromBytes(streamId.getBytes(StandardCharsets.UTF_8)); Review Comment: Can streamId be null? streamId.getBytes can raise null pointer exception. Better to handle it, ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java: ########## @@ -982,6 +983,9 @@ public AbfsRestOperation appendBlock(final String path, if (requestParameters.getLeaseId() != null) { requestHeaders.add(new AbfsHttpHeader(X_MS_LEASE_ID, requestParameters.getLeaseId())); } + if (isChecksumValidationEnabled()) { + addCheckSumHeaderForWrite(requestHeaders, requestParameters); Review Comment: NIT: formatting required - there must be one tab in the start ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java: ########## @@ -1032,7 +1036,7 @@ public AbfsRestOperation flush(final String path, final String cachedSasToken, final String leaseId, final ContextEncryptionAdapter contextEncryptionAdapter, - final TracingContext tracingContext) throws AzureBlobFileSystemException { + final TracingContext tracingContext, String blobMd5) throws AzureBlobFileSystemException { Review Comment: Add new argument in the comments @param. Please make this change wherever required. ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java: ########## @@ -878,27 +878,29 @@ public boolean appendSuccessCheckOp(AbfsRestOperation op, final String path, * @param leaseId if there is an active lease on the path. * @param contextEncryptionAdapter to provide encryption context. * @param tracingContext for tracing the server calls. + * @param blobMd5 The Base64-encoded MD5 hash of the blob for data integrity validation. * @return executed rest operation containing response from server. * @throws AzureBlobFileSystemException if rest operation fails. */ public abstract AbfsRestOperation flush(String path, long position, boolean retainUncommittedData, boolean isClose, String cachedSasToken, String leaseId, - ContextEncryptionAdapter contextEncryptionAdapter, TracingContext tracingContext) + ContextEncryptionAdapter contextEncryptionAdapter, TracingContext tracingContext, String blobMd5) throws AzureBlobFileSystemException; /** - * Flush previously uploaded data to a file. - * @param buffer containing blockIds to be flushed. - * @param path on which data has to be flushed. - * @param isClose specify if this is the last flush to the file. - * @param cachedSasToken to be used for the authenticating operation. - * @param leaseId if there is an active lease on the path. - * @param eTag to specify conditional headers. - * @param contextEncryptionAdapter to provide encryption context. - * @param tracingContext for tracing the server calls. - * @return executed rest operation containing response from server. - * @throws AzureBlobFileSystemException if rest operation fails. + * Flushes previously uploaded data to the specified path. Review Comment: NIT - Format can be consistent across places. ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/contracts/services/AppendRequestParameters.java: ########## @@ -77,6 +81,7 @@ public AppendRequestParameters(final long position, * @param leaseId leaseId of the blob to be appended * @param isExpectHeaderEnabled true if the expect header is enabled * @param blobParams parameters specific to append operation on Blob Endpoint. + * @param md5 The Base64-encoded MD5 hash of the block for data integrity validation. Review Comment: NIT: Format can be connected - extra space before @param and after md5 ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobBlock.java: ########## @@ -42,23 +42,40 @@ public class AbfsBlobBlock extends AbfsBlock { * @param offset Used to generate blockId based on offset. * @throws IOException exception is thrown. */ - AbfsBlobBlock(AbfsOutputStream outputStream, long offset) throws IOException { + AbfsBlobBlock(AbfsOutputStream outputStream, long offset, int blockIdLength, long blockIndex) throws IOException { super(outputStream, offset); - this.blockId = generateBlockId(offset); + this.blockIndex = blockIndex; + String streamId = getOutputStream().getStreamID(); + UUID streamIdGuid = UUID.nameUUIDFromBytes(streamId.getBytes(StandardCharsets.UTF_8)); + this.blockId = generateBlockId(streamIdGuid, blockIdLength); } /** - * Helper method that generates blockId. - * @param position The offset needed to generate blockId. - * @return String representing the block ID generated. + * Generates a Base64-encoded block ID string based on the given position, stream ID, and desired raw length. Review Comment: In the comment, you have mentioned that block id is generated based on position, stream Id and raw length. Where exactly are we using position in this method? Am I missing something here? ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java: ########## @@ -1060,7 +1064,7 @@ public AbfsRestOperation flush(byte[] buffer, final String leaseId, final String eTag, ContextEncryptionAdapter contextEncryptionAdapter, - final TracingContext tracingContext) throws AzureBlobFileSystemException { + final TracingContext tracingContext, String blobMd5) throws AzureBlobFileSystemException { Review Comment: Same as above. ########## hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobBlock.java: ########## @@ -42,23 +48,40 @@ public class AbfsBlobBlock extends AbfsBlock { * @param offset Used to generate blockId based on offset. Review Comment: Add newly added parameter in method comment. > ABFS: Fix WASB ABFS compatibility issues > ---------------------------------------- > > Key: HADOOP-19604 > URL: https://issues.apache.org/jira/browse/HADOOP-19604 > Project: Hadoop Common > Issue Type: Sub-task > Affects Versions: 3.4.1 > Reporter: Anmol Asrani > Assignee: Anmol Asrani > Priority: Major > Labels: pull-request-available > Fix For: 3.4.1 > > > Fix WASB ABFS compatibility issues. Fix issues such as:- > # BlockId computation to be consistent across clients for PutBlock and > PutBlockList > # Restrict url encoding of certain json metadata during setXAttr calls. > # Maintain the md5 hash of whole block to validate data integrity during > flush. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org