[ 
https://issues.apache.org/jira/browse/HADOOP-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004214#comment-18004214
 ] 

ASF GitHub Bot commented on HADOOP-19604:
-----------------------------------------

bhattmanish98 commented on code in PR #7777:
URL: https://github.com/apache/hadoop/pull/7777#discussion_r2195482892


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobBlock.java:
##########
@@ -42,23 +48,40 @@ public class AbfsBlobBlock extends AbfsBlock {
    * @param offset       Used to generate blockId based on offset.
    * @throws IOException exception is thrown.
    */
-  AbfsBlobBlock(AbfsOutputStream outputStream, long offset) throws IOException 
{
+  AbfsBlobBlock(AbfsOutputStream outputStream, long offset, int blockIdLength, 
long blockIndex) throws IOException {
     super(outputStream, offset);
-    this.blockId = generateBlockId(offset);
+    this.blockIndex = blockIndex;
+    String streamId = outputStream.getStreamID();
+    UUID streamIdGuid = 
UUID.nameUUIDFromBytes(streamId.getBytes(StandardCharsets.UTF_8));

Review Comment:
   Can streamId be null? streamId.getBytes can raise null pointer exception. 
Better to handle it,



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java:
##########
@@ -982,6 +983,9 @@ public AbfsRestOperation appendBlock(final String path,
     if (requestParameters.getLeaseId() != null) {
       requestHeaders.add(new AbfsHttpHeader(X_MS_LEASE_ID, 
requestParameters.getLeaseId()));
     }
+    if (isChecksumValidationEnabled()) {
+    addCheckSumHeaderForWrite(requestHeaders, requestParameters);

Review Comment:
   NIT: formatting required - there must be one tab in the start



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java:
##########
@@ -1032,7 +1036,7 @@ public AbfsRestOperation flush(final String path,
       final String cachedSasToken,
       final String leaseId,
       final ContextEncryptionAdapter contextEncryptionAdapter,
-      final TracingContext tracingContext) throws AzureBlobFileSystemException 
{
+      final TracingContext tracingContext, String blobMd5) throws 
AzureBlobFileSystemException {

Review Comment:
   Add new argument in the comments @param. Please make this change wherever 
required.



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java:
##########
@@ -878,27 +878,29 @@ public boolean appendSuccessCheckOp(AbfsRestOperation op, 
final String path,
    * @param leaseId if there is an active lease on the path.
    * @param contextEncryptionAdapter to provide encryption context.
    * @param tracingContext for tracing the server calls.
+   * @param blobMd5  The Base64-encoded MD5 hash of the blob for data 
integrity validation.
    * @return executed rest operation containing response from server.
    * @throws AzureBlobFileSystemException if rest operation fails.
    */
   public abstract AbfsRestOperation flush(String path, long position,
       boolean retainUncommittedData, boolean isClose,
       String cachedSasToken, String leaseId,
-      ContextEncryptionAdapter contextEncryptionAdapter, TracingContext 
tracingContext)
+      ContextEncryptionAdapter contextEncryptionAdapter, TracingContext 
tracingContext, String blobMd5)
       throws AzureBlobFileSystemException;
 
   /**
-   * Flush previously uploaded data to a file.
-   * @param buffer containing blockIds to be flushed.
-   * @param path on which data has to be flushed.
-   * @param isClose specify if this is the last flush to the file.
-   * @param cachedSasToken to be used for the authenticating operation.
-   * @param leaseId if there is an active lease on the path.
-   * @param eTag to specify conditional headers.
-   * @param contextEncryptionAdapter to provide encryption context.
-   * @param tracingContext for tracing the server calls.
-   * @return executed rest operation containing response from server.
-   * @throws AzureBlobFileSystemException if rest operation fails.
+   * Flushes previously uploaded data to the specified path.

Review Comment:
   NIT - Format can be consistent across places.



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/contracts/services/AppendRequestParameters.java:
##########
@@ -77,6 +81,7 @@ public AppendRequestParameters(final long position,
    * @param leaseId leaseId of the blob to be appended
    * @param isExpectHeaderEnabled true if the expect header is enabled
    * @param blobParams parameters specific to append operation on Blob 
Endpoint.
+   *  @param md5  The Base64-encoded MD5 hash of the block for data integrity 
validation.

Review Comment:
   NIT: Format can be connected - extra space before @param and after md5



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobBlock.java:
##########
@@ -42,23 +42,40 @@ public class AbfsBlobBlock extends AbfsBlock {
    * @param offset       Used to generate blockId based on offset.
    * @throws IOException exception is thrown.
    */
-  AbfsBlobBlock(AbfsOutputStream outputStream, long offset) throws IOException 
{
+  AbfsBlobBlock(AbfsOutputStream outputStream, long offset, int blockIdLength, 
long blockIndex) throws IOException {
     super(outputStream, offset);
-    this.blockId = generateBlockId(offset);
+    this.blockIndex = blockIndex;
+    String streamId = getOutputStream().getStreamID();
+    UUID streamIdGuid = 
UUID.nameUUIDFromBytes(streamId.getBytes(StandardCharsets.UTF_8));
+    this.blockId = generateBlockId(streamIdGuid, blockIdLength);
   }
 
   /**
-   * Helper method that generates blockId.
-   * @param position The offset needed to generate blockId.
-   * @return String representing the block ID generated.
+   * Generates a Base64-encoded block ID string based on the given position, 
stream ID, and desired raw length.

Review Comment:
   In the comment, you have mentioned that block id is generated based on 
position, stream Id and raw length. Where exactly are we using position in this 
method? Am I missing something here?



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobClient.java:
##########
@@ -1060,7 +1064,7 @@ public AbfsRestOperation flush(byte[] buffer,
       final String leaseId,
       final String eTag,
       ContextEncryptionAdapter contextEncryptionAdapter,
-      final TracingContext tracingContext) throws AzureBlobFileSystemException 
{
+      final TracingContext tracingContext, String blobMd5) throws 
AzureBlobFileSystemException {

Review Comment:
   Same as above.



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsBlobBlock.java:
##########
@@ -42,23 +48,40 @@ public class AbfsBlobBlock extends AbfsBlock {
    * @param offset       Used to generate blockId based on offset.

Review Comment:
   Add newly added parameter in method comment.





> ABFS: Fix WASB ABFS compatibility issues
> ----------------------------------------
>
>                 Key: HADOOP-19604
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19604
>             Project: Hadoop Common
>          Issue Type: Sub-task
>    Affects Versions: 3.4.1
>            Reporter: Anmol Asrani
>            Assignee: Anmol Asrani
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.1
>
>
> Fix WASB ABFS compatibility issues. Fix issues such as:-
>  # BlockId computation to be consistent across clients for PutBlock and 
> PutBlockList
>  # Restrict url encoding of certain json metadata during setXAttr calls.
>  # Maintain the md5 hash of whole block to validate data integrity during 
> flush.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to