[ https://issues.apache.org/jira/browse/HADOOP-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17992438#comment-17992438 ]
ASF GitHub Bot commented on HADOOP-19604: ----------------------------------------- anmolanmol1234 opened a new pull request, #7777: URL: https://github.com/apache/hadoop/pull/7777 Jira :- https://issues.apache.org/jira/browse/HADOOP-19604 1. BlockId computation to be consistent across clients for PutBlock and PutBlockList so made use of blockCount instead of offset. Block IDs were previously derived from the data offset, which could lead to inconsistency across different clients. The change now uses blockCount (i.e., the index of the block) to compute the Block ID, ensuring deterministic and consistent ID generation for both PutBlock and PutBlockList operations across clients. 2. Restrict URL encoding of certain JSON metadata during setXAttr calls. When setting extended attributes (xAttrs), the JSON metadata (hdi_permission) was previously URL-encoded, which could cause unnecessary escaping or compatibility issues. This change ensures that only required metadata are encoded. 3. Maintain the MD5 hash of the whole block to validate data integrity during flush. During flush operations, the MD5 hash of the entire block's data is computed and stored. This hash is later used to validate that the block correctly persisted, ensuring data integrity and helping detect corruption or transmission errors. > ABFS: Fix WASB ABFS compatibility issues > ---------------------------------------- > > Key: HADOOP-19604 > URL: https://issues.apache.org/jira/browse/HADOOP-19604 > Project: Hadoop Common > Issue Type: Sub-task > Affects Versions: 3.4.1 > Reporter: Anmol Asrani > Assignee: Anmol Asrani > Priority: Major > Fix For: 3.4.1 > > > Fix WASB ABFS compatibility issues. Fix issues such as:- > # BlockId computation to be consistent across clients for PutBlock and > PutBlockList > # Restrict url encoding of certain json metadata during setXAttr calls. > # Maintain the md5 hash of whole block to validate data integrity during > flush. -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org