[ https://issues.apache.org/jira/browse/HADOOP-18012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703659#comment-17703659 ]
ASF GitHub Bot commented on HADOOP-18012:
-----------------------------------------

steveloughran commented on code in PR #5488:
URL: https://github.com/apache/hadoop/pull/5488#discussion_r1144771186


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java:
##########
@@ -519,11 +531,38 @@ public AbfsClientRenameResult renamePath(
       final String destination,
       final String continuation,
       final TracingContext tracingContext,
-      final String sourceEtag,
+      String sourceEtag,
       boolean isMetadataIncompleteState)

Review Comment:
   as proposed, add a new `boolean isNamespaceEnabled` parameter



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java:
##########
@@ -519,11 +531,38 @@ public AbfsClientRenameResult renamePath(
       final String destination,
       final String continuation,
       final TracingContext tracingContext,
-      final String sourceEtag,
+      String sourceEtag,
       boolean isMetadataIncompleteState)
       throws AzureBlobFileSystemException {
     final List<AbfsHttpHeader> requestHeaders = createDefaultHeaders();
+    // etag passed in, so source is a file
+    final boolean hasEtag = !isEmpty(sourceEtag);
+    boolean isDir = !hasEtag;
+    if (!hasEtag && renameResilience) {

Review Comment:
   and add `&& isNamespaceEnabled` to the condition



##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AzureBlobFileSystem.java:
##########
@@ -441,11 +441,19 @@ public boolean rename(final Path src, final Path dst) throws IOException {
         return dstFileStatus.isDirectory() ? false : true;
       }
+      boolean isNamespaceEnabled = abfsStore.getIsNamespaceEnabled(tracingContext);
+
       // Non-HNS account need to check dst status on driver side.
-      if (!abfsStore.getIsNamespaceEnabled(tracingContext) && dstFileStatus == null) {
+      if (!isNamespaceEnabled && dstFileStatus == null) {
         dstFileStatus = tryGetFileStatus(qualifiedDstPath, tracingContext);
       }
+      // for Non-HNS accounts, rename resiliency cannot be maintained
+      // as eTags are not preserved in rename

Review Comment:
   don't do it this way. `AzureBlobFileSystemStore.getIsNamespaceEnabled()` provides the information, so add a new `isNamespaceEnabled` parameter to `AbfsClient.renamePath()` and use that in the decision making.



> ABFS: Enable config controlled ETag check for Rename idempotency
> ----------------------------------------------------------------
>
>                 Key: HADOOP-18012
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18012
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.2
>            Reporter: Sneha Vijayarajan
>            Assignee: Sree Bhattacharyya
>            Priority: Major
>              Labels: pull-request-available
>
> The ABFS driver handles rename idempotency by relying on the LMT (last
> modified time) of the destination file to decide whether the rename
> succeeded when the source file is absent and the rename request had entered
> the retry loop.
> This handling is incorrect, as the LMT of the destination does not change on
> rename.
> This Jira will track the change to undo the current implementation and add a
> new one where, for an incoming rename operation, the source file eTag is
> fetched first and then the rename is done only if the eTag matches for the
> source file.
> As this is going to be a costly operation, given that an extra HEAD request
> is added to each rename, this implementation will be guarded by a config and
> can be enabled by customers who have workloads that do multiple renames.
> A long-term plan to handle rename idempotency without a HEAD request is
> being discussed.
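
To make the discussion above concrete, here is a minimal, self-contained sketch of one way the eTag-based recovery could be combined with the reviewer's `isNamespaceEnabled` suggestion. It is not the code from the PR. `RenameRecoverySketch`, `FakeStore`, `headEtag`, `renameWithRecovery` and the `simulateLostResponse` flag are all hypothetical names invented for illustration, and the in-memory map merely stands in for the remote store. The sketch assumes that a rename on a namespace-enabled (HNS) account keeps the file's eTag, which is what the diff comment implies, and it ignores directories entirely.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a hypothetical stand-in, not the ABFS client API. */
final class RenameRecoverySketch {

  /** In-memory path -> eTag map. ASSUMPTION: a rename preserves the eTag. */
  static final class FakeStore {
    private final Map<String, String> etags = new HashMap<>();
    int headCalls = 0;

    void put(String path, String etag) {
      etags.put(path, etag);
    }

    /** Simulates a HEAD request; returns null when the path is absent. */
    String headEtag(String path) {
      headCalls++;
      return etags.get(path);
    }

    /** Simulates the rename call; returns false when the source is missing. */
    boolean rename(String src, String dst) {
      String etag = etags.remove(src);
      if (etag == null) {
        return false;
      }
      etags.put(dst, etag);
      return true;
    }
  }

  /**
   * Renames src to dst. If the rename reports "source not found", the
   * destination eTag is compared with the source eTag captured beforehand.
   * simulateLostResponse models a first attempt that succeeded on the server
   * while its response never reached the client, so a retry follows.
   */
  static boolean renameWithRecovery(FakeStore store, String src, String dst,
      String sourceEtag, boolean renameResilience, boolean isNamespaceEnabled,
      boolean simulateLostResponse) {
    boolean hasEtag = sourceEtag != null && !sourceEtag.isEmpty();
    // Pay for the extra HEAD only when no eTag was supplied, the feature is
    // switched on, and the account is namespace-enabled.
    if (!hasEtag && renameResilience && isNamespaceEnabled) {
      sourceEtag = store.headEtag(src);
      hasEtag = sourceEtag != null;
    }
    if (simulateLostResponse) {
      store.rename(src, dst); // applied remotely, result discarded
    }
    if (store.rename(src, dst)) {
      return true;
    }
    // Source is gone. Without a trusted eTag there is nothing to compare.
    if (!hasEtag || !renameResilience || !isNamespaceEnabled) {
      return false;
    }
    return sourceEtag.equals(store.headEtag(dst));
  }

  public static void main(String[] args) {
    FakeStore store = new FakeStore();
    store.put("/src", "etag-1");
    boolean ok = renameWithRecovery(store, "/src", "/dst", null, true, true, true);
    System.out.println("recovered=" + ok);
  }
}
```

Passing `isNamespaceEnabled` in as a plain boolean keeps the decision in one place: on a non-HNS account the sketch neither issues the extra HEAD nor attempts recovery, so the failed retry is reported as a failure.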
--
This message was sent by Atlassian Jira
(v8.20.10#820010)