[ https://issues.apache.org/jira/browse/HADOOP-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810324#comment-17810324 ]
ASF GitHub Bot commented on HADOOP-18656:
-----------------------------------------

anujmodi2021 commented on code in PR #6409:
URL: https://github.com/apache/hadoop/pull/6409#discussion_r1464713380


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java:
##########
@@ -1053,12 +1053,24 @@ public AbfsRestOperation read(final String path,
     return op;
   }
-  public AbfsRestOperation deletePath(final String path, final boolean recursive, final String continuation,
+  public AbfsRestOperation deletePath(final String path, final boolean recursive,
+                                      final String continuation,
                                       TracingContext tracingContext)
       throws AzureBlobFileSystemException {
     final List<AbfsHttpHeader> requestHeaders = createDefaultHeaders();
-    final AbfsUriQueryBuilder abfsUriQueryBuilder = createDefaultUriQueryBuilder();
+
+    if (abfsConfiguration.isPaginatedDeleteEnabled() && recursive) {
+      // Change the x-ms-version to "2023-08-03" if its less than that.
+      if (xMsVersion.compareTo(AUGUST_2023_API_VERSION) < 0) {

Review Comment:
   Nice suggestion... Added a new enum class to define all the versions currently in use.


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClient.java:
##########
@@ -1053,12 +1053,24 @@ public AbfsRestOperation read(final String path,
     return op;
   }
-  public AbfsRestOperation deletePath(final String path, final boolean recursive, final String continuation,
+  public AbfsRestOperation deletePath(final String path, final boolean recursive,
+                                      final String continuation,
                                       TracingContext tracingContext)
       throws AzureBlobFileSystemException {
     final List<AbfsHttpHeader> requestHeaders = createDefaultHeaders();
-    final AbfsUriQueryBuilder abfsUriQueryBuilder = createDefaultUriQueryBuilder();
+
+    if (abfsConfiguration.isPaginatedDeleteEnabled() && recursive) {
+      // Change the x-ms-version to "2023-08-03" if its less than that.
+      if (xMsVersion.compareTo(AUGUST_2023_API_VERSION) < 0) {
+        requestHeaders.removeIf(header -> header.getName().equalsIgnoreCase(X_MS_VERSION));

Review Comment:
   Taken


> ABFS: Support for Pagination in Recursive Directory Delete
> -----------------------------------------------------------
>
>                 Key: HADOOP-18656
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18656
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.5
>            Reporter: Sree Bhattacharyya
>            Assignee: Anuj Modi
>            Priority: Minor
>              Labels: pull-request-available
>
> Today, when a recursive delete is issued for a large directory in an ADLS
> Gen2 (HNS) account, the directory deletion itself happens in O(1), but the
> backend performs ACL checks recursively for each object inside that
> directory, which for a large directory can lead to a request timeout.
> Pagination is introduced in the Azure Storage backend for these ACL checks.
> More information on how pagination works can be found in the public
> documentation of the [Azure Delete Path
> API|https://learn.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/delete?view=rest-storageservices-datalakestoragegen2-2019-12-12].
> This PR contains changes to support this from the client side. To trigger
> pagination, the client needs to add a new query parameter "paginated" and
> set it to true, along with recursive set to true. In return, if the
> directory is large, the server might return a continuation token to the
> caller. If the caller gets back a continuation token, it has to call the
> delete API again with that continuation token, along with recursive and
> paginated set to true. This is similar to directory delete on an FNS
> account.
> Pagination is available only from version "2023-08-03" onwards.
> The PR also contains functional tests to verify that the driver works well
> with different combinations of the recursive and pagination features for
> both HNS and FNS accounts.
> Full E2E testing of pagination requires a large dataset to be created and is
> therefore not part of the driver test suite. However, extensive E2E testing
> has been performed.
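
The client-side flow described in the issue (send recursive and paginated set to true, then repeat the call with each continuation token the server returns) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from PR #6409: `DeleteEndpoint`, `PaginatedDeleteSketch`, `effectiveVersion` and `deleteRecursively` are hypothetical names that do not exist in hadoop-azure, and the endpoint is modelled as returning null once no continuation token is left. The version check mirrors the `compareTo` call in the quoted diff and relies on the date-formatted version strings sorting lexicographically.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for the Delete Path endpoint; not a hadoop-azure type. */
interface DeleteEndpoint {
  /** Returns a continuation token, or null once the delete has completed. */
  String delete(String path, Map<String, String> query);
}

final class PaginatedDeleteSketch {
  // Version strings are formatted yyyy-MM-dd, so String.compareTo orders them by date.
  static final String AUGUST_2023_API_VERSION = "2023-08-03";

  private PaginatedDeleteSketch() {
  }

  /** Raises the request version to 2023-08-03 only for paginated recursive deletes. */
  static String effectiveVersion(String configured, boolean paginated, boolean recursive) {
    if (paginated && recursive
        && configured.compareTo(AUGUST_2023_API_VERSION) < 0) {
      return AUGUST_2023_API_VERSION;
    }
    return configured;
  }

  /** Repeats the delete call until no continuation token comes back; returns the call count. */
  static int deleteRecursively(DeleteEndpoint endpoint, String path) {
    int calls = 0;
    String continuation = null;
    do {
      Map<String, String> query = new LinkedHashMap<>();
      query.put("recursive", "true");
      query.put("paginated", "true");
      if (continuation != null) {
        // Pass back the token from the previous response.
        query.put("continuation", continuation);
      }
      continuation = endpoint.delete(path, query);
      calls++;
    } while (continuation != null && !continuation.isEmpty());
    return calls;
  }
}
```

A caller would supply a `DeleteEndpoint` that issues the real HTTP request. The loop itself only depends on whether a token came back, which is why the same shape works for a small directory (one call) and a large one (several calls).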