[
https://issues.apache.org/jira/browse/HDDS-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Siyao Meng reassigned HDDS-14600:
---------------------------------
Assignee: Siyao Meng
> [FSO] Non-recursive dir deletion fails with "Directory is not empty" despite
> all children deleted when double buffer is not flushed
> -----------------------------------------------------------------------------------------------------------------------------------
>
> Key: HDDS-14600
> URL: https://issues.apache.org/jira/browse/HDDS-14600
> Project: Apache Ozone
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Siyao Meng
> Assignee: Siyao Meng
> Priority: Major
> Labels: pull-request-available
>
> When using File System Optimized (FSO) buckets (with Ratis enabled),
> non-recursive directory deletion can fail with a "Directory is not empty" error
> (DIRECTORY_NOT_EMPTY in OM logs) even after all child directories have been
> successfully deleted. This occurs when the double buffer has not yet flushed
> the child delete transactions to the DB.
> The issue was discovered when using S3A with Hive CTAS operations, but it
> affects any client using the filesystem APIs.
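> To make the timing concrete, here is a minimal, self-contained model of the
> race (illustration only: none of this is Ozone code, and all names are made
> up). The root cause analysis below walks through the real check inside OM:
> {code:java|title=Conceptual model of the race (illustration only; not Ozone code)}
> import java.util.HashMap;
> import java.util.Map;
> import java.util.Optional;
>
> // A toy model of the OM double buffer: writes land in an in-memory cache
> // immediately, while a background flush applies them to the DB later.
> public class DoubleBufferModel {
>   // Stands in for the RocksDB directory table (flushed state only).
>   static final Map<String, String> db = new HashMap<>();
>   // Stands in for the OM table cache; Optional.empty() marks a delete tombstone.
>   static final Map<String, Optional<String>> cache = new HashMap<>();
>
>   static void delete(String key) {
>     cache.put(key, Optional.empty()); // tombstone is visible in cache at once;
>                                       // the DB row is removed only on flush
>   }
>
>   // The pre-fix style of check: look for live children in the cache, then fall
>   // back to the DB, where an un-flushed delete still looks like a live row.
>   static boolean hasChildren(String prefix) {
>     boolean liveInCache = cache.entrySet().stream().anyMatch(
>         e -> e.getKey().startsWith(prefix) && e.getValue().isPresent());
>     return liveInCache
>         || db.keySet().stream().anyMatch(k -> k.startsWith(prefix));
>   }
>
>   public static void main(String[] args) {
>     db.put("/parent/child", "dirInfo");  // child exists and has been flushed
>     delete("/parent/child");             // txn N: delete child (cache only)
>     // txn N+1 arrives before the double buffer flushes txn N:
>     System.out.println(hasChildren("/parent/")); // true -> DIRECTORY_NOT_EMPTY
>   }
> }
> {code}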
> h2. Symptom
> The issue was initially found when using an FSO bucket via the S3A connector
> from Hive, while creating an external table:
> {code:sql|title=SQL}
> create external table ctas_src_ext_part partitioned by (age)
> as select name, age from default.studenttab10k;
> {code}
> {code:java|title=Hive client log}
> ERROR : Job Commit failed with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(org.apache.hadoop.fs.s3a.AWSS3IOException: Remove S3 Dir Markers on s3a://s3hive/iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002: org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteException: [S3Error(Key=iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/, Code=InternalError, Message=Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002)]: InternalError: iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/: Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002: [S3Error(Key=iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/, Code=InternalError, Message=Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002)])'
> org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.fs.s3a.AWSS3IOException: Remove S3 Dir Markers on s3a://s3hive/iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002: org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteException: [S3Error(Key=iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/, Code=InternalError, Message=Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002)]: InternalError: iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/: Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002: [S3Error(Key=iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/, Code=InternalError, Message=Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002)]
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.jobCloseOp(FileSinkOperator.java:1629)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:797)
> at org.apache.hadoop.hive.ql.exec.Operator.jobClose(Operator.java:802)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.close(TezTask.java:734)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:394)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105)
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:356)
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:329)
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246)
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:107)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:804)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:539)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:533)
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:190)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:235)
> at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:91)
> at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:342)
> at java.base/java.security.AccessController.doPrivileged(Native Method)
> at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1910)
> at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:362)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
> at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: org.apache.hadoop.fs.s3a.AWSS3IOException: Remove S3 Dir Markers on s3a://s3hive/iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002: org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteException: [S3Error(Key=iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/, Code=InternalError, Message=Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002)]: InternalError: iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/: Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002: [S3Error(Key=iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/, Code=InternalError, Message=Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002)]
> at org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteException.translateException(MultiObjectDeleteException.java:101)
> at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:347)
> at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124)
> at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:163)
> at org.apache.hadoop.fs.s3a.impl.DeleteOperation.asyncDeleteAction(DeleteOperation.java:443)
> at org.apache.hadoop.fs.s3a.impl.DeleteOperation.lambda$submitDelete$2(DeleteOperation.java:401)
> at org.apache.hadoop.fs.store.audit.AuditingFunctions.lambda$callableWithinAuditSpan$3(AuditingFunctions.java:117)
> at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:88)
> at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
> at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:226)
> at org.apache.hadoop.util.SemaphoredDelegatingExecutor$RunnableWithPermitRelease.run(SemaphoredDelegatingExecutor.java:226)
> ... 3 more
> Caused by: org.apache.hadoop.fs.s3a.impl.MultiObjectDeleteException: [S3Error(Key=iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002/, Code=InternalError, Message=Directory is not empty. Key:iceberg_test_db_hive2/.hive-staging_hive_2026-02-09_19-42-25_275_4928242460862214784-9/_task_tmp.-ext-10002)]
> at org.apache.hadoop.fs.s3a.S3AFileSystem.deleteObjects(S3AFileSystem.java:3233)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeysS3(S3AFileSystem.java:3460)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeys(S3AFileSystem.java:3530)
> at org.apache.hadoop.fs.s3a.S3AFileSystem$OperationCallbacksImpl.removeKeys(S3AFileSystem.java:2604)
> at org.apache.hadoop.fs.s3a.impl.DeleteOperation.lambda$asyncDeleteAction$8(DeleteOperation.java:445)
> at org.apache.hadoop.fs.s3a.Invoker.lambda$once$0(Invoker.java:165)
> at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
> ... 11 more
> {code}
> h2. Root cause analysis
> Hive initiates the call to {{fs.delete()}} on the directory with recursive set
> to true, which goes into S3A:
> {code:java|title=https://github.com/apache/hive/blob/98da62c93f198126c78d3352bf3ac6aeacefa53c/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L1535-L1536}
> Utilities.FILE_OP_LOGGER.trace("deleting taskTmpPath {}", taskTmpPath);
> fs.delete(taskTmpPath, true);
> {code}
> With recursive=true, S3A lists and deletes all keys under the directory:
> https://github.com/apache/hadoop/blob/e221231e81cdc4bc34ad69ca7d606a3efa8f79d1/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/DeleteOperation.java#L228
> (Thus, even though the Ozone S3 Gateway sets recursive=false when calling into
> OM, that is not where the problem arises:
> https://github.com/apache/ozone/blob/9f9773df69302b0a8ee92a2c59515c37c6ac205d/hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java#L720
> )
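> So each object key in the batch, including the directory marker itself,
> reaches OM as its own non-recursive delete. For illustration, the multi-object
> delete S3A sends looks roughly like the following (AWS SDK v2 sketch; the
> bucket name comes from the log above, while the keys and client setup are made
> up):
> {code:java|title=Rough shape of the S3A multi-object delete (AWS SDK v2; keys made up)}
> import java.util.List;
>
> import software.amazon.awssdk.services.s3.S3Client;
> import software.amazon.awssdk.services.s3.model.Delete;
> import software.amazon.awssdk.services.s3.model.DeleteObjectsRequest;
> import software.amazon.awssdk.services.s3.model.ObjectIdentifier;
> import software.amazon.awssdk.services.s3.model.S3Error;
>
> public class MultiDeleteSketch {
>   public static void main(String[] args) {
>     // In practice S3A builds the client against the Ozone S3 Gateway endpoint.
>     try (S3Client s3 = S3Client.create()) {
>       DeleteObjectsRequest request = DeleteObjectsRequest.builder()
>           .bucket("s3hive")
>           .delete(Delete.builder().objects(
>               ObjectIdentifier.builder().key("dir1/file1").build(), // child key
>               ObjectIdentifier.builder().key("dir1/").build())      // dir marker
>               .build())
>           .build();
>       // The gateway turns each key into an individual, non-recursive OM delete.
>       List<S3Error> errors = s3.deleteObjects(request).errors();
>       // Per-key failures surface here; this is where the S3Error entries
>       // (Code=InternalError, Message=Directory is not empty. ...) in the
>       // Hive log above come from.
>       errors.forEach(e -> System.out.println(e.code() + ": " + e.message()));
>     }
>   }
> }
> {code}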
> After adding debug messages and attempting to reproduce the issue locally, it
> turned out that {{OMFileRequest#checkSubDirectoryExists}} would still find the
> already-deleted child dir:
> {code}
> 2026-02-09 15:28:30,989 [main] DEBUG key.OMKeyDeleteRequestWithFSO (OMKeyDeleteRequestWithFSO.java:validateAndUpdateCache(168)) - DELETE_KEY: Recursive delete requested for directory. Key: parent/child, ObjectID: 1770679710931, Transaction: 1000
> 2026-02-09 15:28:30,991 [main] DEBUG key.OMKeyDeleteRequestWithFSO (OMKeyDeleteRequestWithFSO.java:validateAndUpdateCache(231)) - Key deleted. Volume:084ac152-cbf9-4571-a8dd-b6b40b1e7a8e, Bucket:1f5b5722-6fc3-4f6d-aef3-c0a9b500eabc, Key:parent/child
> 2026-02-09 15:28:30,991 [main] DEBUG key.OMKeyDeleteRequestWithFSO (OMKeyDeleteRequestWithFSO.java:validateAndUpdateCache(147)) - DELETE_KEY: Checking if directory is empty. Key: parent, ObjectID: 1770679710930, ParentObjectID: 1770679710919, Transaction: 1001
> 2026-02-09 15:28:30,991 [main] DEBUG file.OMFileRequest (OMFileRequest.java:hasChildren(871)) - hasChildren: Checking for children. Key: parent, ObjectID: 1770679710930, ParentObjectID: 1770679710919, Volume: 084ac152-cbf9-4571-a8dd-b6b40b1e7a8e, Bucket: 1f5b5722-6fc3-4f6d-aef3-c0a9b500eabc
> 2026-02-09 15:28:30,991 [main] DEBUG file.OMFileRequest (OMFileRequest.java:checkSubDirectoryExists(925)) - checkSubDirectoryExists: No children in cache. Key: parent, ObjectID: 1770679710930, Cache entries checked: 2
> 2026-02-09 15:28:30,991 [main] DEBUG file.OMFileRequest (OMFileRequest.java:checkSubDirectoryExists(936)) - checkSubDirectoryExists: Checking DB with seekKey: /1770679710884/1770679710919/1770679710930/
> 2026-02-09 15:28:30,995 [main] WARN file.OMFileRequest (OMFileRequest.java:checkSubDirectoryExists(950)) - checkSubDirectoryExists: Found child directory in DB. Parent Key: parent, Parent ObjectID: 1770679710930, Child Name: child, Child ObjectID: 1770679710931, Child ParentObjectID: 1770679710930, DB Key: /1770679710884/1770679710919/1770679710930/child
> 2026-02-09 15:28:30,995 [main] WARN file.OMFileRequest (OMFileRequest.java:hasChildren(879)) - hasChildren: Found subdirectories. Key: parent, ObjectID: 1770679710930
> 2026-02-09 15:28:30,995 [main] WARN key.OMKeyDeleteRequestWithFSO (OMKeyDeleteRequestWithFSO.java:validateAndUpdateCache(155)) - DELETE_KEY: Directory not empty check FAILED. Key: parent, ObjectID: 1770679710930, ParentObjectID: 1770679710919, recursive: false, Transaction: 1001, ozonePathKey: /1770679710884/1770679710919/1770679710919/parent
> 2026-02-09 15:28:31,004 [main] ERROR key.OMKeyDeleteRequestWithFSO (OMKeyDeleteRequestWithFSO.java:validateAndUpdateCache(236)) - Key delete failed. Volume:084ac152-cbf9-4571-a8dd-b6b40b1e7a8e, Bucket:1f5b5722-6fc3-4f6d-aef3-c0a9b500eabc, Key:parent. DIRECTORY_NOT_EMPTY
> org.apache.hadoop.ozone.om.exceptions.OMException: Directory is not empty. Key:parent
> at org.apache.hadoop.ozone.om.request.key.OMKeyDeleteRequestWithFSO.validateAndUpdateCache(OMKeyDeleteRequestWithFSO.java:160)
> at org.apache.hadoop.ozone.om.request.OMClientRequest.validateAndUpdateCache(OMClientRequest.java:145)
> at org.apache.hadoop.ozone.om.request.key.TestOMKeyDeleteRequestWithFSO.testDeleteParentAfterChildrenDeleted(TestOMKeyDeleteRequestWithFSO.java:414)
> {code}
> At this point it was clear that this is most likely a cache-related issue. It
> did not take long to find that {{checkSubDirectoryExists}} (and, by extension,
> {{checkSubFileExists}}) did not properly check the cache for deleted dir/file
> tombstones.
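> The shape of the fix, then, is to treat a cache entry whose value is null as a
> delete tombstone: count only live cache entries as children, and re-check the
> cache for a tombstone before trusting any row the DB iterator returns. Below
> is a minimal sketch, simplified from what the real method needs to do and
> assuming the usual {{Table}} cache accessors; it is not the actual patch:
> {code:java|title=Sketch of the fix (simplified; not the actual patch)}
> import java.io.IOException;
> import java.util.Iterator;
> import java.util.Map;
>
> import org.apache.hadoop.hdds.utils.db.Table;
> import org.apache.hadoop.hdds.utils.db.TableIterator;
> import org.apache.hadoop.hdds.utils.db.cache.CacheKey;
> import org.apache.hadoop.hdds.utils.db.cache.CacheValue;
> import org.apache.hadoop.ozone.om.OMMetadataManager;
> import org.apache.hadoop.ozone.om.helpers.OmDirectoryInfo;
>
> // Simplified sketch of a tombstone-aware checkSubDirectoryExists.
> // Error handling and seek-key construction are elided.
> public final class TombstoneAwareCheckSketch {
>   static boolean checkSubDirectoryExists(OMMetadataManager omMetadataManager,
>       String seekKey) throws IOException {
>     Table<String, OmDirectoryInfo> dirTable = omMetadataManager.getDirectoryTable();
>
>     // Pass 1, cache: an entry with a non-null value is a live child. An entry
>     // with a null value is a tombstone for a delete that may not be flushed yet.
>     Iterator<Map.Entry<CacheKey<String>, CacheValue<OmDirectoryInfo>>> cacheIter =
>         dirTable.cacheIterator();
>     while (cacheIter.hasNext()) {
>       Map.Entry<CacheKey<String>, CacheValue<OmDirectoryInfo>> entry = cacheIter.next();
>       if (entry.getKey().getCacheKey().startsWith(seekKey)
>           && entry.getValue().getCacheValue() != null) {
>         return true;
>       }
>     }
>
>     // Pass 2, DB: the double buffer may not have flushed recent deletes, so any
>     // row found here must be re-checked against the cache for a tombstone.
>     try (TableIterator<String, ? extends Table.KeyValue<String, OmDirectoryInfo>>
>         dbIter = dirTable.iterator()) {
>       dbIter.seek(seekKey);
>       while (dbIter.hasNext()) {
>         Table.KeyValue<String, OmDirectoryInfo> kv = dbIter.next();
>         if (!kv.getKey().startsWith(seekKey)) {
>           break; // iterated past the parent's prefix
>         }
>         CacheValue<OmDirectoryInfo> cached =
>             dirTable.getCacheValue(new CacheKey<>(kv.getKey()));
>         if (cached != null && cached.getCacheValue() == null) {
>           continue; // tombstoned in cache: deleted, just not flushed to DB yet
>         }
>         return true; // genuinely live child directory
>       }
>     }
>     return false;
>   }
> }
> {code}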
> h2. Repro
> Proper unit and integration tests will be added in the PR; the snippet below
> is just for illustration:
> {code:java}
> // Illustration only; assumes fs is a FileSystem backed by an FSO bucket.
> Path parent = new Path("/parent");
> Path child = new Path(parent, "child");
> fs.mkdirs(parent);
> fs.mkdirs(child);
> fs.delete(child, false);  // Succeeds
> fs.delete(parent, false); // Fails with "Directory is not empty" when the double
>                           // buffer hasn't flushed the preceding delete() to the DB yet
> {code}
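> Note: the debug log in the root cause analysis above was captured from a unit
> test of this shape ({{TestOMKeyDeleteRequestWithFSO#testDeleteParentAfterChildrenDeleted}},
> visible in the stack trace), which drives {{validateAndUpdateCache}} directly;
> since the double buffer never gets a chance to flush there, the failure
> reproduces deterministically.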
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]