voonhous commented on code in PR #7669: URL: https://github.com/apache/hudi/pull/7669#discussion_r1090245158
########## hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/table/action/commit/FlinkDeletePartitionCommitActionExecutor.java: ########## @@ -98,4 +103,42 @@ private List<String> getAllExistingFileIds(String partitionPath) { // because new commit is not complete. it is safe to mark all existing file Ids as old files return table.getSliceView().getLatestFileSlices(partitionPath).map(FileSlice::getFileId).distinct().collect(Collectors.toList()); } + + /** + * Check if there are any pending table service actions (requested + inflight) on a table affecting the partitions to + * be dropped. + * <p> + * This check is to prevent a drop-partition from proceeding should a partition have a table service action in + * the pending stage. If this is allowed to happen, the filegroup that is an input for a table service action, might + * also be a candidate for being replaced. As such, when the table service action and drop-partition commits are + * committed, there will be two commits replacing a single filegroup. + * <p> + * For example, a timeline might have an execution order as such: + * 000.replacecommit.requested (clustering filegroup_1 + filegroup_2 -> filegroup_3) + * 001.replacecommit.requested, 001.replacecommit.inflight, 0001.replacecommit (drop_partition to replace filegroup_1) + * 000.replacecommit.inflight (clustering is executed now) + * 000.replacecommit (clustering completed) + * For an execution order as shown above, 000.replacecommit and 001.replacecommit will both flag filegroup_1 to be replaced. + * This will cause downstream duplicate key errors when a map is being constructed. + */ + private void checkPreconditions() { + List<String> instantsOfOffendingPendingTableServiceAction = new ArrayList<>(); + // ensure that there are no pending inflight clustering/compaction operations involving this partition + SyncableFileSystemView fileSystemView = (SyncableFileSystemView) table.getSliceView(); + + Stream.concat(fileSystemView.getPendingCompactionOperations(), fileSystemView.getPendingLogCompactionOperations()) + .filter(op -> partitions.contains(op.getRight().getPartitionPath())) + .forEach(op -> instantsOfOffendingPendingTableServiceAction.add(op.getLeft())); + + fileSystemView.getFileGroupsInPendingClustering() + .filter(fgIdInstantPair -> partitions.contains(fgIdInstantPair.getLeft().getPartitionPath())) + .forEach(x -> instantsOfOffendingPendingTableServiceAction.add(x.getRight().getTimestamp())); + + if (instantsOfOffendingPendingTableServiceAction.size() > 0) { + throw new HoodieDeletePartitionException("Failed to drop partitions. " Review Comment: Yeap! 1. Users might not have permission perform a list-file operation on the underlying filesystem (hdfs/gcs/s3) 2. If the commit `retain-commits` configuration is set to a somewhat large value, navigating the results of a list-file operation might be difficult. Users could use grep to perform filters, but this may introduce human error 3. We cannot expect users to check the commit timeline everytime they want to perform a DDL on a Hudi table. Hudi should have reasonable safeguards in place to ensure correctness. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org