zhuanshenbsj1 commented on code in PR #8505: URL: https://github.com/apache/hudi/pull/8505#discussion_r1185642315
##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -256,7 +279,16 @@ private int doScheduleAndCluster(JavaSparkContext jsc) throws Exception {
     LOG.info("The schedule instant time is " + instantTime.get());
     LOG.info("Step 2: Do cluster");
     Option<HoodieCommitMetadata> metadata = client.cluster(instantTime.get()).getCommitMetadata();
+    cleanAfterCluster(client);
     return UtilHelpers.handleErrors(metadata.get(), instantTime.get());
   }
 }
+
+  private void cleanAfterCluster(SparkRDDWriteClient client) {
+    client.waitForAsyncServiceCompletion();
+    if (client.getConfig().isAutoClean() && !client.getConfig().isAsyncClean()) {

Review Comment:
   > I think we need to trigger a sync clean if it is enabled.

   If isAsyncClean is enabled, the Spark offline job will start an async clean in preWrite, like the Flink job does. So if isAsyncClean is disabled, we add a synchronous cleanup here.
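The gating logic under discussion can be sketched as a small self-contained example. Note this is an illustrative sketch, not Hudi's actual API: `WriteClientLike`, `FakeClient`, and `cleanAfterCluster` returning a boolean are hypothetical stand-ins for `SparkRDDWriteClient` and its config accessors.

```java
// Sketch of the clean-after-cluster flow from the diff, using a minimal
// stand-in interface instead of Hudi's SparkRDDWriteClient (names are hypothetical).
interface WriteClientLike {
    void waitForAsyncServiceCompletion();
    boolean isAutoClean();
    boolean isAsyncClean();
    void clean(); // assumed analogue of a synchronous client.clean()
}

class ClusteringJobSketch {
    // Mirrors the PR's condition: wait for async services to finish, then run a
    // synchronous clean only when auto-clean is enabled AND async clean is
    // disabled (if async clean is enabled, cleaning was already started in preWrite).
    static boolean cleanAfterCluster(WriteClientLike client) {
        client.waitForAsyncServiceCompletion();
        if (client.isAutoClean() && !client.isAsyncClean()) {
            client.clean();
            return true;  // sync clean was triggered
        }
        return false;     // async clean covers it, or auto-clean is off
    }
}

// Trivial fake client so the sketch can be exercised standalone.
class FakeClient implements WriteClientLike {
    final boolean autoClean, asyncClean;
    boolean cleaned = false;

    FakeClient(boolean autoClean, boolean asyncClean) {
        this.autoClean = autoClean;
        this.asyncClean = asyncClean;
    }
    public void waitForAsyncServiceCompletion() { /* no-op in the sketch */ }
    public boolean isAutoClean() { return autoClean; }
    public boolean isAsyncClean() { return asyncClean; }
    public void clean() { cleaned = true; }
}
```

With this gate, the three interesting configurations behave as the reviewer describes: sync clean fires only for (autoClean=true, asyncClean=false); in the other cases the job either relies on the async clean started in preWrite or skips cleaning entirely.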