zhuanshenbsj1 commented on code in PR #8505:
URL: https://github.com/apache/hudi/pull/8505#discussion_r1176051746


##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDTableServiceClient.java:
##########
@@ -245,6 +246,7 @@ private void completeClustering(HoodieReplaceCommitMetadata metadata,
           metrics.updateCommitMetrics(parsedInstant.getTime(), durationInMs, metadata, HoodieActiveTimeline.REPLACE_COMMIT_ACTION)
       );
     }
+    waitForAsyncServiceCompletion();
     LOG.info("Clustering successfully on commit " + clusteringCommitTime);

Review Comment:
   Without this change, if `ASYNC_CLEAN` is set to true, `AsyncCleanerService` is used to run cleaning in the offline job.
   
![image](https://user-images.githubusercontent.com/34104400/234192874-e369bead-cc4a-4c8e-ab0a-c4791c8bd0ef.png)
   In my unit test for the offline job, if the compaction/clustering job completes before the async cleaning job does, `BaseHoodieTableServiceClient.close()` forcibly shuts down the async cleaning job.
   <img width="372" alt="d0570721dc3b1983191e5451751c1815" src="https://user-images.githubusercontent.com/34104400/234192499-dabdbd5f-8df8-476f-9812-72151e5d6873.png">
   
   So I added this wait, so that the whole job waits for cleaning to complete before exiting gracefully.
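
   To illustrate the race outside of Hudi, here is a minimal sketch using plain `java.util.concurrent`. It is not the actual Hudi code: the class `AsyncCleanSketch`, the method `runTableService`, and the 200 ms sleep are all hypothetical stand-ins, with `shutdownNow()` playing the role of `close()` and `Future.get()` playing the role of `waitForAsyncServiceCompletion()`.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the offline table service job; none of these
// names come from Hudi itself.
public class AsyncCleanSketch {

  // Returns true if the simulated async cleaner ran to completion.
  public static boolean runTableService(boolean waitForCleaner) {
    ExecutorService cleanerPool = Executors.newSingleThreadExecutor();
    AtomicBoolean cleanFinished = new AtomicBoolean(false);

    // Simulated async cleaner: takes longer than the clustering job.
    Future<?> cleaner = cleanerPool.submit(() -> {
      try {
        Thread.sleep(200);
        cleanFinished.set(true);
      } catch (InterruptedException e) {
        // Forced shutdown interrupts the cleaner before it finishes.
        Thread.currentThread().interrupt();
      }
    });

    try {
      // The clustering/compaction work "completes" here, before the cleaner.
      if (waitForCleaner) {
        // Analogue of the added wait: block until the cleaner is done.
        cleaner.get();
      }
      // Analogue of close(): forcibly stop whatever is still running.
      cleanerPool.shutdownNow();
      cleanerPool.awaitTermination(1, TimeUnit.SECONDS);
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    }
    return cleanFinished.get();
  }

  public static void main(String[] args) {
    System.out.println("without wait: clean finished = " + runTableService(false));
    System.out.println("with wait:    clean finished = " + runTableService(true));
  }
}
```

   In this sketch the cleaner is cut short when the job closes without waiting, and runs to completion when the job waits first, which is the behavior the added call is meant to guarantee.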



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
