KnightChess opened a new issue, #9101: URL: https://github.com/apache/hudi/issues/9101
Code in: https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java#L215-L272

In this picture, at step one the instant has already been committed successfully. But at step two, when we trigger `mayBeCleanAndArchive`, it throws an exception and the job fails, so it is retried at the job level. The instant, however, has already been committed. So the final result is: instant commit succeeds -> job fails and retries, and the successfully committed instant is not rolled back.

![image](https://github.com/apache/hudi/assets/20125927/eea732b5-62e3-4c0f-924f-06d36a1714c3)

I think other places in the code can cause the same problem.

I think we need to either catch all exceptions after the instant has been committed successfully, or extend the scope of the transaction. I prefer the first option: catching all exceptions.

**Environment Description**

* Hudi version : 0.13.1
* Spark version : 3.2.0
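The preferred fix described above (catch all exceptions once the commit has succeeded) could look roughly like the following. This is only a minimal sketch of the idea, not Hudi code: `PostCommitSketch`, `PostCommitAction`, `WriteResult`, and `commitAndRunPostCommit` are hypothetical names made up for illustration, and the real `BaseHoodieWriteClient` logic is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names are Hudi APIs.
final class PostCommitSketch {

    /** A post-commit step that may fail, e.g. a stand-in for clean/archive. */
    interface PostCommitAction {
        void run() throws Exception;
    }

    /** Outcome of a write: whether the commit landed, plus any post-commit failures. */
    static final class WriteResult {
        final boolean committed;
        final List<String> postCommitFailures;

        WriteResult(boolean committed, List<String> postCommitFailures) {
            this.committed = committed;
            this.postCommitFailures = postCommitFailures;
        }
    }

    static WriteResult commitAndRunPostCommit(Runnable commit, List<PostCommitAction> actions) {
        // A failure here propagates to the caller: nothing has been committed yet,
        // so a job-level retry is safe.
        commit.run();

        // From this point on the commit is visible, so a post-commit failure is
        // recorded instead of rethrown. Only Exception is caught; Errors still propagate.
        List<String> failures = new ArrayList<>();
        for (PostCommitAction action : actions) {
            try {
                action.run();
            } catch (Exception e) {
                failures.add(String.valueOf(e.getMessage()));
            }
        }
        return new WriteResult(true, failures);
    }

    public static void main(String[] args) {
        List<String> timeline = new ArrayList<>();
        List<PostCommitAction> actions = new ArrayList<>();
        // Simulate clean/archive throwing after the commit has succeeded.
        actions.add(() -> { throw new IllegalStateException("clean/archive failed"); });

        WriteResult result = commitAndRunPostCommit(() -> timeline.add("commit-001"), actions);
        System.out.println("committed=" + result.committed
                + ", failures=" + result.postCommitFailures);
    }
}
```

With this shape, the caller sees a successful write together with a list of post-commit failures to log or retry separately, instead of a failed job that retries on top of an already committed instant.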