zhangyue19921010 commented on pull request #3142: URL: https://github.com/apache/hudi/pull/3142#issuecomment-878004113
Hi @codope, I just want to know whether this async clustering function can handle the following scenario without losing data:

There are three small file groups, fg1, fg2 and fg3, containing file slice1, file slice2 and file slice3 respectively. While the async scheduler **has started building a clustering plan but has not finished**, there is an inflight or requested commit on fg1 that will create file slice11 based on file slice1. In other words, **file slice11 is being written but is not yet committed**. I believe this scenario is similar to the multi-writer case.

What will the async clustering function do here? Will the clustering plan include file slice1? If it does, I think the new data in file slice11 will be lost.

Looking forward to your reply, thanks a lot.
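
To make the scenario concrete, here is a toy Python model of it. This is only a sketch and not Hudi code: `FileGroup`, `plan_clustering`, `slices_at_risk` and the `skip_pending` flag are hypothetical names invented for illustration, and the sketch makes no claim about what the clustering planner in this PR actually does. It only contrasts a planner that selects every small file group with one that skips file groups that have a pending write.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileGroup:
    """Hypothetical stand-in for a file group and its slices."""
    file_group_id: str
    committed_slices: List[str]
    # Slices from inflight or requested commits (written but not committed).
    pending_slices: List[str] = field(default_factory=list)


def plan_clustering(file_groups: List[FileGroup], skip_pending: bool) -> List[str]:
    """Return the ids of the file groups selected for a clustering plan."""
    selected = []
    for fg in file_groups:
        if skip_pending and fg.pending_slices:
            # Leave file groups with an uncommitted write out of the plan.
            continue
        selected.append(fg.file_group_id)
    return selected


def slices_at_risk(file_groups: List[FileGroup], plan: List[str]) -> List[str]:
    """Pending slices whose file group the plan would replace."""
    return [
        s
        for fg in file_groups
        if fg.file_group_id in plan
        for s in fg.pending_slices
    ]


# The scenario from this comment: slice11 is pending on fg1.
file_groups = [
    FileGroup("fg1", ["slice1"], ["slice11"]),
    FileGroup("fg2", ["slice2"]),
    FileGroup("fg3", ["slice3"]),
]

naive_plan = plan_clustering(file_groups, skip_pending=False)
safe_plan = plan_clustering(file_groups, skip_pending=True)

print(naive_plan, slices_at_risk(file_groups, naive_plan))
print(safe_plan, slices_at_risk(file_groups, safe_plan))
```

In this toy model the first planner includes fg1 and so puts slice11 at risk, while the second leaves fg1 out of the plan. My question is which of these the async clustering scheduler corresponds to, or whether it resolves the conflict some other way.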