[GitHub] [hudi] adityaverma1997 commented on issue #9257: [SUPPORT] Parquet files got cleaned up even when cleaning operation failed hence leading to subsequent failed clustering and cleaning
adityaverma1997 commented on issue #9257: URL: https://github.com/apache/hudi/issues/9257#issuecomment-1702471850

Hi @ad1happy2go, did you get a chance to look into this?

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
adityaverma1997 commented on issue #9257: URL: https://github.com/apache/hudi/issues/9257#issuecomment-1653270340

Thanks for confirming, Danny. Have we found the reason behind this cleaner failure? It would be really helpful if we could resolve it, as many of our tables will be running on the same set of configurations. Any help would be greatly appreciated.
adityaverma1997 commented on issue #9257: URL: https://github.com/apache/hudi/issues/9257#issuecomment-1649440527

Correct me if I am wrong here: although I am running async cleaning, the cleaning frequency is controlled by the following Hudi configuration:

```
hoodie.clean.max.commits
```

which is set to 10 in my case, so the cleaner will be scheduled and executed after every 10th commit. On the other hand, the number of commits to retain when cleaning executes is controlled by:

```
hoodie.cleaner.commits.retained
```

I have set it to 2, so on every cleaning execution it will retain the latest 2 commits and clean up file versions belonging to older commits. Looking forward to your reply, @danny0405 and @ad1happy2go.
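The interaction between the two settings described in the comment above can be sketched with a small simulation. This is a hypothetical illustration, not Hudi code: the real cleaner works off the timeline's completed instants and the KEEP_LATEST_COMMITS policy, but the counting behavior the commenter describes looks roughly like this:

```python
# Simplified model (assumption, not actual Hudi internals) of how
# "hoodie.clean.max.commits" (how often cleaning is triggered) and
# "hoodie.cleaner.commits.retained" (how many recent commits survive
# each clean) interact. Hudi counts commits since the last completed
# clean; this sketch models that counter directly.

def simulate_cleaner(total_commits, clean_max_commits=10, commits_retained=2):
    """Return a list of (triggering_commit, retained_commits) tuples."""
    events = []
    since_last_clean = 0
    for commit in range(1, total_commits + 1):
        since_last_clean += 1
        if since_last_clean >= clean_max_commits:
            # Retain the latest `commits_retained` commits; file versions
            # referenced only by older commits become deletion candidates.
            retained = list(range(commit - commits_retained + 1, commit + 1))
            events.append((commit, retained))
            since_last_clean = 0
    return events

print(simulate_cleaner(20))
# With the settings from the comment (10 / 2), cleaning runs at commits
# 10 and 20, each time keeping only the two most recent commits.
```

Under this model, any reader or writer still depending on a file slice older than the two retained commits would see it disappear after a clean, which is consistent with the failure mode this issue reports.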
adityaverma1997 commented on issue #9257: URL: https://github.com/apache/hudi/issues/9257#issuecomment-1647925326

@danny0405 I haven't tried the other cleaning strategies, not even the default one, because I don't want to run the cleaner after every commit. Also, in my case I want to retain only the last 2-3 commits; that's why I changed the following properties:

```
"hoodie.clean.max.commits": 10,
"hoodie.cleaner.commits.retained": 2
```
adityaverma1997 commented on issue #9257: URL: https://github.com/apache/hudi/issues/9257#issuecomment-1647320952

Thanks for your reply, Danny. I was also expecting the same logic for cleaning and clustering, but something seems to be going wrong; I hope my configurations are in place and I'm not messing them up. Attaching a sheet containing the plans for both the cleaning and clustering instants that failed: [HudiPlanRequestedDetails.xlsx](https://github.com/apache/hudi/files/12142635/HudiPlanRequestedDetails.xlsx) Hope it helps; let me know if anything else is required from my end.