Re: [I] [SUPPORT] How we can speed up individual file write(HoodieMergeHandle part) [hudi]

2024-04-16 Thread via GitHub
VitoMakarevich commented on issue #10997: URL: https://github.com/apache/hudi/issues/10997#issuecomment-2058398743 Thanks! Yes, we are certain that we run clustering on all partitions that are big enough. Analyzing the data to find optimal clustering settings is a big effort, which is why I'm asking

Re: [I] [SUPPORT] How we can speed up individual file write(HoodieMergeHandle part) [hudi]

2024-04-15 Thread via GitHub
xushiyan commented on issue #10997: URL: https://github.com/apache/hudi/issues/10997#issuecomment-2058238697 > we have clustering to group rows together, but it's still thousands of files affected. 75th percentile of individual file overwrite(task in the Doing partition and writing data sta

Re: [I] [SUPPORT] How we can speed up individual file write(HoodieMergeHandle part) [hudi]

2024-04-15 Thread via GitHub
VitoMakarevich commented on issue #10997: URL: https://github.com/apache/hudi/issues/10997#issuecomment-2056274777 Hello, thanks for the suggestions! As I said, I'd like to know how I can speed up this individual part. I know that using MOR is an option in theory, but it's impossible for our use

Re: [I] [SUPPORT] How we can speed up individual file write(HoodieMergeHandle part) [hudi]

2024-04-14 Thread via GitHub
xushiyan commented on issue #10997: URL: https://github.com/apache/hudi/issues/10997#issuecomment-209283 +1 to using MOR to balance ingestion speed and merge cost through compaction. There is also a new feature in 0.13.x: https://hudi.apache.org/releases/release-0.13.0#simple-write-exe

Re: [I] [SUPPORT] How we can speed up individual file write(HoodieMergeHandle part) [hudi]

2024-04-14 Thread via GitHub
ad1happy2go commented on issue #10997: URL: https://github.com/apache/hudi/issues/10997#issuecomment-2054086162 @VitoMakarevich Just checking: if you have lots of file groups impacted in each batch, why not use a MERGE_ON_READ table? In your current setup, you can only try to optimize