maheshguptags commented on issue #10456: URL: https://github.com/apache/hudi/issues/10456#issuecomment-1884430702
Hi @xicm,

I tried the combinations below with the same number of records.

<img width="681" alt="image" src="https://github.com/apache/hudi/assets/115445723/7398fe1a-914a-44b2-94f5-e7b9fdc9a7c9">

Please find the details of the file groups below.

<img width="1486" alt="image" src="https://github.com/apache/hudi/assets/115445723/722a93e3-f588-4694-af7f-bf7b2752bd2e">

After testing several times, I noticed that bucket counts of 8 and 4 look good for data sizes below 100M.

As we know, once the number of buckets is set, it cannot be changed, so I have a related question. Suppose I choose 8 buckets and the streaming data keeps growing (100 million per ID): will that affect performance, given that the job is streaming?

Thanks,
Mahesh Gupta
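
To make the growth question concrete, here is a rough back-of-the-envelope sketch of how the load on each bucket scales when the bucket count stays fixed. It assumes keys hash evenly across buckets, and the 100-byte average record size is a purely hypothetical figure for illustration (it is not taken from the tests above, and it ignores compression). The function names are illustrative and not part of any Hudi API.

```python
def records_per_bucket(total_records: int, num_buckets: int) -> float:
    """Average records per bucket, assuming keys hash evenly across buckets."""
    return total_records / num_buckets


def approx_bucket_size_mb(total_records: int, num_buckets: int,
                          avg_record_bytes: int) -> float:
    """Rough uncompressed size of one bucket in MiB.

    avg_record_bytes is a hypothetical input; measure it on real data.
    """
    return records_per_bucket(total_records, num_buckets) * avg_record_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Fixed bucket count of 8, with the record count growing over time.
    for total in (10_000_000, 100_000_000, 1_000_000_000):
        per_bucket = records_per_bucket(total, 8)
        size_mb = approx_bucket_size_mb(total, 8, avg_record_bytes=100)
        print(f"{total:>13,} records -> {per_bucket:>13,.0f} per bucket, ~{size_mb:,.0f} MiB")
```

With a fixed bucket count, the per-bucket volume grows linearly with the total record count, so this kind of estimate can help when picking a bucket count for the expected final data size rather than the current one.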