nsivabalan commented on issue #3733:
URL: https://github.com/apache/hudi/issues/3733#issuecomment-933829540


   Hey Dave,
   My understanding is that partitioning on year, month, and day alone did
not work out for you, so you had to add hour as well? The partition
cardinality is now very high (~8,760 for one year of data, i.e. 365 × 24),
so I would expect some performance hit with that many partitions in
general.
   Also, are you seeing the spikes only in batches where records are spread
across older partitions? Once that traffic completes and you are back to
regular traffic that updates only the last few partitions, does performance
return to normal?
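
For reference, a quick sketch of where the partition count comes from. It assumes a non-leap year and that every hour in the window receives at least one record; the dates and variable names are only illustrative and are not taken from your setup:

```python
from datetime import date

# Hypothetical one-year window; 2021 is an arbitrary non-leap year.
start, end = date(2021, 1, 1), date(2022, 1, 1)
days = (end - start).days

# Upper bound on partition count per granularity, assuming every
# day (or hour) in the window receives at least one record.
daily_partitions = days
hourly_partitions = days * 24

print(f"year/month/day:      {daily_partitions} partitions")
print(f"year/month/day/hour: {hourly_partitions} partitions")
```

Adding the hour level multiplies the partition count by 24, which is why I would expect listing and lookups across older partitions to cost more.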

