nsivabalan commented on issue #3733: URL: https://github.com/apache/hudi/issues/3733#issuecomment-933829540
Hey Dave. My understanding is that partitioning on just year, month, and day did not work out for you, so you had to add hour as well? That makes the partition cardinality very high (~8750 partitions for 1 year of data), so I would expect some perf hit with that many partitions in general. Also, are you seeing spikes only in those batches where records are spread across older partitions? Or, once such traffic completes and you are back to regular traffic that updates only the last few partitions, does perf return to normal?
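For reference, here is a rough sketch of the partition-count arithmetic behind the ~8750 figure. It assumes a `year/month/day[/hour]` path layout and a non-leap year (2021); the `count_partitions` helper is hypothetical and is not part of any Hudi API.

```python
from datetime import datetime, timedelta


def count_partitions(start, end, granularity):
    """Count distinct partition paths in [start, end) at the given granularity.

    The path layout (year/month/day[/hour]) is an assumption for illustration.
    """
    fmt = {"day": "%Y/%m/%d", "hour": "%Y/%m/%d/%H"}[granularity]
    paths = set()
    t = start
    while t < end:
        paths.add(t.strftime(fmt))
        t += timedelta(hours=1)  # step hourly so every hour bucket is visited
    return len(paths)


start = datetime(2021, 1, 1)
end = datetime(2022, 1, 1)

daily = count_partitions(start, end, "day")
hourly = count_partitions(start, end, "hour")

print(f"daily partitions:  {daily}")
print(f"hourly partitions: {hourly}")
print(f"hourly / daily:    {hourly // daily}x")
```

Adding the hour level multiplies the number of partitions by 24, which is why a full year of hourly data lands in the high thousands of partitions.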