In addition, we have improved the performance issues around DiskBasedMap and Kryo on the master branch. You can also try building the Hudi jar from the master branch.
best,
lamber-ken

At 2020-03-10 17:07:58, "selvaraj periyasamy" <selvaraj.periyasamy1...@gmail.com> wrote:

Sorry for the partial emails. My company portal doesn't allow me to add test code. I am using the 0.5.0 version of the Hudi jars, built locally. While running upsert, it takes more than 6 or 7 minutes to process 150k records. Is there any tuning that could reduce this processing time? Overwrite takes less than a minute. Each row has 100 columns.

Thanks,
Selva

On Tue, Mar 10, 2020 at 1:51 AM selvaraj periyasamy <selvaraj.periyasamy1...@gmail.com> wrote:

Team,

I am using the 0.5.0 version of the Hudi jars, built locally. While running upsert, it takes more than 6 or 7 minutes to process 150k records. Below are the code and logs.

20/03/10 07:26:09 INFO IteratorBasedQueueProducer: starting to buffer records
20/03/10 07:26:09 INFO BoundedInMemoryExecutor: starting consumer thread
20/03/10 07:33:59 INFO IteratorBasedQueueProducer: finished buffering records
20/03/10 07:34:00 INFO BoundedInMemoryExecutor: Queue Consumption is done; notifying producer threads

20/03/10 07:26:08 INFO IteratorBasedQueueProducer: starting to buffer records
20/03/10 07:26:08 INFO BoundedInMemoryExecutor: starting consumer thread
20/03/10 07:33:31 INFO IteratorBasedQueueProducer: finished buffering records
20/03/10 07:33:31 INFO BoundedInMemoryExecutor: Queue Consumption is done; notifying producer threads

While running insert

On Tue, Mar 10, 2020 at 1:45 AM selvaraj periyasamy <selvaraj.periyasamy1...@gmail.com> wrote:

Team,

I am using the 0.5.0 version of the Hudi jars, built locally.

While running upsert

20/03/10 07:26:09 INFO IteratorBasedQueueProducer: starting to buffer records
20/03/10 07:26:09 INFO BoundedInMemoryExecutor: starting consumer thread
20/03/10 07:33:59 INFO IteratorBasedQueueProducer: finished buffering records
20/03/10 07:34:00 INFO BoundedInMemoryExecutor: Queue Consumption is done; notifying producer threads

20/03/10 07:26:08 INFO IteratorBasedQueueProducer: starting to buffer records
20/03/10 07:26:08 INFO BoundedInMemoryExecutor: starting consumer thread
20/03/10 07:33:31 INFO IteratorBasedQueueProducer: finished buffering records
20/03/10 07:33:31 INFO BoundedInMemoryExecutor: Queue Consumption is done; notifying producer threads
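
For reference, the time spent buffering records can be worked out from the "starting to buffer records" and "finished buffering records" timestamps in the logs above. The following is only a minimal sketch in Python, not part of Hudi: the helper name `elapsed_seconds` is hypothetical, and the timestamp format is assumed from the log lines as shown.

```python
from datetime import datetime

# Timestamp layout assumed from the log lines above, e.g. "20/03/10 07:26:09".
LOG_TS_FORMAT = "%y/%m/%d %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> int:
    """Return whole seconds between two log timestamps (hypothetical helper)."""
    delta = datetime.strptime(end, LOG_TS_FORMAT) - datetime.strptime(start, LOG_TS_FORMAT)
    return int(delta.total_seconds())


# (start, end) pairs copied from the two producer runs in the logs.
runs = [
    ("20/03/10 07:26:09", "20/03/10 07:33:59"),
    ("20/03/10 07:26:08", "20/03/10 07:33:31"),
]

for start, end in runs:
    secs = elapsed_seconds(start, end)
    print(f"{start} -> {end}: {secs // 60} min {secs % 60} s")
```

Both runs come out at a little over 7 minutes, which suggests that nearly all of the reported upsert time falls between those two log messages.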