yihua commented on PR #8856:
URL: https://github.com/apache/hudi/pull/8856#issuecomment-1646713648

   > > @KnightChess do you have any numbers on the performance improvement on updating MDT from this PR?
   > 
   > parallelism compute:
   > 
   > ```java
   > parallelism = Math.max(Math.min(partitionToAppendedFilesList.size(), recordsGenerationParams.getBloomIndexParallelism()), 1);
   > ```
   > 
   > @yihua In this picture, there are more than 5000 files in a single partition. Before the fix, the bloom filter and column stats parallelism was 1, limited by the number of partitions. Now the bloom filter parallelism is 200 and the column stats parallelism is 10, both taken from the default values.
   > ![image](https://user-images.githubusercontent.com/20125927/254446310-ff4cb9e4-d595-4294-83e7-cb42c73c40ff.png)
   
   Looks good!
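
   For anyone reading along, the clamp in the quoted snippet can be illustrated with a minimal standalone sketch. The class and method names here are hypothetical and not part of Hudi; the sketch assumes the list size counts work items (one per partition before the change, one per file after it) and that the second argument is the configured parallelism (200 and 10 are the values reported above).

```java
// Hypothetical standalone illustration; not Hudi code.
final class ParallelismSketch {

    // Clamp the parallelism to the number of work items, but never below 1.
    // workItems: assumed to be the size of the list being parallelized.
    // configuredParallelism: assumed to be the configured (default) value.
    static int computeParallelism(int workItems, int configuredParallelism) {
        return Math.max(Math.min(workItems, configuredParallelism), 1);
    }

    public static void main(String[] args) {
        // One work item (a single partition): parallelism collapses to 1.
        System.out.println(computeParallelism(1, 200));    // prints 1
        // 5000 work items (one per file): the configured value becomes the cap.
        System.out.println(computeParallelism(5000, 200)); // prints 200
        System.out.println(computeParallelism(5000, 10));  // prints 10
        // Empty list: the outer max keeps the parallelism at 1.
        System.out.println(computeParallelism(0, 200));    // prints 1
    }
}
```

   With the list keyed by partition, a single large partition pins the result at 1 regardless of the configured value, which matches the behavior described above.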

