vinothchandar edited a comment on issue #666: Add support for dynamic bloom filter to increase efficiency of bloom filter for static sizing
URL: https://github.com/apache/incubator-hudi/pull/666#issuecomment-490112102

> there is one on size and both of them are close to 2MB, I actually rounded them off to the near megabyte, there may be differences in kilobytes.

Can we test with N=`500000` and fp=`0.000000001`, and 10x/100x that? I think that will produce larger sizes and more false positives. I would be surprised if the dynamic filter gave far fewer false positives with the same number of bits. All it must be doing is using more bits as more entries come in. You can use something like https://krisives.github.io/bloom-calculator/ to design a case around this.

If proven to work, yes, we should enable DynamicBloom by default.

I think we have to do option 1, right? In option 2 we'd also be reading old and new files with different filter formats, right? Do we handle an exception and detect dynamic vs normal bf?
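To make the suggested test case concrete, here is a rough sketch using the standard textbook Bloom filter formulas (the same kind of math the linked calculator does). It is not Hudi's implementation: the function names are hypothetical, and it assumes ideal, independent hash functions. It sizes a static filter for N=`500000` at fp=`0.000000001`, then estimates the false positive rate if 10x/100x that many entries are inserted into the same bits.

```python
import math


def optimal_bits(n: int, p: float) -> int:
    """Bits needed for n entries at target false positive rate p (textbook formula)."""
    return math.ceil(-n * math.log(p) / (math.log(2) ** 2))


def optimal_hashes(m: int, n: int) -> int:
    """Number of hash functions that minimizes the fp rate for m bits and n entries."""
    return max(1, round(m / n * math.log(2)))


def expected_fp_rate(m: int, k: int, inserted: int) -> float:
    """Approximate fp rate after `inserted` entries go into m bits with k hashes."""
    return (1.0 - math.exp(-k * inserted / m)) ** k


# Sizing from the comment: N=500000, fp=0.000000001 (illustrative only).
n, p = 500_000, 1e-9
m = optimal_bits(n, p)
k = optimal_hashes(m, n)

for factor in (1, 10, 100):
    fp = expected_fp_rate(m, k, n * factor)
    print(
        f"{factor:>3}x entries: bits={m} ({m / 8 / 2**20:.2f} MiB), "
        f"k={k}, expected fp={fp:.3g}"
    )
```

Under these assumptions the static filter stays near the target rate only at 1x; once it is overfilled the expected rate climbs sharply, which is the regime where a filter that adds bits as entries arrive should differ from a statically sized one.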
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services