[GitHub] [arrow-rs] alamb commented on issue #3138: audit and create a document for bloom filter configurations

GitBox Tue, 22 Nov 2022 06:57:03 -0800


alamb commented on issue #3138:
URL: https://github.com/apache/arrow-rs/issues/3138#issuecomment-1323808015


   I like the idea of specifying fpp (false positive probability), and it follows the Arrow C++ model
   
   > with which we'd assume all unique items
   
   I think that makes sense, as the main use case for bloom filters is 
high-cardinality / close-to-unique columns.
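
To make the fpp trade-off concrete, here is a rough sketch of how a target fpp translates into filter size when every row is assumed to hold a distinct value. It uses the textbook Bloom filter sizing formula, m = -n ln(p) / (ln 2)^2, as an assumption for illustration. Parquet's split-block Bloom filter is sized differently, and `classic_bloom_bits` is a hypothetical helper, not part of the arrow-rs or parquet API.

```rust
/// Textbook (classic) Bloom filter sizing, NOT Parquet's split-block variant.
/// Returns the number of bits needed to hold `ndv` distinct values at the
/// target false positive probability `fpp`: m = -n * ln(p) / (ln 2)^2.
/// Hypothetical helper for illustration only.
pub fn classic_bloom_bits(ndv: u64, fpp: f64) -> u64 {
    assert!(fpp > 0.0 && fpp < 1.0, "fpp must be in (0, 1)");
    let ln2 = std::f64::consts::LN_2;
    // ln(fpp) is negative for fpp < 1, so the leading minus makes this positive.
    let bits = -(ndv as f64) * fpp.ln() / (ln2 * ln2);
    bits.ceil() as u64
}

/// Worst-case sizing under the "all items are unique" assumption:
/// the number of distinct values is taken to be the row count.
pub fn classic_bloom_bits_all_unique(num_rows: u64, fpp: f64) -> u64 {
    classic_bloom_bits(num_rows, fpp)
}
```

Under this formula the cost is a little under 10 bits per distinct value at fpp = 0.01, and it grows as fpp shrinks, which is why assuming all-unique items gives a conservative upper bound on the filter size.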
   
   Perhaps we can document the case clearly (e.g. "bloom filters will likely 
only help for almost unique data like 'ids' and 'uuids'; for other types, 
sorting / clustering and min/max statistics will work as well, if not better").
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
