On 2017-03-06 07:04 (-0800), "Thakrar, Jayesh" <jthak...@conversantmedia.com> 
wrote: 
> Thanks Hannu - also considered that option.
> However, that's a trial and error and will have to play with the 
> collision/false-positive fraction.
> And each iteration will most likely result in a compaction storm - so I was 
> hoping for a way to throttle/limit the max off-heap size.
> 
> The reason I was thinking of eliminating bloom filters is because due to 
> application design, we search for data using a partial key (prefix columns),
> hence am thinking of completely eliminating the bloom filters as they do not 
> add any value in such a use case.
> 

If you don't want to use the bloom filters, don't set the FP ratio to 0; set it 
to something like 0.1 or 0.5. An FP ratio of 0 says "no false positives", which 
is only possible with HUGE bloom filters. A high FP ratio (since you're not 
using them) basically says "don't try very hard", which corresponds to small 
bit arrays: low accuracy, but low off-heap usage. 
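
To put rough numbers on that, here is a minimal sketch using the textbook 
sizing formula for an optimally tuned bloom filter, bits per key = 
-ln(p) / (ln 2)^2. This is an assumption for illustration, not Cassandra's 
actual sizing code, and the function names are hypothetical. 

```python
import math


def bloom_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimally sized Bloom filter (textbook formula).

    Assumption: this is the standard approximation, not Cassandra's own code.
    """
    if not 0.0 < fp_chance < 1.0:
        # p = 0 would need an infinitely large filter; p = 1 needs no filter.
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def bloom_megabytes(num_keys: int, fp_chance: float) -> float:
    """Approximate filter size in MiB for num_keys keys."""
    return num_keys * bloom_bits_per_key(fp_chance) / 8 / (1024 * 1024)


if __name__ == "__main__":
    keys = 1_000_000_000  # hypothetical key count, for illustration only
    for p in (0.5, 0.1, 0.01, 0.0001):
        print(f"fp={p}: {bloom_bits_per_key(p):.2f} bits/key, "
              f"{bloom_megabytes(keys, p):.0f} MiB for {keys} keys")
```

Under this formula the cost is roughly 1.4 bits per key at 0.5 versus about 
9.6 bits per key at 0.01, and it grows without bound as the ratio approaches 0. 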

We probably shouldn't even allow an FP ratio of 0.0000. 


