Hi Jun,

Some predefined options already activate bloom filters, e.g.
PredefinedOptions#SPINNING_DISK_OPTIMIZED_HIGH_MEM, but I think offering a
configurable option is a good idea. +1 for this.

Regarding the bloom filter default value, I slightly prefer the full
format [1] over the old block-based format. This is related to FLINK-20496 [2],
which tries to add an option to enable partitioned index & filter.

[1] https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter#full-filters-new-format
[2] https://issues.apache.org/jira/browse/FLINK-20496

Best
Yun Tang
________________________________
From: Till Rohrmann <trohrm...@apache.org>
Sent: Monday, February 8, 2021 17:06
To: dev <dev@flink.apache.org>
Subject: Re: Activate bloom filter in RocksDB State Backend via Flink 
configuration

Hi Jun,

Making things easier to use and configure is a good idea. Hence, +1 for
this proposal. Maybe create a JIRA ticket for it.

For the concrete default values it would be nice to hear the opinion of a
RocksDB expert.

Cheers,
Till

On Sun, Feb 7, 2021 at 7:23 PM Jun Qin <qinjunje...@gmail.com> wrote:

> Hi,
>
> Activating the bloom filter in the RocksDB state backend improves read
> performance. Currently, the bloom filter can only be activated by
> implementing a custom ConfigurableRocksDBOptionsFactory. I think we should
> provide an option to activate the bloom filter via Flink configuration. What
> do you think? If so, what about the following configuration?
>
> state.backend.rocksdb.bloom-filter.enabled: false (default)
> state.backend.rocksdb.bloom-filter.bits-per-key: 10 (default)
> state.backend.rocksdb.bloom-filter.block-based: true (default)
>
>
> Thanks
> Jun
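
As background for the proposed bits-per-key default of 10, here is a rough
sketch of the textbook bloom filter false-positive estimate. It assumes an
idealized filter with independent hash functions; RocksDB's actual filter
implementations differ in detail, so real rates may deviate. The function
name and parameters are illustrative only and are not part of Flink or RocksDB.

```python
import math
from typing import Optional


def bloom_false_positive_rate(bits_per_key: float,
                              num_hashes: Optional[int] = None) -> float:
    """Textbook estimate: p = (1 - exp(-k / b)) ** k.

    b = bits per key, k = number of hash functions.
    If k is not given, use the rounded optimum k = b * ln(2), at least 1.
    This is an idealized model, not RocksDB's exact behavior.
    """
    if bits_per_key <= 0:
        raise ValueError("bits_per_key must be positive")
    if num_hashes is None:
        num_hashes = max(1, round(bits_per_key * math.log(2)))
    return (1.0 - math.exp(-num_hashes / bits_per_key)) ** num_hashes


if __name__ == "__main__":
    # Compare a few candidate values for the bits-per-key setting.
    for bits in (5, 10, 15, 20):
        rate = bloom_false_positive_rate(bits)
        print(f"bits-per-key={bits:2d}  estimated false-positive rate={rate:.4%}")
```

Under this model, 10 bits per key gives a false-positive rate a little
under 1%, and each additional bit per key lowers it further at the cost of
more memory for the filter.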
