Hi,

There have been no reports about setting this configuration causing any
issues. I would guess it's off by default because it can increase the
memory usage by an unpredictable amount.

Feel free to enable it. From what you've described, I also think it would
improve the performance of your jobs. But make sure to configure your jobs
so that they can accommodate the potential growth in memory footprint.
Please also read the following resources to learn more about RocksDB's
bloom filter:
https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter
https://rocksdb.org/blog/2014/09/12/new-bloom-filter-format.html
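
For a rough sense of the memory cost, here is a back-of-the-envelope sketch
in Python. It is only an estimate under assumptions that are mine, not
measured: one key per device per state (5 million keys, taken from your
description), 10 bits per key, and the textbook bloom filter formula.
RocksDB's actual filter layout differs from the textbook one, and the number
of keys stored across SST files can exceed the number of live keys, so treat
the numbers as an order of magnitude only.

```python
import math


def bloom_false_positive_rate(bits_per_key: float, num_hashes: int) -> float:
    """Textbook approximation p = (1 - e^(-k/b))^k for b bits per key, k hashes."""
    return (1.0 - math.exp(-num_hashes / bits_per_key)) ** num_hashes


def bloom_filter_bytes(num_keys: int, bits_per_key: float) -> float:
    """Raw filter size in bytes, ignoring any per-block or per-file overhead."""
    return num_keys * bits_per_key / 8.0


# Assumption: one state entry per device, 5 million devices (from the thread).
NUM_KEYS = 5_000_000
# Assumption: 10 bits per key; check the value your setup actually uses.
BITS_PER_KEY = 10.0
# Near-optimal hash count for a textbook bloom filter: k = b * ln(2).
NUM_HASHES = max(1, round(BITS_PER_KEY * math.log(2)))

fp_rate = bloom_false_positive_rate(BITS_PER_KEY, NUM_HASHES)
size_mb = bloom_filter_bytes(NUM_KEYS, BITS_PER_KEY) / 1e6

print(f"hash functions: {NUM_HASHES}")
print(f"estimated false-positive rate: {fp_rate:.4f}")
print(f"estimated filter size per state: {size_mb:.2f} MB")
```

Under these assumptions the filter costs a few megabytes per state, and that
cost multiplies with the number of states and jobs, which is why the total
is hard to predict in general.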

Regards,
Mate


Kenan Kılıçtepe <kkilict...@gmail.com> wrote (on Fri, Oct 20, 2023,
15:50):

> Can someone tell me the exact performance effect of enabling the bloom
> filter? Could enabling it cause unpredictable performance problems?
>
> I read what it is and how it works, and it makes sense, but I also asked
> myself why the default value of state.backend.rocksdb.use-bloom-filter is
> false.
>
> We have a 5-server Flink cluster processing real-time IoT data coming
> from 5 million devices, and for a lot of jobs we keep different state for
> each device.
>
> Sometimes we have performance issues, and when I check the flame graph on
> the test server I always see that rocksdb.get() is the bottleneck. I just
> want to increase RocksDB performance.
>
> Thanks
>
>
