Hi,

There have been no reports of this setting causing any issues. I would guess it is off by default because it can increase memory usage by an unpredictable amount.
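
For reference, here is a minimal sketch of how this could look in flink-conf.yaml. Only state.backend.rocksdb.use-bloom-filter comes from this thread. The two bloom-filter tuning keys and the managed memory fraction are options I believe exist in recent Flink releases, so please verify them against the configuration docs for your version. The values are illustrative, not tuned recommendations.

```yaml
# flink-conf.yaml (sketch, values are illustrative)

# The option discussed in this thread; defaults to false.
state.backend.rocksdb.use-bloom-filter: true

# Assumed to be available in your Flink version; check the docs before relying on them.
# More bits per key lowers the false-positive rate but costs more memory.
state.backend.rocksdb.bloom-filter.bits-per-key: 10.0
state.backend.rocksdb.bloom-filter.block-based-mode: false

# Assumption: RocksDB uses Flink managed memory (the default setup), so giving
# managed memory a larger share is one way to leave headroom for the filters.
# 0.5 is only an example value; the default is 0.4.
taskmanager.memory.managed.fraction: 0.5
```

After changing these, it is worth watching the TaskManager memory metrics for a while before rolling the change out to all jobs.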
I would say feel free to enable it. From what you have described, I also think it would improve the performance of your jobs. But make sure to configure your jobs so that they can accommodate the potential growth in memory footprint.

Also, please read the following resources to learn more about RocksDB's bloom filter:

https://github.com/facebook/rocksdb/wiki/RocksDB-Bloom-Filter
https://rocksdb.org/blog/2014/09/12/new-bloom-filter-format.html

Regards,
Mate

Kenan Kılıçtepe <kkilict...@gmail.com> wrote (on Fri, Oct 20, 2023, 15:50):

> Can someone tell the exact performance effect of enabling bloom filter?
> May enabling it cause some unpredictable performance problems?
>
> I read what it is and how it works and it makes sense but I also asked
> myself why the default value of state.backend.rocksdb.use-bloom-filter is
> false.
>
> We have a 5 servers flink cluster, processing real time IoT data coming
> from 5 million devices and for a lot of jobs, we keep different states for
> each device.
>
> Sometimes we have performance issues and when I check the flamegraph on
> the test server I always see rocksdb.get() is the blocker. I just want to
> increase rocksdb performance.
>
> Thanks