[ 
https://issues.apache.org/jira/browse/FLINK-32833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753907#comment-17753907
 ] 

Yun Tang commented on FLINK-32833:
----------------------------------

FLINK-7289 is the original ticket that introduced these options. Their purpose 
is to limit memory usage: Flink's previous implementation would easily get 
OOMKilled when running many RocksDB instances within the same slot in a k8s 
environment. If users could put only the data blocks in the cache and let the 
index and filter blocks reside in memory outside of it, the Flink process would 
easily exceed its memory limit (especially since we do not limit the number of 
open SST files), and we could no longer say that the memory for RocksDB is 
*managed*.
If you have other solutions for managing RocksDB memory usage without caching 
the index and filter blocks, we should discuss those candidates.
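
To make the concern above concrete, here is a minimal back-of-the-envelope 
sketch in plain Java. It does not use the Flink or RocksDB APIs; the class 
name, the method names, and every number in it are hypothetical assumptions 
chosen only for illustration. It compares block-related memory when index and 
filter blocks are charged to a bounded shared cache with the case where they 
live outside the cache and grow with the number of open SST files.

```java
// Illustrative sketch only: all sizes below are hypothetical assumptions,
// not measurements taken from Flink or RocksDB.
public class RocksDbMemorySketch {

    /** Block memory when index/filter blocks are charged to the shared cache. */
    public static long cachedIndexAndFilterBytes(long sharedCacheCapacityBytes) {
        // Data, index and filter blocks all compete for the same bounded
        // cache, so (roughly) the cache capacity is the upper bound.
        return sharedCacheCapacityBytes;
    }

    /** Block memory when index/filter blocks are kept outside the cache. */
    public static long uncachedIndexAndFilterBytes(
            long sharedCacheCapacityBytes,
            long openSstFiles,
            long indexAndFilterBytesPerFile) {
        // Data blocks stay bounded by the cache, but every open SST file
        // keeps its index and filter blocks in memory on top of that.
        return sharedCacheCapacityBytes + openSstFiles * indexAndFilterBytesPerFile;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long cache = 512 * mib;   // hypothetical shared cache budget
        long perFile = 2 * mib;   // hypothetical index + filter size per SST file
        for (long files : new long[] {100, 500, 2000}) {
            System.out.printf(
                    "open SST files=%d cached=%d MiB uncached=%d MiB%n",
                    files,
                    cachedIndexAndFilterBytes(cache) / mib,
                    uncachedIndexAndFilterBytes(cache, files, perFile) / mib);
        }
    }
}
```

Under these assumed numbers the cached variant stays at the cache budget no 
matter how many SST files are open, while the uncached variant grows linearly 
with the file count. That growth is the unbounded part the comment refers to.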

> Rocksdb CacheIndexAndFilterBlocks must be true when using shared memory
> -----------------------------------------------------------------------
>
>                 Key: FLINK-32833
>                 URL: https://issues.apache.org/jira/browse/FLINK-32833
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / State Backends
>    Affects Versions: 1.17.1
>            Reporter: Yue Ma
>            Priority: Major
>
> Currently, in RocksDBResourceContainer#getColumnOptions, if sharedResources 
> is used, the following settings are applied to blockBasedTableConfig by 
> default.
> {code:java}
> blockBasedTableConfig.setBlockCache(blockCache);
> blockBasedTableConfig.setCacheIndexAndFilterBlocks(true);
> blockBasedTableConfig.setCacheIndexAndFilterBlocksWithHighPriority(true);
> blockBasedTableConfig.setPinL0FilterAndIndexBlocksInCache(true);{code}
> As I understand it, these settings help Flink manage RocksDB's memory better 
> and save some memory overhead in some scenarios. But this may not be the 
> best practice, mainly for the following reasons:
> 1. With CacheIndexAndFilterBlocks set to true, reads may miss the index and 
> filter blocks in the cache, resulting in performance degradation.
> 2. These parameters need not be tied to whether shared memory is used; 
> separate configuration options could be supported to decide whether to 
> enable these features.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
