[ https://issues.apache.org/jira/browse/FLINK-27504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533593#comment-17533593 ]
Yun Tang commented on FLINK-27504:
----------------------------------

First of all, the compaction frequency of RocksDB is not related to the TTL configuration; it depends only on RocksDB's own configuration. You can try to decrease the [level-0 trigger number|https://github.com/facebook/rocksdb/blob/7b55b508390d792ff31a416b63d474a2a6780588/java/src/main/java/org/rocksdb/ColumnFamilyOptions.java#L403] to '2' and decrease [the base level target size|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-compaction-level-max-size-level-base] to '128MB'. BTW, how do you measure the state size — via [total-sst-files-size|https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#state-backend-rocksdb-metrics-total-sst-files-size]?

> State compaction not happening with sliding window and incremental RocksDB backend
> ----------------------------------------------------------------------------------
>
> Key: FLINK-27504
> URL: https://issues.apache.org/jira/browse/FLINK-27504
> Project: Flink
> Issue Type: Bug
> Components: Runtime / State Backends
> Affects Versions: 1.14.4
> Environment: Local Flink cluster on Arch Linux.
> Reporter: Alexis Sarda-Espinosa
> Priority: Major
> Attachments: duration_trend_52ca77c.png, duration_trend_67c76bb.png, image-2022-05-06-10-34-35-007.png, size_growth_52ca77c.png, size_growth_67c76bb.png
>
> Hello,
> I'm trying to estimate an upper bound for RocksDB's state size in my application. For that purpose, I have created a small job with faster timings whose code you can find on GitHub: [https://github.com/asardaes/flink-rocksdb-ttl-test]. You can see some of the results there, but I summarize here as well:
> * Approximately 20 events per second, 10 unique keys for partitioning are pre-specified.
> * Sliding window of 11 seconds with a 1-second slide.
> * Allowed lateness of 11 seconds.
> * State TTL configured to 1 minute and compaction after 1000 entries.
> * Both window-specific and window-global state used.
> * Checkpoints every 2 seconds.
> * Parallelism of 4 in stateful tasks.
> The goal is to let the job run and analyze state compaction behavior with RocksDB. I should note that global state is cleaned manually inside the functions; TTL for that state is only a safeguard in case some keys are no longer seen in the actual production environment.
> I have been running the job on a local cluster (outside the IDE); the configuration YAML is also available in the repository. After running for approximately 1.6 days, the state size is currently 2.3 GiB (see attachments). I understand state can retain expired data for a while, but since the TTL is 1 minute, this seems excessive to me.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
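The tuning suggested in the comment above (level-0 trigger of 2, base level target size of 128MB) can be sketched as a custom `RocksDBOptionsFactory`. This is a configuration sketch against the Flink 1.14 API, not a verified fix for this issue; the class name `AggressiveCompactionOptionsFactory` is hypothetical.

```java
import java.util.Collection;

import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

// Hypothetical factory name; wire it in with
// EmbeddedRocksDBStateBackend#setRocksDBOptions(...).
public class AggressiveCompactionOptionsFactory implements RocksDBOptionsFactory {

    @Override
    public DBOptions createDBOptions(
            DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        // No DB-level changes needed for this tuning.
        return currentOptions;
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(
            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        return currentOptions
                // Trigger an L0 -> L1 compaction after only 2 L0 files
                // (the suggested value; RocksDB's default is higher).
                .setLevel0FileNumCompactionTrigger(2)
                // Shrink the base level target size to 128 MB so data is
                // compacted into deeper levels sooner.
                .setMaxBytesForLevelBase(128L * 1024 * 1024);
    }
}
```

The base level target size can alternatively be set in flink-conf.yaml via `state.backend.rocksdb.compaction.level.max-size-level-base: 128mb`, and the metric mentioned in the comment is enabled with `state.backend.rocksdb.metrics.total-sst-files-size: true`.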