[ https://issues.apache.org/jira/browse/KAFKA-12748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guozhang Wang updated KAFKA-12748:
----------------------------------
    Labels: rocksdb  (was: )

> Explore new RocksDB options to consider enabling by default
> -----------------------------------------------------------
>
>                 Key: KAFKA-12748
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12748
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: A. Sophie Blee-Goldman
>            Priority: Major
>              Labels: rocksdb
>
> With the rocksdb version bump comes a lot of new options, some of which look
> interesting enough to explore for usage in Streams. We should try setting
> these as default options and run the benchmarks to look for any performance
> benefit (or decrease). See javadocs for all Options
> [here|https://javadoc.io/doc/org.rocksdb/rocksdbjni/latest/org/rocksdb/Options.html]
>
> Options.setAvoidUnnecessaryBlockingIO:
>  - As the name suggests, avoids blocking/long-latency tasks by scheduling a
> background job to do them
>
> Options.setSkipCheckingSstFileSizesOnDbOpen:
>  - Speeds up startup time if there are many sst files, which could mean less
> overhead from things like rebalancing, where tasks are migrated between
> clients or threads. Not sure how many sst files count as "many"; may be less
> useful now that we've disabled bulk loading
>
> Options.setBestEffortsRecovery:
>  - Interesting feature to allow recovering missing files without the use
> of the WAL. Could be useful if the on-disk state is corrupted (e.g. user
> deletes a file) without needing to rebuild state from scratch. Though I'd
> want to dig in further to understand what exactly it does and does not do.
> Not a performance improvement, but we should run the benchmarks to make sure
> it doesn't make the performance worse.
>
> Options.setWriteDbidToManifest:
>  - Should be set to true if/when we ever need to rely on the DB id, e.g. for
> backups. Also not a performance improvement, but we should still benchmark
> this.
>
> Options.optimizeForSmallDb:
>  - This one is definitely not something we should set by default, as
> "small" here means under 1GB. But it's probably worth at least calling out in
> the docs for those users who know their data set size (per store) is under a
> GB

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
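
For anyone who wants to try the options listed in the issue before any defaults change, a rough sketch of a custom config setter follows. It is an illustration, not what Streams does internally: the class name ExperimentalRocksDBConfigSetter is hypothetical, and it assumes the rocksdbjni version bundled with your Streams release exposes these setters and that Streams forwards them to the underlying DB. Whether any of them help or hurt is exactly what the benchmarks proposed above are meant to find out.

```java
import java.util.Map;

import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.Options;

// Hypothetical class name; enables the candidate options from this issue
// for every RocksDB-backed state store in the application.
public class ExperimentalRocksDBConfigSetter implements RocksDBConfigSetter {

    @Override
    public void setConfig(final String storeName,
                          final Options options,
                          final Map<String, Object> configs) {
        // Push blocking/long-latency work to a background job.
        options.setAvoidUnnecessaryBlockingIO(true);

        // Skip the sst file size check on open to speed up startup.
        options.setSkipCheckingSstFileSizesOnDbOpen(true);

        // Attempt recovery of missing files without the WAL.
        // Exact semantics still need investigation, per the issue.
        options.setBestEffortsRecovery(true);

        // Record the DB id in the manifest (e.g. for backups).
        options.setWriteDbidToManifest(true);

        // Not suitable as a default: only for stores known to stay under 1GB.
        // options.optimizeForSmallDb();
    }

    @Override
    public void close(final String storeName, final Options options) {
        // Nothing to release: this setter allocates no RocksDB objects.
    }
}
```

The setter would be registered through the existing Streams config, for example props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, ExperimentalRocksDBConfigSetter.class), after which the benchmarks can be run with and without each option to compare.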