[
https://issues.apache.org/jira/browse/KAFKA-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303345#comment-15303345
]
Guozhang Wang commented on KAFKA-3740:
--------------------------------------
Some thoughts on the default config values:
I think there are at least three use cases of RocksDB whose default configs
need to be treated differently:
1. Pure key-value store with put / get / delete: used for KTable aggregation
and KStream aggregation (note that windowed KStream aggregation currently
uses a range query, which is sub-optimal; we should really change it to
multiple gets to avoid flushing the cache).
2. Append-only puts and range queries: used for windowed KStream joins.
3. Update puts and range queries: used for non-key KTable-KTable joins. We are
about to add this support and I am writing up a design proposal for it.
For example, case 1) should usually be write-heavy, assuming we have a good
cache hit rate on top of RocksDB, so we should consider configuring a smaller
number of levels to reduce write amplification; for 2) and 3), we should turn
off the bloom filter by default, since it does not help range queries.
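The per-use-case defaults suggested above could be sketched roughly as follows.
This is only an illustration: the class, enum, and field names are all
hypothetical, nothing here calls the RocksDB API, and leaving the bloom filter
enabled for case 1 (and the level count untouched for cases 2 and 3) is an
assumption rather than something stated in the comment.

```java
import java.util.EnumMap;
import java.util.Map;

final class StoreDefaultsSketch {
    /** Hypothetical labels for the three access patterns listed above. */
    enum UseCase { KEY_VALUE, APPEND_ONLY_RANGE, UPDATE_RANGE }

    /** Hypothetical holder for the two knobs the comment mentions. */
    static final class Defaults {
        final boolean bloomFilterEnabled;
        final boolean reduceLevels;

        Defaults(boolean bloomFilterEnabled, boolean reduceLevels) {
            this.bloomFilterEnabled = bloomFilterEnabled;
            this.reduceLevels = reduceLevels;
        }
    }

    private static final Map<UseCase, Defaults> DEFAULTS = new EnumMap<>(UseCase.class);

    static {
        // Case 1: put / get / delete, write-heavy. Fewer levels to reduce write
        // amplification. Keeping the bloom filter on is an assumption.
        DEFAULTS.put(UseCase.KEY_VALUE, new Defaults(true, true));
        // Cases 2 and 3: range queries, so the bloom filter is turned off.
        // Leaving the level count at its default is an assumption.
        DEFAULTS.put(UseCase.APPEND_ONLY_RANGE, new Defaults(false, false));
        DEFAULTS.put(UseCase.UPDATE_RANGE, new Defaults(false, false));
    }

    /** Returns the suggested defaults for the given access pattern. */
    static Defaults defaultsFor(UseCase useCase) {
        return DEFAULTS.get(useCase);
    }

    public static void main(String[] args) {
        for (UseCase useCase : UseCase.values()) {
            Defaults d = defaultsFor(useCase);
            System.out.println(useCase + ": bloomFilter=" + d.bloomFilterEnabled
                    + ", reduceLevels=" + d.reduceLevels);
        }
    }
}
```

A store implementation could look up its defaults from such a table when it is
created and then let user-supplied configs override them.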
References:
https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide
https://vimeo.com/album/2920922/video/98428203
> Add configs for RocksDBStores
> -----------------------------
>
> Key: KAFKA-3740
> URL: https://issues.apache.org/jira/browse/KAFKA-3740
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Guozhang Wang
> Assignee: Henry Cai
> Labels: api, newbie
>
> Today most of the RocksDB configs are hard-coded inside {{RocksDBStore}},
> or the default values are used directly. We need to make them configurable
> for advanced users. For example, some default values may not work well
> for some scenarios:
> https://github.com/HenryCaiHaiying/kafka/commit/ccc4e25b110cd33eea47b40a2f6bf17ba0924576
>
> One way of doing that is to introduce a "RocksDBStoreConfigs" object, similar
> to "StreamsConfig", which defines all related RocksDB option configs and
> can be passed as key-value pairs to "StreamsConfig".