[ 
https://issues.apache.org/jira/browse/KAFKA-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303345#comment-15303345
 ] 

Guozhang Wang commented on KAFKA-3740:
--------------------------------------

Some thoughts on the default config values:

I think there are at least three use cases of RocksDB whose default configs 
need to be treated differently:

1. For a pure key-value store with put / get / delete, used for KTable 
aggregation and KStream aggregation (note that for now windowed KStream 
aggregation uses a range query, which is sub-optimal; we should really 
change it to multiple gets to avoid flushing the cache).

2. For append-only puts and range queries, used for windowed KStream joins.

3. For update puts and range queries, used for non-key KTable-KTable joins: 
we are about to add this support, and I am writing up a design proposal for it.

For example, case 1) should usually be write-heavy, assuming we have a good 
cache hit rate on top of RocksDB, so we should consider setting a smaller 
number of levels to reduce write amplification; for 2) and 3), we should 
turn off the bloom filter by default since it does not help range queries.
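
To make the idea concrete, here is a minimal sketch of per-use-case defaults. 
It is only an illustration: the class name, the enum, the config key names 
and the value 4 are all hypothetical and not part of Kafka or RocksDB. It 
also assumes case 1) keeps the bloom filter on, which is my reading rather 
than something measured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: pick default store settings per access pattern.
class RocksDbDefaultsSketch {

    // The three use cases listed above.
    enum AccessPattern { POINT_LOOKUP, APPEND_RANGE, UPDATE_RANGE }

    // Key names are made up for illustration only.
    static final String BLOOM_FILTER = "bloom.filter.enabled";
    static final String NUM_LEVELS = "num.levels";

    static Map<String, Object> defaultsFor(AccessPattern pattern) {
        Map<String, Object> cfg = new LinkedHashMap<>();
        switch (pattern) {
            case POINT_LOOKUP:
                // Write-heavy behind a cache: fewer levels to cut write
                // amplification. 4 is an illustrative value, not a tuned one.
                cfg.put(BLOOM_FILTER, true);
                cfg.put(NUM_LEVELS, 4);
                break;
            case APPEND_RANGE:
            case UPDATE_RANGE:
                // Range queries do not benefit from the bloom filter.
                cfg.put(BLOOM_FILTER, false);
                // 7 is, as far as I know, RocksDB's own default.
                cfg.put(NUM_LEVELS, 7);
                break;
        }
        return cfg;
    }
}
```

A store builder could call defaultsFor(...) with the pattern it knows it 
will use and then apply any user overrides on top.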

References:

https://github.com/facebook/rocksdb/wiki/RocksDB-Tuning-Guide
https://vimeo.com/album/2920922/video/98428203

> Add configs for RocksDBStores
> -----------------------------
>
>                 Key: KAFKA-3740
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3740
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Henry Cai
>              Labels: api, newbie
>
> Today most of the RocksDB configs are hard-coded inside {{RocksDBStore}}, 
> or the default values are directly used. We need to make them configurable 
> for advanced users. For example, some default values may not work perfectly 
> for some scenarios: 
> https://github.com/HenryCaiHaiying/kafka/commit/ccc4e25b110cd33eea47b40a2f6bf17ba0924576
>  
> One way of doing that is to introduce a "RocksDBStoreConfigs" object similar 
> to "StreamsConfig", which defines all related RocksDB option configs, that 
> can be passed as key-value pairs to "StreamsConfig".
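
The key-value idea in the description could look roughly like the sketch 
below. It is an assumption-laden illustration, not the proposed API: the 
"rocksdb." prefix, the class name and the method name are hypothetical, and 
a plain Map stands in for the properties handed to "StreamsConfig".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: pull RocksDB-related entries out of a larger
// key-value config by a shared prefix.
class RocksDbStoreConfigsSketch {

    // Made-up prefix marking entries meant for the RocksDB store.
    static final String PREFIX = "rocksdb.";

    static Map<String, Object> rocksDbOverrides(Map<String, Object> streamsProps) {
        Map<String, Object> overrides = new HashMap<>();
        for (Map.Entry<String, Object> e : streamsProps.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                // Strip the prefix so the store sees only its own option name.
                overrides.put(e.getKey().substring(PREFIX.length()), e.getValue());
            }
        }
        return overrides;
    }
}
```

Entries without the prefix are ignored, so unrelated streams settings pass 
through untouched.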



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
