[ https://issues.apache.org/jira/browse/IGNITE-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17356500#comment-17356500 ]
Ivan Bessonov commented on IGNITE-14747:
----------------------------------------

Some research results:
* RocksDB is pretty easy to integrate. Among other features, it lets you store arbitrary data in sorted order, iterate over it, and take a snapshot of the state that stays valid until the next restart.
* Every DB instance can have multiple "column families". I view them as partitions and possibly as "index.bin" candidates. There is support for multi-column-family batch writes, which is good for SQL indexes. There is also "dropColumnFamilies" to evict multiple partitions at once.
** Here we have a potential issue: evicted partitions will still be present in the LSM tree until it is fully compacted. That will take some time, meaning that we will sometimes store too much data, on top of the duplicated entries in the LSM tree.
* Every instance has its own WAL. We should consider disabling it, because it will be replaced with rebalancing from the RAFT log.
* For the first implementation we could create a new RocksDB instance for every table.
** Cons: memory consumption is hard to configure. As far as I know, we can't force several RocksDB instances to use shared memory restrictions.
** Pros: better read performance. Every cache tree is separate and hence much smaller, giving you fewer lookups in general.
* Usage of RocksDB for the RAFT log. From what I understand, the log is basically a "long -> value" cache with an auto-incrementing key and extremely rare update operations, so it is almost append-only. This approach may be suboptimal for a very simple reason: merging layer files is effectively equal to concatenation, but there is no way to tell that to the engine. This will lead to excessive IO when we don't need it.
* Lifecycle: not much to say here. We should start RocksDB before starting caches and stop it after stopping caches. There should be an explicit way to pass the partition number to the API or something similar; these details will be decided later.
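To illustrate the RAFT log point above, here is a minimal, self-contained Java sketch. It is a toy model and not RocksDB code: the class and method names (AppendOnlyCompactionSketch, mergeRuns, concatenate) are hypothetical, and a "run" here is just a list of log indexes standing in for one layer file. The assumption is that each run was flushed in order, so its keys are all greater than the keys of the previous run. Under that assumption a generic k-way merge produces exactly the concatenation of the runs, yet it still has to read and compare every entry.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Toy model of merging sorted runs of an append-only log (hypothetical names, not RocksDB API). */
final class AppendOnlyCompactionSketch {

    /** Generic k-way merge of sorted runs: every key goes through the heap and is rewritten. */
    static List<Long> mergeRuns(List<List<Long>> runs) {
        // Heap entries are {key, run index, position inside the run}, ordered by key.
        PriorityQueue<long[]> heap =
                new PriorityQueue<>(Comparator.comparingLong((long[] e) -> e[0]));
        for (int i = 0; i < runs.size(); i++) {
            if (!runs.get(i).isEmpty()) {
                heap.add(new long[] {runs.get(i).get(0), i, 0});
            }
        }

        List<Long> merged = new ArrayList<>();
        while (!heap.isEmpty()) {
            long[] entry = heap.poll();
            merged.add(entry[0]);

            int run = (int) entry[1];
            int next = (int) entry[2] + 1;
            List<Long> keys = runs.get(run);
            if (next < keys.size()) {
                heap.add(new long[] {keys.get(next), run, next});
            }
        }
        return merged;
    }

    /** What an append-only log actually needs: the runs glued together in flush order. */
    static List<Long> concatenate(List<List<Long>> runs) {
        List<Long> result = new ArrayList<>();
        for (List<Long> run : runs) {
            result.addAll(run);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three runs with auto-incrementing keys, flushed one after another.
        List<List<Long>> runs = List.of(
                List.of(1L, 2L, 3L),
                List.of(4L, 5L),
                List.of(6L, 7L, 8L));

        // Both approaches give the same ordered result; only the amount of work differs.
        System.out.println(mergeRuns(runs).equals(concatenate(runs)));
    }
}
```

In this model the merge result and the plain concatenation are identical, which is the reason the extra compaction IO looks avoidable for a log-like workload. Whether a real engine can be configured to skip that work is outside the scope of this sketch.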
> RocksDB research: configuration, lifecycle, basic integration
> -------------------------------------------------------------
>
> Key: IGNITE-14747
> URL: https://issues.apache.org/jira/browse/IGNITE-14747
> Project: Ignite
> Issue Type: New Feature
> Reporter: Sergey Chugunov
> Assignee: Ivan Bessonov
> Priority: Major
> Labels: iep-74, ignite-3
> Fix For: 3.0
>
>
> In accordance with
> [IEP-74|https://cwiki.apache.org/confluence/display/IGNITE/IEP-74+Data+Storage]
> first implementation of persistent Storage will be based on RocksDB K-V
> storage.
> Thus research is needed on how to integrate it into ignite-3 realm. The
> following questions should be covered:
> # What additional configuration properties are needed.
> # How to reconcile lifecycle of RocksDB instance with Ignite node lifecycle.
> # How RocksDB abstractions (e.g. partitions) match with Ignite abstractions.
> Also scope of tasks to implement basic Storage API over RocksDB should be
> defined.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)