Hi, I have read the documentation and source code of the state management with RocksDB, and before I use it, I'm concerned about one thing: am I right that currently, when state is checkpointed, the whole RocksDB state is snapshotted? There is no incremental (diff) snapshotting, is there? If so, this seems infeasible for keeping state measured in tens or hundreds of GBs (and you do reach that size when you want to keep the state embedded in the streaming application instead of going out to Cassandra, HBase or another DB). It would simply cost too much to snapshot such a large state.
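
To make clear what I mean by incremental (diff) snapshotting, here is a toy sketch. It is not Flink or RocksDB code: `ToyStateStore`, `full_snapshot`, `incremental_snapshot` and `restore` are hypothetical names I made up for illustration, and I assume that values are never `None` so that `None` can mark a deletion.

```python
class ToyStateStore:
    """Hypothetical key-value state that tracks keys changed since the last snapshot."""

    def __init__(self):
        self.data = {}
        self.dirty = set()  # keys written or deleted since the last snapshot

    def put(self, key, value):
        self.data[key] = value
        self.dirty.add(key)

    def delete(self, key):
        self.data.pop(key, None)
        self.dirty.add(key)

    def full_snapshot(self):
        # Copies every entry, so the cost grows with the total state size.
        self.dirty.clear()
        return dict(self.data)

    def incremental_snapshot(self):
        # Copies only entries touched since the previous snapshot.
        # A value of None marks a deleted key (assumption: real values are never None).
        diff = {key: self.data.get(key) for key in self.dirty}
        self.dirty.clear()
        return diff


def restore(base, diffs):
    """Rebuild state from one full snapshot plus the diffs taken after it, in order."""
    state = dict(base)
    for diff in diffs:
        for key, value in diff.items():
            if value is None:
                state.pop(key, None)
            else:
                state[key] = value
    return state


store = ToyStateStore()
for i in range(1000):
    store.put(f"user-{i}", i)
base = store.full_snapshot()          # 1000 entries copied

store.put("user-7", 700)
store.delete("user-8")
diff = store.incremental_snapshot()   # only the 2 touched keys copied

assert restore(base, [diff]) == store.data
```

With a full snapshot the work per checkpoint is proportional to the whole state, while with a diff it is proportional to what changed since the last checkpoint, which is the property I am asking about.
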
Samza, as a good point of comparison, writes every state change to a Kafka topic and treats that changelog as a form of snapshot. Of course, replaying the full changelog when the app restarts would be too costly, which is why the changelog topic is compacted. I also think Samza takes a state snapshot from time to time anyway (but I'm not sure about that).

Thanks for answering my doubts,
Krzysztof