[ https://issues.apache.org/jira/browse/KAFKA-3184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105816#comment-17105816 ]
Guozhang Wang commented on KAFKA-3184:
--------------------------------------

Hello [~nizhikov], for many state-light applications it is not worth having persistent stores. With in-memory stores, however, we have no persistent checkpoints, so upon rolling upgrades or scaling events we always have to re-bootstrap the whole state from the beginning, and that limits the usefulness of in-memory stores. So when I created this ticket about 4 years ago, my main motivation was to make in-memory stores more attractive for scenarios where your state is relatively small.

Now, with the many rebalance improvements we've done, including KIP-441, I think just allowing local checkpointing for in-memory state stores may no longer be that interesting. Instead, I think what [~vvcephei] was considering is to provide a general checkpointing API for state stores in Streams (not only for in-memory but also for persistent stores), where the checkpoint location can be either local disk or remote storage. Here the design scope is primarily:

1) the API design for both checkpointing and loading checkpoints into the local state stores;
2) the mechanism of the checkpointing, e.g. whether it should be async, whether it should be executed on separate threads, etc.

I think this is, as of today, a more appealing feature to add, and if you are interested, we should just create a new JIRA for it rather than piggy-backing on 3184.

> Add Checkpoint for In-memory State Store
> ----------------------------------------
>
>                 Key: KAFKA-3184
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3184
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Nikolay Izhikov
>            Priority: Major
>              Labels: user-experience
>
> Currently Kafka Streams does not make a checkpoint of the persistent state
> store upon committing, which would be expensive since it is "stopping the
> world" and writes to disk: for example, RocksDB would require you to copy the
> file directory to make a copy naively.
> However, for in-memory stores checkpointing may be doable in an asynchronous
> manner, hence it can be done quickly. And the benefit of having intermediate
> checkpoints is to avoid restoring from scratch if standby tasks are not
> present.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)