[ https://issues.apache.org/jira/browse/KAFKA-13627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783894#comment-17783894 ]
Matthias J. Sax commented on KAFKA-13627: ----------------------------------------- As you pointed out, the KIP did not make progress for a long time. I don't see any reason why it cannot be resurrected though. > Topology changes shouldn't require a full reset of local state > -------------------------------------------------------------- > > Key: KAFKA-13627 > URL: https://issues.apache.org/jira/browse/KAFKA-13627 > Project: Kafka > Issue Type: Improvement > Components: streams > Affects Versions: 3.1.0 > Reporter: Nicholas Telford > Priority: Major > > [KIP-816|https://cwiki.apache.org/confluence/display/KAFKA/KIP-816%3A+Topology+changes+without+local+state+reset] > When changes are made to a Topology that modifies its structure, users must > use the Application Reset tool to reset the local state of their application > prior to deploying the change. Consequently, these changes require rebuilding > all local state stores from their changelog topics in Kafka. > The time and cost of rebuilding state stores is determined by the size of the > state stores, and their recent write history, as rebuilding a store entails > replaying all recent writes to the store. For applications that have very > large stores, or stores with extremely high write-rates, the time and cost of > rebuilding all state in the application can be prohibitively expensive. This > is a significant barrier to building highly scalable applications with good > availability. > Changes to the Topology that do not directly affect a state store should not > require the local state of that store to be reset/deleted. This would allow > applications to scale to very large data sets, whilst permitting the > application behaviour to evolve over time. > h1. Background > Tasks in a Kafka Streams Topology are logically grouped by “Topic Group'' > (aka. Subtopology). Topic Groups are assigned an ordinal (number), based on > their position in the Topology. This Topic Group ordinal is used as the > prefix for all Task IDs: {{{}<topic-group-ordinal>_<partition-number>{}}}, > e.g. {{2_14}} > If new Topic Groups are added, old Topic Groups are removed, or existing > Topic Groups are re-arranged, this can cause the assignment of ordinals to > change {_}even for Topic Groups that have not been modified{_}. > When the assignment of ordinals to Topic Groups changes, existing Tasks are > invalidated, as they no longer correspond to the correct Topic Groups. Local > state is located in directories that include the Task ID (e.g. > {{{}/state/dir/2_14/mystore/rocksdb/…{}}}), and since the Tasks have all been > invalidated, all existing local state directories are also invalid. > Attempting to start an application that has undergone these ordinal changes, > without first clearing the local state, will cause Kafka Streams to attempt > to use the existing local state for the wrong Tasks. Kafka Streams detects > this discrepancy and prevents the application from starting. -- This message was sent by Atlassian Jira (v8.20.10#820010)