Hello Nick,

Thanks for bringing this up and for the proposed options. I read through
your writeup and here are some of my thoughts:

1) When changing the topology of a Kafka Streams application, the developer
needs to first decide whether the whole topology's persisted state (the
state stores and their changelogs, the repartition topics, and the external
source/sink topics) or only part of it can be reused. This involves two
types of changes:

a) structural changes to the topology, such as a processor node being
added/removed, an intermediate topic being added/removed, etc.
b) semantic changes to a processor, such as a numerical filter node changing
its filter threshold, etc.

Today both of them are more or less determined by developers manually.
However, although automatically detecting changes of type b) is hard if
not impossible, automatically detecting changes of type a) is doable since
it depends only on the following information:
* number of sub-topologies, and their order (i.e. sequence of ids)
* used state stores and changelog topics within the sub-topology
* used repartition topics
* etc
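
As a rough sketch of what such a structural check could look like: the
types below (`SubTopologyShape`, `StructuralCheck`) are made up for
illustration and are not Kafka Streams APIs; in practice the inputs might
be collected from something like `Topology#describe()`. The check simply
compares the facts listed above, position by position.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class StructuralCheck {
    /** True when both topologies have the same sub-topologies, in the same order. */
    static boolean structurallyEqual(List<SubTopologyShape> oldShapes,
                                     List<SubTopologyShape> newShapes) {
        // List.equals is order-sensitive; Set.equals (inside the record) is not.
        return oldShapes.equals(newShapes);
    }

    /** Ids of sub-topologies whose shape is identical at the same position. */
    static List<Integer> unchangedSubTopologies(List<SubTopologyShape> oldShapes,
                                                List<SubTopologyShape> newShapes) {
        List<Integer> unchanged = new ArrayList<>();
        int common = Math.min(oldShapes.size(), newShapes.size());
        for (int i = 0; i < common; i++) {
            if (oldShapes.get(i).equals(newShapes.get(i))) {
                unchanged.add(oldShapes.get(i).id());
            }
        }
        return unchanged;
    }

    public static void main(String[] args) {
        // Example names are invented for this sketch.
        SubTopologyShape first = new SubTopologyShape(
            0, Set.of("counts"), Set.of("app-counts-changelog"), Set.of());
        SubTopologyShape second = new SubTopologyShape(
            1, Set.of("totals"), Set.of("app-totals-changelog"),
            Set.of("app-by-user-repartition"));
        System.out.println(structurallyEqual(List.of(first, second),
                                             List.of(first, second)));
    }
}

// Hypothetical model of one sub-topology: its id plus the persisted names it uses.
record SubTopologyShape(int id,
                        Set<String> stateStores,
                        Set<String> changelogTopics,
                        Set<String> repartitionTopics) {}
```

Note that this compares names exactly; deciding that two topologies are
isomorphic when only the generated names differ would need a looser
comparison than this sketch shows.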

So let's assume that in the long run we can indeed automatically determine
whether a topology, or part of it (a sub-topology), is structurally the
same. What we can then do is "translate" the old persisted state names to
the new, isomorphic topology's names. Following this thought, I'm leaning
towards option B in your proposal. But since automatically detecting
structural changes is out of scope for this KIP, I feel we can consider
adding some sort of "migration tool" that moves from an old topology to a
new one by renaming all the persisted state (store dirs and names, topic
names).
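
To illustrate the renaming idea, here is a minimal sketch of how such a tool
might compute its plan. It assumes the changelog naming convention
`<application.id>-<storeName>-changelog`; `MigrationPlan` and its methods
are hypothetical names, and the old-to-new store mapping is taken as given.
The sketch only computes the name mapping; it does not touch any topic or
directory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MigrationPlan {
    // Assumed convention: "<application.id>-<storeName>-changelog".
    static String changelogTopic(String applicationId, String storeName) {
        return applicationId + "-" + storeName + "-changelog";
    }

    /**
     * Given old-to-new store names, returns old-to-new changelog topic names.
     * Stores whose name did not change are left out of the plan.
     */
    static Map<String, String> changelogRenames(String applicationId,
                                                Map<String, String> storeRenames) {
        Map<String, String> plan = new LinkedHashMap<>();
        for (Map.Entry<String, String> rename : storeRenames.entrySet()) {
            String oldStore = rename.getKey();
            String newStore = rename.getValue();
            if (!oldStore.equals(newStore)) {
                plan.put(changelogTopic(applicationId, oldStore),
                         changelogTopic(applicationId, newStore));
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        // Example: a generated store name whose index shifted after a topology change.
        Map<String, String> storeRenames = new LinkedHashMap<>();
        storeRenames.put("KSTREAM-AGGREGATE-STATE-STORE-0000000003",
                         "KSTREAM-AGGREGATE-STATE-STORE-0000000005");
        storeRenames.put("counts", "counts"); // explicitly named, unchanged
        changelogRenames("my-app", storeRenames)
            .forEach((from, to) -> System.out.println(from + " -> " + to));
    }
}
```

The same mapping approach would extend to repartition topics and local
store directories, each with its own naming rule.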


Guozhang


On Tue, Jan 25, 2022 at 9:10 AM Nick Telford <nick.telf...@gmail.com> wrote:

> Hi everyone,
>
> I'd like to start a discussion on Kafka Streams KIP-816 (
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-816%3A+Topology+changes+without+local+state+reset
> )
>
> This KIP outlines 3 possible solutions to the problem, and I plan to
> whittle this down to a definitive solution based on this discussion.
>
> Of the 3 proposed solutions:
> * 'A' is probably the "correct" solution, but is also quite a significant
> change.
> * 'B' is the least invasive, but most "hacky" solution.
> * 'C' requires a change to the wire protocol and will likely have
> unintended consequences. C is also the least complete solution, and will
> need significant additional work to make it work.
>
> Please let me know if the Motivation and Background sections need more
> clarity.
>
> Regards,
>
> Nick Telford
>


-- 
-- Guozhang
