ableegoldman opened a new pull request #10788:
URL: https://github.com/apache/kafka/pull/10788


   Pt. 1: [#10609](https://github.com/apache/kafka/pull/10609)
   Pt. 2: [#10683](https://github.com/apache/kafka/pull/10683)
   
   In Pt. 3 we implement the `addNamedTopology` API. This can be used to update 
the processing topology of a running Kafka Streams application without 
resetting the app, or even pausing/restarting the process. It's up to the user 
to call this API on every instance of the application so that all clients are 
able to run the newly added NamedTopology. This should not be too much of a 
burden: each client only needs to be updated eventually, and the updates do not 
have to happen synchronously or even in order. Under the covers, Streams will 
take care of keeping the internal state consistent while the various clients 
converge on the latest view of the full topology.
   
   Internally, Streams leverages rebalances and subscriptions that report each 
client's currently known NamedTopologies. When a new NamedTopology is added, a 
rebalance is triggered to distribute the tasks that correspond to it. To 
minimize disruption and wasted work, the assignor just computes the desired 
eventual assignment of these new tasks to clients, regardless of whether the 
target client has been issued the `addNamedTopology` request yet. If a client 
receives tasks for a NamedTopology it does not yet recognize, it simply files 
them away and continues to process its other topologies. Once the client is 
given a new NamedTopology, it checks whether the name matches that of any of 
the unknown tasks; if it does, those tasks begin processing without triggering 
a new rebalance. If the new NamedTopology does not match any unknown tasks the 
client has received, the client must trigger a fresh rebalance for this new 
NamedTopology.
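
   To illustrate the client-side behavior described above, here is a minimal 
sketch of the bookkeeping a single client might do. It is a toy model, not the 
code in this PR: the class `NamedTopologyClientSketch`, its methods, and the 
string task ids are all hypothetical names chosen for illustration, and the 
real Streams internals are considerably more involved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of one client's bookkeeping; not the actual Streams internals.
final class NamedTopologyClientSketch {
    // Names of the NamedTopologies this client has been given so far.
    private final Set<String> knownTopologies = new HashSet<>();
    // Tasks currently being processed, keyed by topology name.
    private final Map<String, List<String>> activeTasks = new HashMap<>();
    // Tasks assigned for topologies this client does not recognize yet.
    private final Map<String, List<String>> pendingTasks = new HashMap<>();

    // Called when a rebalance hands this client a task. Tasks of an unknown
    // topology are filed away instead of being processed.
    void onTaskAssigned(String topologyName, String taskId) {
        Map<String, List<String>> target =
            knownTopologies.contains(topologyName) ? activeTasks : pendingTasks;
        target.computeIfAbsent(topologyName, k -> new ArrayList<>()).add(taskId);
    }

    // Models the user adding a topology on this client.
    // Returns true if the client would have to trigger a fresh rebalance,
    // false if previously filed-away tasks could simply start processing.
    boolean addNamedTopology(String topologyName) {
        knownTopologies.add(topologyName);
        List<String> parked = pendingTasks.remove(topologyName);
        if (parked == null) {
            return true; // no unknown tasks matched this name
        }
        activeTasks.computeIfAbsent(topologyName, k -> new ArrayList<>()).addAll(parked);
        return false;
    }

    List<String> activeTasksFor(String topologyName) {
        return activeTasks.getOrDefault(topologyName, Collections.emptyList());
    }

    public static void main(String[] args) {
        NamedTopologyClientSketch client = new NamedTopologyClientSketch();

        // The assignor hands out a task before this client knows the topology.
        client.onTaskAssigned("orders", "orders-task-0");
        System.out.println("active before add: " + client.activeTasksFor("orders"));

        // The name matches the filed-away task, so no rebalance is needed.
        System.out.println("rebalance needed: " + client.addNamedTopology("orders"));
        System.out.println("active after add: " + client.activeTasksFor("orders"));

        // No filed-away tasks exist for this name, so a rebalance is needed.
        System.out.println("rebalance needed: " + client.addNamedTopology("payments"));
    }
}
```

   The return value of the sketch's `addNamedTopology` captures the two 
outcomes from the description: a name that matches filed-away tasks lets them 
start immediately, while an unmatched name means the client has to ask for a 
rebalance.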



