[ 
https://issues.apache.org/jira/browse/KAFKA-12775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344367#comment-17344367
 ] 

Bruno Cadonna commented on KAFKA-12775:
---------------------------------------

[~nhab] Would upgrading all Kafka Streams clients at once be a viable option 
for you? 
It seems to me that the main issue in your case is the rolling upgrade. It 
seems that the group leader that computes the assignment does not know anything 
about the newly added task, which is expected when the group leader does not 
already have the new task, but the normal member already has it. However, if 
you stop all Kafka Streams clients, upgrade them and restart them, you should 
not run into that issue. I guess, a pre-requirement for this is that all 
operators are explicitly named so that the names will not change when operators 
are added.
This will not solve the topology changes issue in general, but at least it 
could work for your specific case.  

> StreamsPartitionAssignor / ClientState throws an exception when a new Task 
> gets added to a KStreams Application in a Backwards-Compatible Topology Change
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-12775
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12775
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Nico Habermann
>            Priority: Major
>
> KAFKA-6145 and KAFKA-10079 added an [exception if the Partition Assignor 
> tries to look up the lag for a TaskId that seemingly does not 
> exist|https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/assignment/ClientState.java#L325].
>  
> I believe this is a functional regression.
> Before, it was possible for Streams users to make backwards-compatible 
> topology changes and roll those out, without having to do a complete restore 
> or reload.
> For example:
> Existing sample topology:
>  
> {code:java}
> stream1 = stream(topic)
> stream1
>   .map(...)
>   .to(output){code}
> And doing this backwards-compatible change:
> {code:java}
> stream1 = stream(topic)
> ++ table = stream(topic2).through(repartition-topic)/repartition().toTable()
> stream1
>   .map(...)
> ++  .join(table)
>   .to(output){code}
>  
> This effectively creates a new subtopology with a new task for the table 
> repartition.
> In older KStreams versions, it would have been possible to simply roll this 
> change out.
> Since 2.6, rolling this out will crash the stream because the linked 
> exception gets thrown when StreamsPartitionAssignor#getPreviousTasksByLag  
> tries to look up the lag for the new table-repartition-task
>  
> At this time, the only possible way to avoid this exception seems to be 
> deleting all local state and doing a complete restore with the new topology 
> change included.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to