[ 
https://issues.apache.org/jira/browse/KAFKA-7669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16696400#comment-16696400
 ] 

Matthias J. Sax commented on KAFKA-7669:
----------------------------------------

Thanks for reporting this. Note that KafkaStreams requires all application 
instances to execute the exact same topology. All operators are automatically 
assigned names, and those names are also used to name repartition and 
changelog topics. The names differ if the operators are added in a different 
order.
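
To illustrate the effect, here is a minimal sketch of order-dependent naming. 
It is a simplified model, not the actual KafkaStreams code: the class 
AutoNamingSketch, the method assignStoreNames, and the assumption that each 
branch consumes exactly two indexes (one for the filter, one for the 
aggregation) are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AutoNamingSketch {

    // Hypothetical stand-in for a topology builder: every operator that is
    // added takes the next value of a shared counter, and that value becomes
    // part of the generated state store name.
    static Map<String, String> assignStoreNames(List<String> branchesInDefinitionOrder) {
        Map<String, String> storeByBranch = new LinkedHashMap<>();
        int counter = 0;
        for (String branch : branchesInDefinitionOrder) {
            counter++; // assumed: the filter operator of this branch takes one index
            int storeIndex = counter++; // the aggregation takes the next one
            storeByBranch.put(branch,
                    String.format("KSTREAM-AGGREGATE-STATE-STORE-%010d", storeIndex));
        }
        return storeByBranch;
    }

    public static void main(String[] args) {
        // Two instances define the same branches, but in a different order.
        Map<String, String> instanceA = assignStoreNames(Arrays.asList("1", "2", "3"));
        Map<String, String> instanceB = assignStoreNames(Arrays.asList("3", "1", "2"));

        // The same generated store name ends up on different branches.
        System.out.println("instance A, branch 1: " + instanceA.get("1"));
        System.out.println("instance B, branch 3: " + instanceB.get("3"));
    }
}
```

In this model, branch "1" on one instance and branch "3" on the other receive 
the same generated store name, so both would write to the same changelog 
topic.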

Thus, you should rewrite your program in a way that guarantees the ordering.

As an alternative, you can provide names for all stateful operators explicitly 
(you need to upgrade to 2.1 for this). In this case, KafkaStreams uses the 
provided names to name repartition and changelog topics. Thus, the order 
should no longer matter, as long as the final topology graph is the same.
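
A minimal sketch of why explicit names remove the order dependence follows. 
It is a simplified model rather than real KafkaStreams code: the class 
ExplicitNamingSketch, the method changelogTopics, the application id "my-app" 
and the store names "aggregate-store-<branch>" are hypothetical. It relies 
only on the convention that a changelog topic is named 
<application.id>-<store name>-changelog. In an actual 2.1 topology, the names 
would be supplied through calls such as Materialized.as(...) for state stores 
and Grouped.as(...) for repartition topics.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ExplicitNamingSketch {

    // Derives the changelog topic of each branch from a name that depends
    // only on the branch itself, never on the position in the definition.
    static Map<String, String> changelogTopics(String applicationId,
                                               List<String> branchesInDefinitionOrder) {
        Map<String, String> topicByBranch = new TreeMap<>();
        for (String branch : branchesInDefinitionOrder) {
            String storeName = "aggregate-store-" + branch; // hypothetical explicit name
            topicByBranch.put(branch, applicationId + "-" + storeName + "-changelog");
        }
        return topicByBranch;
    }

    public static void main(String[] args) {
        // Same branches, different definition order on the two instances.
        Map<String, String> instanceA = changelogTopics("my-app", Arrays.asList("1", "2", "3"));
        Map<String, String> instanceB = changelogTopics("my-app", Arrays.asList("3", "1", "2"));

        // Both instances agree on the topic of every branch.
        System.out.println("same topic names: " + instanceA.equals(instanceB));
        System.out.println("branch 2: " + instanceA.get("2"));
    }
}
```

The names must still be unique per branch. Reusing one explicit name for 
several branches would bring back the original collision.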

> Stream topology definition is not robust to the ordering changes
> ----------------------------------------------------------------
>
>                 Key: KAFKA-7669
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7669
>             Project: Kafka
>          Issue Type: Wish
>          Components: streams
>    Affects Versions: 2.0.0
>            Reporter: Mateusz Owczarek
>            Priority: Major
>
> It seems that if the user does not guarantee the order of the stream topology 
> definition, he may end up with multiple stream branches having the same 
> internal changelog (and repartition, if created) topic. 
> Let's assume:
> {code:java}
> val initialStream = new StreamsBuilder().stream(sth);
> val someStrings = (1 to 10).map(_.toString)
> val notGuaranteedOrderOfStreams: Map[String, KStream[...]] =
>   someStrings.map(s => s -> initialStream.filter(...)).toMap
> {code}
> When the user now defines common aggregation logic for 
> notGuaranteedOrderOfStreams and runs multiple instances of the application, 
> the KSTREAM-AGGREGATE-STATE-STORE topic names will not be unique, and the 
> topics will contain results of different streams from the 
> notGuaranteedOrderOfStreams map.
> All of this without a single warning that the topology (or just the order of 
> the topology definition) differs in different instances of the Kafka Streams 
> application.
> Also, I am concerned that the ids in 
> "KSTREAM-AGGREGATE-STATE-STORE-id-changelog" match so well across the 
> different application instances (and different topologies).
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
