[ 
https://issues.apache.org/jira/browse/KAFKA-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16722943#comment-16722943
 ] 

Matthias J. Sax commented on KAFKA-7293:
----------------------------------------

>From my understanding, nothing would happen (ie, no exception) but the 
>computation will we executed, but compute an incorrect result.

> Merge followed by groupByKey/join might violate co-partioning
> -------------------------------------------------------------
>
>                 Key: KAFKA-7293
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7293
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Priority: Major
>
> The merge() operations can be applied to input KStreams that have a different 
> number of tasks (ie, input topic partitions). For this case, the input topics 
> are not co-partitioned and thus the result KStream is not partitioned even if 
> each input KStream is partitioned by its own.
> Because, no "repartitionRequired" flag is set on the input KStreams, the flag 
> is also not set on the output KStream. Hence, if a groupByKey() or join() 
> operation is applied the output KStream, we don't insert a repartition topic. 
> However, repartitioning would be required because the KStream is not 
> partitioned.
> We cannot detect this during compile time, because the number or partitions 
> is unknown, and thus, we cannot decide if repartitioning is required or not. 
> However, we can add a runtime check similar to joins() that checks if data is 
> correctly (co-)partitioned and if not, we can raise a runtime exception.
> Note, for merge() in contrast to join(), we should only check for 
> co-partitioning, if the merge() is followed by a groupByKey() or join() 
> operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to