[jira] [Comment Edited] (KAFKA-12909) Allow users to opt-into spurious left/outer stream-stream join improvement
[ https://issues.apache.org/jira/browse/KAFKA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522502#comment-17522502 ]

Matthias J. Sax edited comment on KAFKA-12909 at 4/21/22 9:13 PM:
--

{quote}It makes sense but it is still kind of hard to wrap my head around it.
{quote}
Assume you have the following input (and a join window of 10):

left:
right:
result: ,100> , 105>

The second result record is not a replacement for the first result record. The result is really two records. Doing the same with left-join (and eager emit) you get:

result: ,95>, ,100> , 105>

– for this case, the first "left-join" result is clearly incorrect, right? But how can you know that the second record is an update to the first one, while the third record is _not_ an update to the second one?

Maybe also check out: [https://www.confluent.io/events/kafka-summit-europe-2021/temporal-joins-in-kafka-streams-and-ksqldb/]

was (Author: mjsax):
{quote}It makes sense but it is still kind of hard to wrap my head around it.
{quote}
Assume you have the following input (and a join window of 10):

left:
right:
result: ,100> , 105>

The second result record is not a replacement for the first result record. The result is really two records. Doing the same with left-join (and eager emit) you get:

result: ,95>, ,100> , 105>

– for this case, the first "left-join" result is clearly incorrect, right? But how can you know that the second record is an update to the first one, while the third record is _not_ an update to the second one?

Maybe also check out: https://www.confluent.io/events/kafka-summit-europe-2021/temporal-joins-in-kafka-streams-and-ksqldb/

> Allow users to opt-into spurious left/outer stream-stream join improvement
> --
>
> Key: KAFKA-12909
> URL: https://issues.apache.org/jira/browse/KAFKA-12909
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Priority: Blocker
> Fix For: 3.1.0
>
>
> https://issues.apache.org/jira/browse/KAFKA-10847 improves left/outer
> stream-stream join, by not emitting left/outer results eagerly, but only
> after the grace period has passed.
> While this change is desired, there is an issue with regard to upgrades: if
> users don't specify a grace period, we fall back to a 24h default. Thus,
> left/outer join results would only be emitted 24h after the join window end.
> This change in behavior could break existing applications when upgrading to
> the 3.0.0 release. – And even if users do set a grace period explicitly, it's
> still unclear if the new delayed output behavior would work for them.
> Thus, we propose to disable the fix of KAFKA-10847 by default, and let users
> opt into the fix explicitly instead.
> To allow users to enable the fix, we want to piggy-back on KIP-633
> (https://issues.apache.org/jira/browse/KAFKA-8613) that deprecated the
> existing `JoinWindows.of()` and `JoinWindows#grace()` methods in favor of
> `JoinWindows.ofSizeAndGrace()` – if users don't update their code, we would
> keep the fix disabled, and thus, if users upgrade their app nothing changes.
> Only if users switch to the new `ofSizeAndGrace()` API, we enable the fix and
> thus give users the opportunity to opt in explicitly and pick an appropriate
> grace period for their application.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
[jira] [Comment Edited] (KAFKA-12909) Allow users to opt-into spurious left/outer stream-stream join improvement
[ https://issues.apache.org/jira/browse/KAFKA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522502#comment-17522502 ]

Matthias J. Sax edited comment on KAFKA-12909 at 4/14/22 7:12 PM:
--

{quote}It makes sense but it is still kind of hard to wrap my head around it.
{quote}
Assume you have the following input (and a join window of 10):

left:
right:
result: ,100> , 105>

The second result record is not a replacement for the first result record. The result is really two records. Doing the same with left-join (and eager emit) you get:

result: ,95>, ,100> , 105>

– for this case, the first "left-join" result is clearly incorrect, right? But how can you know that the second record is an update to the first one, while the third record is _not_ an update to the second one?

Maybe also check out: https://www.confluent.io/events/kafka-summit-europe-2021/temporal-joins-in-kafka-streams-and-ksqldb/

was (Author: mjsax):
{quote}It makes sense but it is still kind of hard to wrap my head around it.
{quote}
Assume you have the following input (and a join window of 10):

left:
right:
result: ,100> , 105>

The second result record is not a replacement for the first result record. The result is really two records. Doing the same with left-join (and eager emit) you get:

result: ,95>, ,100> , 105>

– for this case, the first "left-join" result is clearly incorrect, right? But how can you know that the second record is an update to the first one, while the third record is _not_ an update to the second one?

> Allow users to opt-into spurious left/outer stream-stream join improvement
> --
>
> Key: KAFKA-12909
> URL: https://issues.apache.org/jira/browse/KAFKA-12909
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Matthias J. Sax
> Assignee: Matthias J. Sax
> Priority: Blocker
> Fix For: 3.1.0
>
>
> https://issues.apache.org/jira/browse/KAFKA-10847 improves left/outer
> stream-stream join, by not emitting left/outer results eagerly, but only
> after the grace period has passed.
> While this change is desired, there is an issue with regard to upgrades: if
> users don't specify a grace period, we fall back to a 24h default. Thus,
> left/outer join results would only be emitted 24h after the join window end.
> This change in behavior could break existing applications when upgrading to
> the 3.0.0 release. – And even if users do set a grace period explicitly, it's
> still unclear if the new delayed output behavior would work for them.
> Thus, we propose to disable the fix of KAFKA-10847 by default, and let users
> opt into the fix explicitly instead.
> To allow users to enable the fix, we want to piggy-back on KIP-633
> (https://issues.apache.org/jira/browse/KAFKA-8613) that deprecated the
> existing `JoinWindows.of()` and `JoinWindows#grace()` methods in favor of
> `JoinWindows.ofSizeAndGrace()` – if users don't update their code, we would
> keep the fix disabled, and thus, if users upgrade their app nothing changes.
> Only if users switch to the new `ofSizeAndGrace()` API, we enable the fix and
> thus give users the opportunity to opt in explicitly and pick an appropriate
> grace period for their application.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
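
For illustration, the difference between eager and deferred emission in the comment's example can be sketched as a toy simulation in plain Java. This is not Kafka Streams code and makes several assumptions: the key and values (k, a, b, c) are hypothetical, since the original records are not legible in this archive; the input timestamps (left at 95, right at 100 and 105) are inferred so that the results carry the timestamps 95, 100 and 105 shown in the comment; right records are assumed to arrive after the left one; and grace periods and stream time are ignored, so "deferred" simply means "decide about the unmatched result only after all input has been seen".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LeftJoinEmitSketch {

    /** Hypothetical record type: key, value, event timestamp. */
    public static final class Rec {
        final String key;
        final String value;
        final long ts;

        public Rec(String key, String value, long ts) {
            this.key = key;
            this.value = value;
            this.ts = ts;
        }
    }

    /** Formats one join result; its timestamp is the larger input timestamp. */
    static String result(Rec left, Rec right) {
        String rightValue = (right == null) ? "null" : right.value;
        long ts = (right == null) ? left.ts : Math.max(left.ts, right.ts);
        return "<" + left.key + "," + left.value + "-" + rightValue + "," + ts + ">";
    }

    /**
     * Joins one left record against right records that arrive later.
     * eager = true : emit the null-padded result immediately (old behavior).
     * eager = false: emit the null-padded result only if nothing matched.
     */
    public static List<String> leftJoin(Rec left, List<Rec> rights, long window, boolean eager) {
        List<String> out = new ArrayList<>();
        if (eager) {
            // No right record has been seen yet, so a left-null result goes out.
            out.add(result(left, null));
        }
        boolean matched = false;
        for (Rec right : rights) {
            boolean sameKey = right.key.equals(left.key);
            boolean inWindow = Math.abs(right.ts - left.ts) <= window;
            if (sameKey && inWindow) {
                matched = true;
                out.add(result(left, right));
            }
        }
        if (!eager && !matched) {
            // The window ended without a match: the left-null result is correct.
            out.add(result(left, null));
        }
        return out;
    }

    public static void main(String[] args) {
        Rec left = new Rec("k", "a", 95);
        List<Rec> rights = Arrays.asList(new Rec("k", "b", 100), new Rec("k", "c", 105));
        // prints: eager:    [<k,a-null,95>, <k,a-b,100>, <k,a-c,105>]
        System.out.println("eager:    " + leftJoin(left, rights, 10, true));
        // prints: deferred: [<k,a-b,100>, <k,a-c,105>]
        System.out.println("deferred: " + leftJoin(left, rights, 10, false));
    }
}
```

In the eager run the first result is the spurious one the comment describes: a downstream consumer cannot tell that the second record supersedes it while the third record is an additional result. The deferred run never emits it, at the cost of waiting until the window can no longer receive matches.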