[ https://issues.apache.org/jira/browse/KAFKA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17521985#comment-17521985 ]
Matthias J. Sax commented on KAFKA-12909:
-----------------------------------------

This ticket is specifically about left/outer joins, and the "emit at window close" strategy applies only to non-matching records. I.e., even for a left/outer join, all _inner_ join results of the operation are emitted right away. However, it is not "safe" to emit a left/outer join result eagerly, because the record might still find a join partner later. Thus, we need to delay emitting "un-joined" records until the grace period has passed, to ensure we compute the correct result. In the old implementation, we basically did not compute the correct left/outer join result, but a superset of it.

Your DB argument does not really apply, because the result is a KStream and thus we should only emit _final_ results. If we emit <k,<v1,null>> eagerly and <k,<v1,v2>> later, the second one is _not_ an update to the first one (a KStream has no update semantics). Otherwise, we would need to treat all results with the same key as _updates_, but if a record joins twice, the second join result is also not an update to the first one.

Does this make sense?

> Allow users to opt-into spurious left/outer stream-stream join improvement
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-12909
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12909
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Matthias J. Sax
>            Priority: Blocker
>             Fix For: 3.1.0
>
>
> https://issues.apache.org/jira/browse/KAFKA-10847 improves left/outer
> stream-stream join, by not emitting left/outer results eagerly, but only
> after the grace period has passed.
> While this change is desired, there is an issue with regard to upgrades: if
> users don't specify a grace period, we fall back to a 24h default. Thus,
> left/outer join results would only be emitted 24h after the join window end.
> This change in behavior could break existing applications when upgrading to
> the 3.0.0 release. And even if users do set a grace period explicitly, it's
> still unclear if the new delayed output behavior would work for them.
> Thus, we propose to disable the fix of KAFKA-10847 by default, and let users
> opt into the fix explicitly instead.
> To allow users to enable the fix, we want to piggy-back on KIP-633
> (https://issues.apache.org/jira/browse/KAFKA-8613) that deprecated the
> existing `JoinWindows.of()` and `JoinWindows#grace()` methods in favor of
> `JoinWindows.ofSizeAndGrace()`. If users don't update their code, we would
> keep the fix disabled, and thus, if users upgrade their app nothing changes.
> Only if users switch to the new `ofSizeAndGrace()` API, we enable the fix and
> thus give users the opportunity to opt in explicitly and pick an appropriate
> grace period for their application.
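
To make the difference between eager and delayed emission concrete, here is a minimal, self-contained sketch in plain Java with no Kafka dependency. It is not the Kafka Streams implementation: `LeftJoinSketch`, `Rec`, and both join methods are hypothetical names made up for illustration. It assumes records arrive in timestamp order, that stream time is the largest timestamp seen so far, and that an un-joined left record may be emitted once stream time exceeds its timestamp plus window size plus grace period.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; not part of Kafka Streams.
final class LeftJoinSketch {

    /** One input record; `left` tells which input stream it arrived on. */
    public static final class Rec {
        final boolean left; final String key; final String value; final long ts;
        public Rec(boolean left, String key, String value, long ts) {
            this.left = left; this.key = key; this.value = value; this.ts = ts;
        }
    }

    /** Two records join if they come from opposite sides, share a key, and are within the window. */
    static boolean joins(Rec a, Rec b, long windowMs) {
        return a.left != b.left && a.key.equals(b.key) && Math.abs(a.ts - b.ts) <= windowMs;
    }

    static String joined(Rec a, Rec b) {
        Rec l = a.left ? a : b;
        Rec r = a.left ? b : a;
        return l.key + ":<" + l.value + "," + r.value + ">";
    }

    static String unjoined(Rec l) {
        return l.key + ":<" + l.value + ",null>";
    }

    /** Old behavior: a left record without a partner is emitted immediately. */
    public static List<String> eagerLeftJoin(List<Rec> input, long windowMs) {
        List<Rec> seen = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (Rec r : input) {
            boolean matched = false;
            for (Rec other : seen) {
                if (joins(r, other, windowMs)) { out.add(joined(r, other)); matched = true; }
            }
            if (r.left && !matched) out.add(unjoined(r)); // may turn out to be spurious
            seen.add(r);
        }
        return out;
    }

    /** New behavior: un-joined left records are held back until window end plus grace has passed. */
    public static List<String> delayedLeftJoin(List<Rec> input, long windowMs, long graceMs) {
        List<Rec> seen = new ArrayList<>();
        List<Rec> pending = new ArrayList<>(); // left records with no partner so far
        List<String> out = new ArrayList<>();
        long streamTime = Long.MIN_VALUE;
        for (Rec r : input) {
            streamTime = Math.max(streamTime, r.ts);
            // Emit pending left records that can no longer find a partner.
            for (Iterator<Rec> it = pending.iterator(); it.hasNext();) {
                Rec p = it.next();
                if (p.ts + windowMs + graceMs < streamTime) { out.add(unjoined(p)); it.remove(); }
            }
            boolean matched = false;
            for (Rec other : seen) {
                if (joins(r, other, windowMs)) {
                    out.add(joined(r, other)); // inner results are still emitted right away
                    pending.remove(other);     // a partner showed up: no null result for it
                    matched = true;
                }
            }
            if (r.left && !matched) pending.add(r);
            seen.add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> input = List.of(
            new Rec(true, "k", "v1", 0),
            new Rec(false, "k", "v2", 5),
            new Rec(true, "k", "v3", 100),
            new Rec(false, "other", "x", 200)); // only advances stream time
        System.out.println(eagerLeftJoin(input, 10));
        System.out.println(delayedLeftJoin(input, 10, 5));
    }
}
```

With this input, the eager variant emits <k,<v1,null>> before <k,<v1,v2>>, which is the spurious result discussed above. The delayed variant emits only <k,<v1,v2>> for v1, and emits <k,<v3,null>> once the record at timestamp 200 advances stream time past the end of v3's window plus grace.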