[jira] [Comment Edited] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835506#comment-17835506 ]

Florin Akermann edited comment on KAFKA-12317 at 4/9/24 6:50 PM:
-----------------------------------------------------------------

[~mjsax] [https://github.com/apache/kafka/pull/15689]

I pushed fixes for the left/outer stream-stream join documentation. The documentation for the stream/table left/foreign-key joins has already been updated in the original PR.

Note that the table docs still state: "Input records for the stream with a null value are ignored and do not trigger the join." As far as I can see, this is the correct description of the behavior for the stream-table operators in question.

I plan on creating a separate PR for the join semantics table, as those are not incorrect per se.

was (Author: aki):
[~mjsax] https://github.com/apache/kafka/pull/15689

I pushed fixes for the left/outer stream-stream join documentation. The documentation for the stream/table left/foreign-key joins has already been updated in the original PR.

Note that the table docs still state: "Input records for the stream with a null value are ignored and do not trigger the join." As far as I can see, this is the correct description of the behavior for the stream-table operators in question.

I plan on creating a separate PR for the join semantics table.

> Relax non-null key requirement for left/outer KStream joins
> -----------------------------------------------------------
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Matthias J. Sax
> Assignee: Florin Akermann
> Priority: Major
> Labels: kip
> Fix For: 3.7.0
>
> Currently, for a stream-stream and stream-table/globalTable join,
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key
> for stream-globalTable), because for a `null`-(join)key the join is
> undefined: i.e., we don't have an attribute to do the table lookup (we
> consider the stream record as malformed). Note that we define the semantics
> of _left/outer_ join as: keep the stream record if no matching join record
> was found.
> We could relax the definition of _left_ stream-table/globalTable and
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream
> records, and call the ValueJoiner with a `null` "other-side" value instead:
> if the stream record key (or join-key) is `null`, we could treat it as a
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users that want to keep the current behavior can add
> a `filter()` before the join to drop `null`-(join)key records from the stream
> explicitly.
> Note that this change also requires changing the behavior if we insert a
> repartition topic before the join: currently, we drop `null`-key records
> before writing into the repartition topic (as we know they would be dropped
> later anyway). We need to relax this behavior for a left stream-table and
> left/outer stream-stream join. Users need to be aware (i.e., we might need to
> put this into the docs and JavaDocs) that records with a `null`-key would be
> partitioned randomly.
> KIP-962:
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
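The relaxed semantics proposed in KAFKA-12317 can be illustrated with a small, self-contained sketch. This is a toy model in plain Java and not Kafka Streams code: the class `NullKeyLeftJoinSketch`, the `Rec` type, the `relaxed` flag, and the helper `dropNullKeys` are all hypothetical names introduced only for illustration. It assumes a left stream-table join where a `null` key is treated as a failed lookup, and shows how an explicit filter before the join restores the old drop behavior.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

/** Toy model of a left stream-table join; NOT Kafka Streams code. */
final class NullKeyLeftJoinSketch {

    /** A stream record whose key may be null (hypothetical type). */
    static final class Rec {
        final String key;
        final String value;
        Rec(String key, String value) { this.key = key; this.value = value; }
    }

    /**
     * Left-joins each stream record against the table.
     * relaxed == false: null-key records are dropped (the old behavior).
     * relaxed == true:  a null key counts as a failed lookup, so the joiner
     *                   is called with null as the "other-side" value.
     */
    static List<String> leftJoin(List<Rec> stream, Map<String, String> table,
                                 BiFunction<String, String, String> joiner,
                                 boolean relaxed) {
        List<String> out = new ArrayList<>();
        for (Rec r : stream) {
            if (r.key == null && !relaxed) {
                continue; // old behavior: record is considered malformed
            }
            String other = (r.key == null) ? null : table.get(r.key);
            out.add(joiner.apply(r.value, other));
        }
        return out;
    }

    /** Models the explicit filter() a user could add before the join. */
    static List<Rec> dropNullKeys(List<Rec> stream) {
        List<Rec> kept = new ArrayList<>();
        for (Rec r : stream) {
            if (r.key != null) kept.add(r);
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("a", "A");
        List<Rec> stream = List.of(new Rec("a", "1"), new Rec(null, "2"), new Rec("b", "3"));
        BiFunction<String, String, String> joiner = (left, right) -> left + "+" + right;

        System.out.println(leftJoin(stream, table, joiner, true));               // [1+A, 2+null, 3+null]
        System.out.println(leftJoin(dropNullKeys(stream), table, joiner, true)); // [1+A, 3+null]
    }
}
```

In the Kafka Streams DSL, the filter step would presumably correspond to something like `stream.filter((k, v) -> k != null)` placed before `leftJoin(...)`, as the issue description suggests.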
[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835506#comment-17835506 ]

Florin Akermann commented on KAFKA-12317:
-----------------------------------------

[~mjsax] https://github.com/apache/kafka/pull/15689

I pushed fixes for the left/outer stream-stream join documentation. The documentation for the stream/table left/foreign-key joins has already been updated in the original PR.

Note that the table docs still state: "Input records for the stream with a null value are ignored and do not trigger the join." As far as I can see, this is the correct description of the behavior for the stream-table operators in question.

I plan on creating a separate PR for the join semantics table.
[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818138#comment-17818138 ]

Florin Akermann commented on KAFKA-12317:
-----------------------------------------

Thanks for the flag!

> Would you be interested to do a follow up PR to update the docs?

Yes.
[jira] [Comment Edited] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join
[ https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818136#comment-17818136 ]

Florin Akermann edited comment on KAFKA-14049 at 2/17/24 9:14 AM:
------------------------------------------------------------------

[~mjsax] Yes. The behavior of the operator changed as part of KIP-962. The following behavior has been asserted.
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]

was (Author: aki):
[~mjsax] Yes. The behavior of the operator changed as part of KIP-962. The following behavior has been asserted
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]

> Relax Non Null Requirement for KStreamGlobalKTable Left Join
> ------------------------------------------------------------
>
> Key: KAFKA-14049
> URL: https://issues.apache.org/jira/browse/KAFKA-14049
> Project: Kafka
> Issue Type: Improvement
> Components: streams
> Reporter: Saumya Gupta
> Assignee: Florin Akermann
> Priority: Major
>
> Null values in the stream for a left join would indicate a tombstone message
> that needs to be propagated if not actually joined with the GlobalKTable
> message; hence these messages should not be ignored.
[jira] [Comment Edited] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join
[ https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818136#comment-17818136 ]

Florin Akermann edited comment on KAFKA-14049 at 2/17/24 9:14 AM:
------------------------------------------------------------------

[~mjsax] Yes. The behavior of the operator changed as part of KIP-962. The following behavior has been asserted
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]

was (Author: aki):
[~mjsax] Yes. The following behavior has been asserted as part of KIP-962
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]
[jira] [Comment Edited] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join
[ https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818136#comment-17818136 ]

Florin Akermann edited comment on KAFKA-14049 at 2/17/24 9:13 AM:
------------------------------------------------------------------

[~mjsax] Yes. The following behavior has been asserted as part of KIP-962
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]

was (Author: aki):
[~mjsax] Yes. The following behavior has been asserted as part of KIP-962
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]

So I'd say we can close this item.
[jira] [Resolved] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join
[ https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Florin Akermann resolved KAFKA-14049.
-------------------------------------
Resolution: Fixed
[jira] [Commented] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join
[ https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818136#comment-17818136 ]

Florin Akermann commented on KAFKA-14049:
-----------------------------------------

[~mjsax] Yes. The following behavior has been asserted as part of KIP-962
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]

So I'd say we can close this item.
[jira] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.
[ https://issues.apache.org/jira/browse/KAFKA-16123 ]

Florin Akermann deleted comment on KAFKA-16123:
-----------------------------------------------

was (Author: aki):
I now committed a 'generalized fix' proposal

> KStreamKStreamJoinProcessor does not drop late records.
> -------------------------------------------------------
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Florin Akermann
> Assignee: Florin Akermann
> Priority: Major
>
> Issue illustration: [https://github.com/apache/kafka/pull/15314/files]
> Suggested fix: https://github.com/apache/kafka/pull/15189
[jira] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.
[ https://issues.apache.org/jira/browse/KAFKA-16123 ]

Florin Akermann deleted comment on KAFKA-16123:
-----------------------------------------------

was (Author: aki):
This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962 for keyed records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.
[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Florin Akermann updated KAFKA-16123:
------------------------------------
Description:
Issue illustration: [https://github.com/apache/kafka/pull/15314/files]
Suggested fix: https://github.com/apache/kafka/pull/15189

was:
Issue illustration: [https://github.com/apache/kafka/pull/15314/files]
[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Florin Akermann updated KAFKA-16123:
------------------------------------
Description:
Issue illustration: [https://github.com/apache/kafka/pull/15314/files]

was:
As part of KIP-962, the non-null key requirements have been relaxed for left and outer joins.
However, the implementation forwards null-key records for left/outer joins regardless of the join window.
Old title:
KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window
[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Florin Akermann updated KAFKA-16123:
------------------------------------
Summary: KStreamKStreamJoinProcessor does not drop late records. (was: KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.)

> KStreamKStreamJoinProcessor does not drop late records.
> -------------------------------------------------------
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Reporter: Florin Akermann
> Assignee: Florin Akermann
> Priority: Major
>
> As part of KIP-962, the non-null key requirements have been relaxed for left
> and outer joins.
> However, the implementation forwards null-key records for left/outer joins
> regardless of the join window.
> Old title:
> KStreamKStreamJoinProcessor forwards null-key records for left/outer joins
> unconditionally of the join window
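To make concrete what "dropping late records" in KAFKA-16123 refers to, here is a minimal sketch in plain Java. It is a toy model and not the `KStreamKStreamJoinProcessor` implementation: the class name `LateRecordDropSketch`, its fields, and the exact lateness rule (a record is late once its timestamp falls more than window size plus grace period behind the highest timestamp seen so far) are assumptions made for illustration. The point it shows is that the lateness check depends only on timestamps, so it would apply to null-key and keyed records alike.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of late-record dropping in a windowed join; NOT Kafka Streams code. */
final class LateRecordDropSketch {
    private final long windowMs; // hypothetical join window size
    private final long graceMs;  // hypothetical grace period
    private long streamTime = Long.MIN_VALUE; // highest timestamp observed so far

    LateRecordDropSketch(long windowMs, long graceMs) {
        this.windowMs = windowMs;
        this.graceMs = graceMs;
    }

    /** Advances stream time, then reports whether the record is still within window + grace. */
    boolean accept(long timestampMs) {
        streamTime = Math.max(streamTime, timestampMs);
        return timestampMs >= streamTime - windowMs - graceMs;
    }

    /** Returns the timestamps that get processed; late ones are dropped, whatever their key. */
    List<Long> process(List<Long> timestampsMs) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsMs) {
            if (accept(ts)) kept.add(ts);
        }
        return kept;
    }

    public static void main(String[] args) {
        LateRecordDropSketch join = new LateRecordDropSketch(10, 5);
        // Stream time reaches 120, so anything older than 120 - 10 - 5 = 105 is late.
        System.out.println(join.process(List.of(100L, 120L, 104L, 106L))); // [100, 120, 106]
    }
}
```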
[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Florin Akermann updated KAFKA-16123:
------------------------------------
Description:
As part of KIP-962, the non-null key requirements have been relaxed for left and outer joins.
However, the implementation forwards null-key records for left/outer joins regardless of the join window.
Old title:
KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window

was:
As part of KIP-962, the non-null key requirements have been relaxed for left and outer joins.
However, the implementation forwards null-key records for left/outer joins regardless of the join window.
[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816355#comment-17816355 ]

Florin Akermann edited comment on KAFKA-16123 at 2/10/24 10:18 PM:
-------------------------------------------------------------------

I now committed a 'generalized fix' proposal

was (Author: aki):
I now committed a 'generalized fix'
[jira] [Commented] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816355#comment-17816355 ]

Florin Akermann commented on KAFKA-16123:
-----------------------------------------

I now committed a 'generalized fix'
[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001 ]

Florin Akermann edited comment on KAFKA-16123 at 2/10/24 10:46 AM:
-------------------------------------------------------------------

This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962 for keyed records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.

was (Author: aki):
This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962 for non keyed records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.
[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001 ]

Florin Akermann edited comment on KAFKA-16123 at 2/10/24 10:45 AM:
-------------------------------------------------------------------

This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962 for non null-key records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.

was (Author: aki):
This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962 for non null-key records.

See [https://github.com/apache/kafka/pull/15314/files]
[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001 ]

Florin Akermann edited comment on KAFKA-16123 at 2/10/24 10:46 AM:
-------------------------------------------------------------------

This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962 for non keyed records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.

was (Author: aki):
This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962 for non null-key records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.
[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001 ]

Florin Akermann edited comment on KAFKA-16123 at 2/3/24 11:21 PM:
------------------------------------------------------------------

This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962 for non null-key records.

See [https://github.com/apache/kafka/pull/15314/files]

was (Author: aki):
This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]
[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Florin Akermann updated KAFKA-16123:
------------------------------------
Summary: KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window. (was: KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.)
[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001 ]

Florin Akermann edited comment on KAFKA-16123 at 2/3/24 11:19 PM:
------------------------------------------------------------------

This might actually be a general issue and not just for null-key records?

In other words, the problem already existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]

was (Author: aki):
This might actually be a general issue and not just for null-key records?

In other words, the problem existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]
[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001 ]

Florin Akermann edited comment on KAFKA-16123 at 2/3/24 11:19 PM:
------------------------------------------------------------------

This might actually be a general issue and not just for null-key records?

In other words, the problem existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]

was (Author: aki):
This might actually be a general issue and not just for null-key records?

In other words the problem existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]
[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001 ] Florin Akermann edited comment on KAFKA-16123 at 2/3/24 11:18 PM: -- This might actually be a general issue and not just for null-key records? In other words the problem existed prior to KIP-962. See [https://github.com/apache/kafka/pull/15314/files] was (Author: aki): This might actually be a general issue and not just for null-key records? In other words the problem existed prior to KIP-962. See [https://github.com/apache/kafka/pull/15314/files] > KStreamKStreamJoinProcessor forwards null records for left/outer joins > unconditionally of the join window. > -- > > Key: KAFKA-16123 > URL: https://issues.apache.org/jira/browse/KAFKA-16123 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Florin Akermann >Assignee: Florin Akermann >Priority: Major > > As part of KIP-962 the non-null key requirements have been relaxed for left > and outer joins. > However, the implementation forwards null records for left/outer joins > unconditionally of the join window. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann updated KAFKA-16123: Description: As part of KIP-962 the non-null key requirements have been relaxed for left and outer joins. However, the implementation forwards null-key records for left/outer joins unconditionally of the join window. was: As part of KIP-962 the non-null key requirements have been relaxed for left and outer joins. However, the implementation forwards null records for left/outer joins unconditionally of the join window. > KStreamKStreamJoinProcessor forwards null records for left/outer joins > unconditionally of the join window. > -- > > Key: KAFKA-16123 > URL: https://issues.apache.org/jira/browse/KAFKA-16123 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Florin Akermann >Assignee: Florin Akermann >Priority: Major > > As part of KIP-962 the non-null key requirements have been relaxed for left > and outer joins. > However, the implementation forwards null-key records for left/outer joins > unconditionally of the join window. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001 ] Florin Akermann commented on KAFKA-16123: - This might actually be a general issue and not just for null-key records? In other words the problem existed prior to KIP-962. See [https://github.com/apache/kafka/pull/15314/files] > KStreamKStreamJoinProcessor forwards null records for left/outer joins > unconditionally of the join window. > -- > > Key: KAFKA-16123 > URL: https://issues.apache.org/jira/browse/KAFKA-16123 > Project: Kafka > Issue Type: Bug > Components: streams >Reporter: Florin Akermann >Assignee: Florin Akermann >Priority: Major > > As part of KIP-962 the non-null key requirements have been relaxed for left > and outer joins. > However, the implementation forwards null records for left/outer joins > unconditionally of the join window. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.
[ https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann updated KAFKA-16123: Description: As part of KIP-962 the non-null key requirements have been relaxed for left and outer joins. However, the implementation forwards null records for left/outer joins unconditionally of the join window. was: As part of KIP-962 the non-null key requirements have been relaxed for left and outer joins. However, the implementation forwards null records for left/outer joins unconditionally of the join window. > KStreamKStreamJoinProcessor forwards null records for left/outer joins > unconditionally of the join window. > -- > > Key: KAFKA-16123 > URL: https://issues.apache.org/jira/browse/KAFKA-16123 > Project: Kafka > Issue Type: Bug >Reporter: Florin Akermann >Assignee: Florin Akermann >Priority: Major > > As part of KIP-962 the non-null key requirements have been relaxed for left > and outer joins. > However, the implementation forwards null records for left/outer joins > unconditionally of the join window. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.
Florin Akermann created KAFKA-16123: --- Summary: KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window. Key: KAFKA-16123 URL: https://issues.apache.org/jira/browse/KAFKA-16123 Project: Kafka Issue Type: Bug Reporter: Florin Akermann Assignee: Florin Akermann As part of KIP-962 the non-null key requirements have been relaxed for left and outer joins. However, the implementation forwards null records for left/outer joins unconditionally of the join window. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann resolved KAFKA-12317. - Resolution: Fixed > Relax non-null key requirement for left/outer KStream joins > --- > > Key: KAFKA-12317 > URL: https://issues.apache.org/jira/browse/KAFKA-12317 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > Labels: kip > > Currently, for a stream-stream and stream-table/globalTable join > KafkaStreams drops all stream records with a `null`-key (`null`-join-key > for stream-globalTable), because for a `null`-(join)key the join is > undefined: i.e., we don't have an attribute to do the table lookup (we > consider the stream-record as malformed). Note that we define the semantics > of _left/outer_ join as: keep the stream record if no matching join record > was found. > We could relax the definition of _left_ stream-table/globalTable and > _left/outer_ stream-stream join though, and not drop `null`-(join)key stream > records, and call the ValueJoiner with a `null` "other-side" value instead: > if the stream record key (or join-key) is `null`, we could treat it as a > "failed lookup" instead of treating the stream record as corrupted. > If we make this change, users who want to keep the current behavior can add > a `filter()` before the join to drop `null`-(join)key records from the stream > explicitly. > Note that this change also requires changing the behavior if we insert a > repartition topic before the join: currently, we drop `null`-key records > before writing into the repartition topic (as we know they would be dropped > later anyway). We need to relax this behavior for a left stream-table and > left/outer stream-stream join. Users need to be aware (i.e., we might need to > put this into the docs and JavaDocs) that records with a `null`-key would be > partitioned randomly. 
> KIP-962: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
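The relaxed left-join semantics described in the issue above can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions: the class `RelaxedLeftJoinSketch`, the `Rec` type, and the `leftJoin` and `dropNullKeys` helpers are hypothetical names invented for illustration and are not part of the Kafka Streams API. In a real topology the second helper would roughly correspond to adding a `filter()` before the join, as the description suggests.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.stream.Collectors;

public class RelaxedLeftJoinSketch {

    // Hypothetical stand-in for a stream record; not a Kafka Streams class.
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Models the relaxed left join: a null key is treated as a failed lookup,
    // so the joiner is still called, with a null "other-side" value.
    static List<String> leftJoin(List<Rec> stream,
                                 Map<String, String> table,
                                 BiFunction<String, String, String> joiner) {
        List<String> out = new ArrayList<>();
        for (Rec r : stream) {
            String other = (r.key == null) ? null : table.get(r.key);
            out.add(joiner.apply(r.value, other));
        }
        return out;
    }

    // Models the explicit filter a user can add to keep the previous behavior
    // (null-key records never reach the join).
    static List<Rec> dropNullKeys(List<Rec> stream) {
        return stream.stream()
                .filter(r -> r.key != null)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("a", "A");

        List<Rec> stream = new ArrayList<>();
        stream.add(new Rec("a", "1"));   // key present in the table
        stream.add(new Rec(null, "2"));  // null key
        stream.add(new Rec("b", "3"));   // key missing from the table

        BiFunction<String, String, String> joiner = (left, right) -> left + "|" + right;

        // Relaxed semantics: the null-key record is joined with a null right value.
        System.out.println(leftJoin(stream, table, joiner));
        // Previous semantics, restored by filtering null keys up front.
        System.out.println(leftJoin(dropNullKeys(stream), table, joiner));
    }
}
```

The model deliberately ignores windows, repartitioning, and serialization; it only shows that a `null` key and a missing table entry end up on the same code path.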
[jira] [Resolved] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann resolved KAFKA-14748. - Resolution: Fixed > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > Labels: kip > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. > KIP-962: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
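The foreign-key case described above can be modelled the same way. The following is a minimal plain-Java sketch, not the Kafka Streams implementation: `FkLeftJoinSketch` and `fkJoin` are hypothetical names, and the tables are simplified to in-memory maps of strings. It shows the proposed difference between an inner and a left FK-join when the extractor returns `null`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

public class FkLeftJoinSketch {

    // Joins each left-table row to the right table via an extracted foreign key.
    // When the extractor returns null, an inner join drops the row, while a
    // left join keeps it and passes a null right-hand side to the joiner.
    static Map<String, String> fkJoin(Map<String, String> left,
                                      Map<String, String> right,
                                      Function<String, String> foreignKeyExtractor,
                                      BiFunction<String, String, String> joiner,
                                      boolean leftJoin) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : left.entrySet()) {
            String fk = foreignKeyExtractor.apply(e.getValue());
            String other = (fk == null) ? null : right.get(fk);
            if (other != null || leftJoin) {
                result.put(e.getKey(), joiner.apply(e.getValue(), other));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Left table: order id -> customer id ("" means the field is absent).
        Map<String, String> orders = new LinkedHashMap<>();
        orders.put("o1", "c1");
        orders.put("o2", "");

        // Right table: customer id -> customer name.
        Map<String, String> customers = new LinkedHashMap<>();
        customers.put("c1", "Alice");

        Function<String, String> extractor = v -> v.isEmpty() ? null : v;
        BiFunction<String, String, String> joiner = (order, customer) -> order + "|" + customer;

        System.out.println(fkJoin(orders, customers, extractor, joiner, false)); // inner join
        System.out.println(fkJoin(orders, customers, extractor, joiner, true));  // left join
    }
}
```

Only the null-foreign-key branch is of interest here; subscription handling and updates to either table are left out.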
[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746242#comment-17746242 ] Florin Akermann edited comment on KAFKA-14748 at 8/13/23 10:49 AM: --- Hi, I will have a go at this. I hope that's ok. was (Author: aki): Hi, I will have a go at this. I hope that's ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > Labels: kip > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. > KIP-962: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746244#comment-17746244 ] Florin Akermann edited comment on KAFKA-12317 at 8/13/23 10:48 AM: --- Hi, I will have a go at this. I hope that's ok. was (Author: aki): Hi, I will have a go at this. I hope that's ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. > Relax non-null key requirement for left/outer KStream joins > --- > > Key: KAFKA-12317 > URL: https://issues.apache.org/jira/browse/KAFKA-12317 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > Labels: kip > > Currently, for a stream-stream and stream-table/globalTable join > KafkaStreams drops all stream records with a `null`-key (`null`-join-key > for stream-globalTable), because for a `null`-(join)key the join is > undefined: i.e., we don't have an attribute to do the table lookup (we > consider the stream-record as malformed). Note that we define the semantics > of _left/outer_ join as: keep the stream record if no matching join record > was found. > We could relax the definition of _left_ stream-table/globalTable and > _left/outer_ stream-stream join though, and not drop `null`-(join)key stream > records, and call the ValueJoiner with a `null` "other-side" value instead: > if the stream record key (or join-key) is `null`, we could treat it as a > "failed lookup" instead of treating the stream record as corrupted. > If we make this change, users who want to keep the current behavior can add > a `filter()` before the join to drop `null`-(join)key records from the stream > explicitly. > Note that this change also requires changing the behavior if we insert a > repartition topic before the join: currently, we drop `null`-key records > before writing into the repartition topic (as we know they would be dropped > later anyway). 
We need to relax this behavior for a left stream-table and > left/outer stream-stream join. Users need to be aware (i.e., we might need to > put this into the docs and JavaDocs) that records with a `null`-key would be > partitioned randomly. > KIP-962: > [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams] > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13197) KStream-GlobalKTable join semantics don't match documentation
[ https://issues.apache.org/jira/browse/KAFKA-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17753349#comment-17753349 ] Florin Akermann commented on KAFKA-13197: - [~twbecker], thanks for raising this. The documentation has been fixed. KIP-962 will allow for the old behavior again: 'no longer drop records when KeyValueMapper returns 'null' and call ValueJoiner with 'null' for right value'. > KStream-GlobalKTable join semantics don't match documentation > - > > Key: KAFKA-13197 > URL: https://issues.apache.org/jira/browse/KAFKA-13197 > Project: Kafka > Issue Type: Bug > Components: documentation, streams >Affects Versions: 2.7.0 >Reporter: Tommy Becker >Assignee: Florin Akermann >Priority: Major > Fix For: 3.6.0, 3.5.2 > > > As part of KAFKA-10277, the behavior of KStream-GlobalKTable joins was > changed. It appears the change was intended to merely relax a requirement but > it actually broke backwards compatibility. Although it does allow {{null}} > keys and values in the KStream to be joined, it now excludes {{null}} results > of the {{KeyValueMapper}}. We have an application which can return {{null}} > from the {{KeyValueMapper}} for non-null keys in the KStream, and relies on > these nulls being passed to the {{ValueJoiner}}. Indeed the javadoc still > explicitly says this is done: > {quote}If a KStream input record key or value is null the record will not be > included in the join operation and thus no output record will be added to the > resulting KStream. > If keyValueMapper returns null implying no match exists, a null value will > be provided to ValueJoiner. > {quote} > Both these statements are incorrect. > I think the new behavior is worse than the previous/documented behavior. It > feels more reasonable to have a non-null stream record map to a null join key > (our use-case is event-enhancement where the incoming record doesn't have the > join field), than the reverse. 
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join
[ https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752738#comment-17752738 ] Florin Akermann commented on KAFKA-14049: - Same for me. I simply assigned it to myself as it was linked to https://issues.apache.org/jira/browse/KAFKA-12317. [~SAUMYAG] could you elaborate? Otherwise, I suggest closing this issue once https://issues.apache.org/jira/browse/KAFKA-12317 is resolved. > Relax Non Null Requirement for KStreamGlobalKTable Left Join > > > Key: KAFKA-14049 > URL: https://issues.apache.org/jira/browse/KAFKA-14049 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Saumya Gupta >Assignee: Florin Akermann >Priority: Major > > Null Values in the Stream for a Left Join would indicate a Tombstone Message > that needs to be propagated if not actually joined with the GlobalKTable > message, hence these messages should not be ignored. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752407#comment-17752407 ] Florin Akermann commented on KAFKA-12317: - Drafted a PR: [https://github.com/apache/kafka/pull/14174] > Relax non-null key requirement for left/outer KStream joins > --- > > Key: KAFKA-12317 > URL: https://issues.apache.org/jira/browse/KAFKA-12317 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Currently, for a stream-stream and stream-table/globalTable join > KafkaStreams drops all stream records with a `null`-key (`null`-join-key for > stream-globalTable), because for a `null`-(join)key the join is undefined: > i.e., we don't have an attribute to do the table lookup (we consider the > stream-record as malformed). Note that we define the semantics of > _left/outer_ join as: keep the stream record if no matching join record was > found. > We could relax the definition of _left_ stream-table/globalTable and > _left/outer_ stream-stream join though, and not drop `null`-(join)key stream > records, and call the ValueJoiner with a `null` "other-side" value instead: > if the stream record key (or join-key) is `null`, we could treat it as a > "failed lookup" instead of treating the stream record as corrupted. > If we make this change, users who want to keep the current behavior can add > a `filter()` before the join to drop `null`-(join)key records from the stream > explicitly. > Note that this change also requires changing the behavior if we insert a > repartition topic before the join: currently, we drop `null`-key records > before writing into the repartition topic (as we know they would be dropped > later anyway). We need to relax this behavior for a left stream-table and > left/outer stream-stream join. 
Users need to be aware (i.e., we might need to > put this into the docs and JavaDocs) that records with a `null`-key would be > partitioned randomly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17749322#comment-17749322 ] Florin Akermann commented on KAFKA-12317: - [~guozhang] [~mjsax] KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams > Relax non-null key requirement for left/outer KStream joins > --- > > Key: KAFKA-12317 > URL: https://issues.apache.org/jira/browse/KAFKA-12317 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Currently, for a stream-stream and stream-table/globalTable join > KafkaStreams drops all stream records with a `null`-key (`null`-join-key for > stream-globalTable), because for a `null`-(join)key the join is undefined: > i.e., we don't have an attribute to do the table lookup (we consider the > stream-record as malformed). Note that we define the semantics of > _left/outer_ join as: keep the stream record if no matching join record was > found. > We could relax the definition of _left_ stream-table/globalTable and > _left/outer_ stream-stream join though, and not drop `null`-(join)key stream > records, and call the ValueJoiner with a `null` "other-side" value instead: > if the stream record key (or join-key) is `null`, we could treat it as a > "failed lookup" instead of treating the stream record as corrupted. > If we make this change, users who want to keep the current behavior can add > a `filter()` before the join to drop `null`-(join)key records from the stream > explicitly. > Note that this change also requires changing the behavior if we insert a > repartition topic before the join: currently, we drop `null`-key records > before writing into the repartition topic (as we know they would be dropped > later anyway). We need to relax this behavior for a left stream-table and > left/outer stream-stream join. 
Users need to be aware (i.e., we might need to > put this into the docs and JavaDocs) that records with a `null`-key would be > partitioned randomly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17749325#comment-17749325 ] Florin Akermann commented on KAFKA-14748: - Thank you [~guozhang]. Here is the KIP: [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams] > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747670#comment-17747670 ] Florin Akermann edited comment on KAFKA-14748 at 7/29/23 12:55 PM: --- I have created a PR for this particular issue: foreignKeyExtractor only. [https://github.com/apache/kafka/pull/14107] I am ok with first considering merging when I have addressed the 3 related issues as well. Nonetheless it would be great to have some feedback. Just so I know whether I have understood the scope of this particular item (KAFKA-14748) and what I should keep in mind for the other incoming merge requests. was (Author: aki): I have created a PR for this particular issue. [https://github.com/apache/kafka/pull/14107] I am ok with first considering merging when I have addressed the 3 related issues as well. Nonetheless it would be great to have some feedback. Just so I know whether I have understood the scope of this particular item (KAFKA-14748) and what I should keep in mind for the other incoming merge requests. > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747670#comment-17747670 ] Florin Akermann edited comment on KAFKA-14748 at 7/26/23 9:54 PM: -- I have created a PR for this particular issue. [https://github.com/apache/kafka/pull/14107] I am ok with first considering merging when I have addressed the 3 related issues as well. Nonetheless it would be great to have some feedback. Just so I know whether I have understood the scope of this particular item (KAFKA-14748) and what I should keep in mind for the other incoming merge requests. was (Author: aki): I have created a PR for this particular issue. [https://github.com/apache/kafka/pull/14107] I am ok with first considering merging when I have addressed the 4 related issues as well. Nonetheless it would be great to have some feedback. Just so I know whether I have understood the scope of this particular item (KAFKA-14748) and what I should keep in mind for the other incoming merge requests. > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747671#comment-17747671 ] Florin Akermann commented on KAFKA-12317: - Thanks [~mjsax], I have assigned myself to the rest of them (KAFKA-13197, KAFKA-14748, KAFKA-14049). KAFKA-12845 is in status resolved, so I assume this one is no longer relevant. > Relax non-null key requirement for left/outer KStream joins > --- > > Key: KAFKA-12317 > URL: https://issues.apache.org/jira/browse/KAFKA-12317 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Currently, for a stream-stream and stream-table/globalTable join > KafkaStreams drops all stream records with a `null`-key (`null`-join-key for > stream-globalTable), because for a `null`-(join)key the join is undefined: > i.e., we don't have an attribute to do the table lookup (we consider the > stream-record as malformed). Note that we define the semantics of > _left/outer_ join as: keep the stream record if no matching join record was > found. > We could relax the definition of _left_ stream-table/globalTable and > _left/outer_ stream-stream join though, and not drop `null`-(join)key stream > records, and call the ValueJoiner with a `null` "other-side" value instead: > if the stream record key (or join-key) is `null`, we could treat it as a > "failed lookup" instead of treating the stream record as corrupted. > If we make this change, users who want to keep the current behavior can add > a `filter()` before the join to drop `null`-(join)key records from the stream > explicitly. > Note that this change also requires changing the behavior if we insert a > repartition topic before the join: currently, we drop `null`-key records > before writing into the repartition topic (as we know they would be dropped > later anyway). We need to relax this behavior for a left stream-table and > left/outer stream-stream join. 
Users need to be aware (i.e., we might need to > put this into the docs and JavaDocs) that records with a `null`-key would be > partitioned randomly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747670#comment-17747670 ] Florin Akermann commented on KAFKA-14748: - I have created a PR for this particular issue. [https://github.com/apache/kafka/pull/14107] I am ok with first considering merging when I have addressed the 4 related issues as well. Nonetheless it would be great to have some feedback. Just so I know whether I have understood the scope of this particular item (KAFKA-14748) and what I should keep in mind for the other incoming merge requests. > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join
[ https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747259#comment-17747259 ] Florin Akermann commented on KAFKA-14049: - Hi [~pnee], I am working on https://issues.apache.org/jira/browse/KAFKA-12317 and https://issues.apache.org/jira/browse/KAFKA-14748. Do you mind if I take over this item as well? I assume it is ok as there hasn't been any activity since April. If I haven't drafted a KIP or PR within three weeks, I am ok with being removed again. > Relax Non Null Requirement for KStreamGlobalKTable Left Join > > > Key: KAFKA-14049 > URL: https://issues.apache.org/jira/browse/KAFKA-14049 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Saumya Gupta >Assignee: Philip Nee >Priority: Major > Labels: beginner, newbie > > Null Values in the Stream for a Left Join would indicate a Tombstone Message > that needs to be propagated if not actually joined with the GlobalKTable > message, hence these messages should not be ignored. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join
[ https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann reassigned KAFKA-14049: --- Assignee: Florin Akermann (was: Philip Nee) > Relax Non Null Requirement for KStreamGlobalKTable Left Join > > > Key: KAFKA-14049 > URL: https://issues.apache.org/jira/browse/KAFKA-14049 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Saumya Gupta >Assignee: Florin Akermann >Priority: Major > Labels: beginner, newbie > > Null Values in the Stream for a Left Join would indicate a Tombstone Message > that needs to be propagated if not actually joined with the GlobalKTable > message, hence these messages should not be ignored. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-13197) KStream-GlobalKTable join semantics don't match documentation
[ https://issues.apache.org/jira/browse/KAFKA-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann reassigned KAFKA-13197: --- Assignee: Florin Akermann > KStream-GlobalKTable join semantics don't match documentation > - > > Key: KAFKA-13197 > URL: https://issues.apache.org/jira/browse/KAFKA-13197 > Project: Kafka > Issue Type: Bug >Affects Versions: 2.7.0 >Reporter: Tommy Becker >Assignee: Florin Akermann >Priority: Major > > As part of KAFKA-10277, the behavior of KStream-GlobalKTable joins was > changed. It appears the change was intended to merely relax a requirement but > it actually broke backwards compatibility. Although it does allow {{null}} > keys and values in the KStream to be joined, it now excludes {{null}} results > of the {{KeyValueMapper}}. We have an application which can return {{null}} > from the {{KeyValueMapper}} for non-null keys in the KStream, and relies on > these nulls being passed to the {{ValueJoiner}}. Indeed the javadoc still > explicitly says this is done: > {quote}If a KStream input record key or value is null the record will not be > included in the join operation and thus no output record will be added to the > resulting KStream. > If keyValueMapper returns null implying no match exists, a null value will > be provided to ValueJoiner. > {quote} > Both these statements are incorrect. > I think the new behavior is worse than the previous/documented behavior. It > feels more reasonable to have a non-null stream record map to a null join key > (our use-case is event-enhancement where the incoming record doesn't have the > join field), than the reverse. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746244#comment-17746244 ] Florin Akermann edited comment on KAFKA-12317 at 7/24/23 7:38 AM: -- Hi, I will have a go at this. I hope that's ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. was (Author: aki): Hi, I will have a go a this if ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. > Relax non-null key requirement for left/outer KStream joins > --- > > Key: KAFKA-12317 > URL: https://issues.apache.org/jira/browse/KAFKA-12317 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Currently, for a stream-stream and stream-table/globalTable join > KafkaStreams drops all stream records with a `null`-key (`null`-join-key for > stream-globalTable), because for a `null`-(join)key the join is undefined: > i.e., we don't have an attribute to do the table lookup (we consider the > stream-record as malformed). Note that we define the semantics of > _left/outer_ join as: keep the stream record if no matching join record was > found. > We could relax the definition of _left_ stream-table/globalTable and > _left/outer_ stream-stream join though, and not drop `null`-(join)key stream > records, and call the ValueJoiner with a `null` "other-side" value instead: > if the stream record key (or join-key) is `null`, we could treat it as > "failed lookup" instead of treating the stream record as corrupted. > If we make this change, users that want to keep the current behavior can add > a `filter()` before the join to drop `null`-(join)key records from the stream > explicitly. 
> Note that this change also requires changing the behavior if we insert a > repartition topic before the join: currently, we drop `null`-key records > before writing into the repartition topic (as we know they would be dropped > later anyway). We need to relax this behavior for a left stream-table and > left/outer stream-stream join. Users need to be aware (i.e., we might need to > put this into the docs and JavaDocs) that records with a `null`-key would be > partitioned randomly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746242#comment-17746242 ] Florin Akermann edited comment on KAFKA-14748 at 7/24/23 7:37 AM: -- Hi, I will have a go at this. I hope that's ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. was (Author: aki): Hi, I will have a go at this if ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746242#comment-17746242 ] Florin Akermann edited comment on KAFKA-14748 at 7/24/23 7:37 AM: -- Hi, I will have a go at this if ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. was (Author: aki): Hi, I will have a go a this if ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746242#comment-17746242 ] Florin Akermann commented on KAFKA-14748: - Hi, I will have a go a this if ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Priority: Major > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann reassigned KAFKA-12317: --- Assignee: Florin Akermann > Relax non-null key requirement for left/outer KStream joins > --- > > Key: KAFKA-12317 > URL: https://issues.apache.org/jira/browse/KAFKA-12317 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Currently, for a stream-stream and stream-table/globalTable join > KafkaStreams drops all stream records with a `null`-key (`null`-join-key for > stream-globalTable), because for a `null`-(join)key the join is undefined: > i.e., we don't have an attribute to do the table lookup (we consider the > stream-record as malformed). Note that we define the semantics of > _left/outer_ join as: keep the stream record if no matching join record was > found. > We could relax the definition of _left_ stream-table/globalTable and > _left/outer_ stream-stream join though, and not drop `null`-(join)key stream > records, and call the ValueJoiner with a `null` "other-side" value instead: > if the stream record key (or join-key) is `null`, we could treat it as > "failed lookup" instead of treating the stream record as corrupted. > If we make this change, users that want to keep the current behavior can add > a `filter()` before the join to drop `null`-(join)key records from the stream > explicitly. > Note that this change also requires changing the behavior if we insert a > repartition topic before the join: currently, we drop `null`-key records > before writing into the repartition topic (as we know they would be dropped > later anyway). We need to relax this behavior for a left stream-table and > left/outer stream-stream join. Users need to be aware (i.e., we might need to > put this into the docs and JavaDocs) that records with a `null`-key would be > partitioned randomly. 
[jira] [Assigned] (KAFKA-14748) Relax non-null FK left-join requirement
[ https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann reassigned KAFKA-14748: --- Assignee: Florin Akermann > Relax non-null FK left-join requirement > --- > > Key: KAFKA-14748 > URL: https://issues.apache.org/jira/browse/KAFKA-14748 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > > Kafka Streams enforces a strict non-null-key policy in the DSL across all > key-dependent operations (like aggregations and joins). > This also applies to FK-joins, in particular to the ForeignKeyExtractor. If > it returns `null`, it's treated as invalid. For left-joins, it might make > sense to still accept a `null`, and add the left-hand record with an empty > right-hand-side to the result. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins
[ https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746244#comment-17746244 ] Florin Akermann commented on KAFKA-12317: - Hi, I will have a go a this if ok. If I haven't drafted any KIP or PR within three weeks then I am ok to be removed again. > Relax non-null key requirement for left/outer KStream joins > --- > > Key: KAFKA-12317 > URL: https://issues.apache.org/jira/browse/KAFKA-12317 > Project: Kafka > Issue Type: Improvement > Components: streams >Reporter: Matthias J. Sax >Priority: Major > > Currently, for a stream-stream and stream-table/globalTable join > KafkaStreams drops all stream records with a `null`-key (`null`-join-key for > stream-globalTable), because for a `null`-(join)key the join is undefined: > i.e., we don't have an attribute to do the table lookup (we consider the > stream-record as malformed). Note that we define the semantics of > _left/outer_ join as: keep the stream record if no matching join record was > found. > We could relax the definition of _left_ stream-table/globalTable and > _left/outer_ stream-stream join though, and not drop `null`-(join)key stream > records, and call the ValueJoiner with a `null` "other-side" value instead: > if the stream record key (or join-key) is `null`, we could treat it as > "failed lookup" instead of treating the stream record as corrupted. > If we make this change, users that want to keep the current behavior can add > a `filter()` before the join to drop `null`-(join)key records from the stream > explicitly. > Note that this change also requires changing the behavior if we insert a > repartition topic before the join: currently, we drop `null`-key records > before writing into the repartition topic (as we know they would be dropped > later anyway). We need to relax this behavior for a left stream-table and > left/outer stream-stream join. 
Users need to be aware (i.e., we might need to > put this into the docs and JavaDocs) that records with a `null`-key would be > partitioned randomly. -- This message was sent by Atlassian Jira (v8.20.10#820010)
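The semantics proposed in the KAFKA-12317 description can be sketched in plain Java, without the Kafka Streams library. This is only an illustrative model under stated assumptions: `Rec`, `relaxedLeftJoin` and `dropNullKeys` are hypothetical names invented for this sketch, the table is a plain `Map`, and string concatenation stands in for the ValueJoiner. It shows a `null` key being treated as a failed lookup, and an explicit filter step restoring the drop-on-null behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class NullKeyLeftJoinSketch {

    /** Hypothetical stand-in for a stream record; not a Kafka Streams type. */
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Models the relaxed left join: a null key is treated as a failed lookup,
     * so the joiner receives null for the other side instead of the record
     * being dropped.
     */
    static List<String> relaxedLeftJoin(List<Rec> stream, Map<String, String> table) {
        List<String> out = new ArrayList<>();
        for (Rec r : stream) {
            String other = (r.key == null) ? null : table.get(r.key);
            out.add(r.value + "+" + other); // stand-in for the ValueJoiner
        }
        return out;
    }

    /** Models an explicit filter before the join that drops null-key records. */
    static List<Rec> dropNullKeys(List<Rec> stream) {
        return stream.stream().filter(r -> r.key != null).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Rec> stream = Arrays.asList(new Rec("a", "1"), new Rec(null, "2"));
        Map<String, String> table = new HashMap<>();
        table.put("a", "A");

        // Relaxed semantics: the null-key record is kept and joined with null.
        System.out.println(relaxedLeftJoin(stream, table));
        // Previous semantics, restored by filtering before the join.
        System.out.println(relaxedLeftJoin(dropNullKeys(stream), table));
    }
}
```

In the Kafka Streams DSL, the filtering step would correspond to something like `stream.filter((k, v) -> k != null)` placed before the left join.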
[jira] [Commented] (KAFKA-13602) Allow to broadcast a result record
[ https://issues.apache.org/jira/browse/KAFKA-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558760#comment-17558760 ] Florin Akermann commented on KAFKA-13602: - Ok cool, thanks for the quick reply. > Allow to broadcast a result record > -- > > Key: KAFKA-13602 > URL: https://issues.apache.org/jira/browse/KAFKA-13602 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Matthias J. Sax >Assignee: Sagar Rao >Priority: Major > Labels: needs-kip, newbie++ > > From time to time, users ask how they can send a record to more than one > partition in a sink topic. Currently, this is only possible by replicating the > message N times before the sink and using a custom partitioner to write the N > messages into the N different partitions. > It might be worthwhile to make this easier and add a new feature for it. There are > multiple options: > * extend `to()` / `addSink()` with a "broadcast" option/config > * add `toAllPartitions()` / `addBroadcastSink()` methods > * allow StreamPartitioner to return `-1` for "all partitions" > * extend `StreamPartitioner` to allow returning more than one partition (i.e., > a list/collection of integers instead of a single int) > The first three options imply that only a "full broadcast" is supported, so > they are less flexible. On the other hand, they are easier to use (especially the > first two options, as they do not require implementing a custom > partitioner). > The last option would be the most flexible and would also allow for a "partial > broadcast" (aka multi-cast) pattern. It might also be possible to combine two > options, or maybe even pursue a totally different idea. -- This message was sent by Atlassian Jira (v8.20.7#820007)
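The workaround mentioned in the KAFKA-13602 description (replicate the message N times, then let a custom partitioner send each copy to a different partition) can be modelled roughly as follows. This is a plain-Java sketch and does not use the Kafka Streams API: `Tagged`, `replicate` and `partitionFor` are hypothetical names, and `partitionFor` merely stands in for what a custom partitioner would compute.

```java
import java.util.ArrayList;
import java.util.List;

public class BroadcastWorkaroundSketch {

    /** Hypothetical copy of a record, tagged with the partition it should go to. */
    static final class Tagged {
        final int targetPartition;
        final String value;

        Tagged(int targetPartition, String value) {
            this.targetPartition = targetPartition;
            this.value = value;
        }
    }

    /** Step 1: replicate the value once per partition before the sink. */
    static List<Tagged> replicate(String value, int numPartitions) {
        List<Tagged> copies = new ArrayList<>(numPartitions);
        for (int p = 0; p < numPartitions; p++) {
            copies.add(new Tagged(p, value));
        }
        return copies;
    }

    /** Step 2: stand-in for a custom partitioner that honors the tag. */
    static int partitionFor(Tagged copy, int numPartitions) {
        return Math.floorMod(copy.targetPartition, numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 3;
        for (Tagged copy : replicate("payload", numPartitions)) {
            System.out.println(copy.value + " -> partition " + partitionFor(copy, numPartitions));
        }
    }
}
```

The last option in the list above would remove the replication step: a partitioner returning a collection of partitions could express both a full broadcast and a partial (multi-cast) one.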
[jira] [Commented] (KAFKA-13602) Allow to broadcast a result record
[ https://issues.apache.org/jira/browse/KAFKA-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558751#comment-17558751 ] Florin Akermann commented on KAFKA-13602: - Hi [~sagarrao], [~mjsax] May I have a go at this? > Allow to broadcast a result record > -- > > Key: KAFKA-13602 > URL: https://issues.apache.org/jira/browse/KAFKA-13602 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Matthias J. Sax >Assignee: Sagar Rao >Priority: Major > Labels: needs-kip, newbie++ > > From time to time, users ask how they can send a record to more than one > partition in a sink topic. Currently, this is only possible by replicating the > message N times before the sink and using a custom partitioner to write the N > messages into the N different partitions. > It might be worthwhile to make this easier and add a new feature for it. There are > multiple options: > * extend `to()` / `addSink()` with a "broadcast" option/config > * add `toAllPartitions()` / `addBroadcastSink()` methods > * allow StreamPartitioner to return `-1` for "all partitions" > * extend `StreamPartitioner` to allow returning more than one partition (i.e., > a list/collection of integers instead of a single int) > The first three options imply that only a "full broadcast" is supported, so > they are less flexible. On the other hand, they are easier to use (especially the > first two options, as they do not require implementing a custom > partitioner). > The last option would be the most flexible and would also allow for a "partial > broadcast" (aka multi-cast) pattern. It might also be possible to combine two > options, or maybe even pursue a totally different idea. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Assigned] (KAFKA-13602) Allow to broadcast a result record
[ https://issues.apache.org/jira/browse/KAFKA-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann reassigned KAFKA-13602: --- Assignee: (was: Florin Akermann) > Allow to broadcast a result record > -- > > Key: KAFKA-13602 > URL: https://issues.apache.org/jira/browse/KAFKA-13602 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Matthias J. Sax >Priority: Major > Labels: needs-kip, newbie++ > > From time to time, users ask how they can send a record to more than one > partition in a sink topic. Currently, this is only possible by replicating the > message N times before the sink and using a custom partitioner to write the N > messages into the N different partitions. > It might be worthwhile to make this easier and add a new feature for it. There are > multiple options: > * extend `to()` / `addSink()` with a "broadcast" option/config > * add `toAllPartitions()` / `addBroadcastSink()` methods > * allow StreamPartitioner to return `-1` for "all partitions" > * extend `StreamPartitioner` to allow returning more than one partition (i.e., > a list/collection of integers instead of a single int) > The first three options imply that only a "full broadcast" is supported, so > they are less flexible. On the other hand, they are easier to use (especially the > first two options, as they do not require implementing a custom > partitioner). > The last option would be the most flexible and would also allow for a "partial > broadcast" (aka multi-cast) pattern. It might also be possible to combine two > options, or maybe even pursue a totally different idea. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (KAFKA-13602) Allow to broadcast a result record
[ https://issues.apache.org/jira/browse/KAFKA-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Akermann reassigned KAFKA-13602: --- Assignee: Florin Akermann > Allow to broadcast a result record > -- > > Key: KAFKA-13602 > URL: https://issues.apache.org/jira/browse/KAFKA-13602 > Project: Kafka > Issue Type: New Feature > Components: streams >Reporter: Matthias J. Sax >Assignee: Florin Akermann >Priority: Major > Labels: needs-kip, newbie++ > > From time to time, users ask how they can send a record to more than one > partition in a sink topic. Currently, this is only possible by replicating the > message N times before the sink and using a custom partitioner to write the N > messages into the N different partitions. > It might be worthwhile to make this easier and add a new feature for it. There are > multiple options: > * extend `to()` / `addSink()` with a "broadcast" option/config > * add `toAllPartitions()` / `addBroadcastSink()` methods > * allow StreamPartitioner to return `-1` for "all partitions" > * extend `StreamPartitioner` to allow returning more than one partition (i.e., > a list/collection of integers instead of a single int) > The first three options imply that only a "full broadcast" is supported, so > they are less flexible. On the other hand, they are easier to use (especially the > first two options, as they do not require implementing a custom > partitioner). > The last option would be the most flexible and would also allow for a "partial > broadcast" (aka multi-cast) pattern. It might also be possible to combine two > options, or maybe even pursue a totally different idea. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13351) Add possibility to write kafka headers in Kafka Console Producer
[ https://issues.apache.org/jira/browse/KAFKA-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442917#comment-17442917 ] Florin Akermann commented on KAFKA-13351: - [~habdank] I created a KIP and started a discussion thread: https://lists.apache.org/thread/pw0oqbk855vj0d63gfg3q3h3p2p1zopw > Add possibility to write kafka headers in Kafka Console Producer > > > Key: KAFKA-13351 > URL: https://issues.apache.org/jira/browse/KAFKA-13351 > Project: Kafka > Issue Type: Wish > Components: tools >Affects Versions: 2.8.1 >Reporter: Seweryn Habdank-Wojewodzki >Assignee: Florin Akermann >Priority: Major > > Dears, > Currently there is an asymmetry between Kafka Console Consumer and Producer. > Kafka Consumer can display headers (KAFKA-6733), but Kafka Producer cannot > produce them. > It would be good to unify this and add the possibility for Kafka Console Producer > to produce them. > A similar ticket is here: KAFKA-6574, but it is very old and does not > represent the current state of the software. > Please consider this. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (KAFKA-13351) Add possibility to write kafka headers in Kafka Console Producer
[ https://issues.apache.org/jira/browse/KAFKA-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436525#comment-17436525 ] Florin Akermann commented on KAFKA-13351: - Hi, I made a PR for it. However, I cannot set the jira-status to "Patch Available". I guess it is because I am not assigned to the Jira yet. [~mjsax] I have seen you comment on many other Jiras. I guess you are a committer. Could you assign me to the Jira? Which committers should I tag on this PR? > Add possibility to write kafka headers in Kafka Console Producer > > > Key: KAFKA-13351 > URL: https://issues.apache.org/jira/browse/KAFKA-13351 > Project: Kafka > Issue Type: Wish >Affects Versions: 2.8.1 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > > Dears, > Currently there is an asymmetry between Kafka Console Consumer and Producer. > Kafka Consumer can display headers (KAFKA-6733), but Kafka Producer cannot > produce them. > It would be good to unify this and add the possibility for Kafka Console Producer > to produce them. > A similar ticket is here: KAFKA-6574, but it is very old and does not > represent the current state of the software. > Please consider this. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (KAFKA-13351) Add possibility to write kafka headers in Kafka Console Producer
[ https://issues.apache.org/jira/browse/KAFKA-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436205#comment-17436205 ] Florin Akermann commented on KAFKA-13351: - Hi, I would like to pick this up. > Add possibility to write kafka headers in Kafka Console Producer > > > Key: KAFKA-13351 > URL: https://issues.apache.org/jira/browse/KAFKA-13351 > Project: Kafka > Issue Type: Wish >Affects Versions: 2.8.1 >Reporter: Seweryn Habdank-Wojewodzki >Priority: Major > > Dears, > Currently there is an asymmetry between Kafka Console Consumer and Producer. > Kafka Consumer can display headers (KAFKA-6733), but Kafka Producer cannot > produce them. > It would be good to unify this and add the possibility for Kafka Console Producer > to produce them. > A similar ticket is here: KAFKA-6574, but it is very old and does not > represent the current state of the software. > Please consider this. -- This message was sent by Atlassian Jira (v8.3.4#803005)