[jira] [Comment Edited] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2024-04-09 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835506#comment-17835506
 ] 

Florin Akermann edited comment on KAFKA-12317 at 4/9/24 6:50 PM:
-

[~mjsax] 
[https://github.com/apache/kafka/pull/15689]
I pushed fixes for the left/outer stream-stream join documentation.
The documentation for the stream/table left/foreign-key joins has already been 
updated in the original PR.

Note that the table docs still state:
"Input records for the stream with a null value are ignored and do not trigger 
the join."
As far as I can see, this is the correct description of the behavior for the 
stream-table operators in question.

I plan on creating a separate PR for the join semantics tables, as those are not 
incorrect per se.


was (Author: aki):
[~mjsax] 
https://github.com/apache/kafka/pull/15689
I pushed fixes for the left/outer stream-stream join documentation.
The documentation for the stream/table left/foreign-key joins has already been 
updated in the original PR.

Note that the table docs still state:
"Input records for the stream with a null value are ignored and do not trigger 
the join."
As far as I can see, this is the correct description of the behavior for the 
stream-table operators in question.

I plan on creating a separate PR for the join semantics table.

> Relax non-null key requirement for left/outer KStream joins
> ---
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>  Labels: kip
> Fix For: 3.7.0
>
>
> Currently, for a stream-stream and stream-table/globalTable join, 
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key 
> for stream-globalTable), because for a `null`-(join)key the join is 
> undefined: i.e., we don't have an attribute to do the table lookup (we 
> consider the stream record as malformed). Note that we define the semantics 
> of _left/outer_ join as: keep the stream record if no matching join record 
> was found.
> We could relax the definition of _left_ stream-table/globalTable and 
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream 
> records, and call the ValueJoiner with a `null` "other-side" value instead: 
> if the stream record key (or join-key) is `null`, we could treat it as a 
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users who want to keep the current behavior can add 
> a `filter()` before the join to drop `null`-(join)key records from the stream 
> explicitly.
> Note that this change also requires changing the behavior if we insert a 
> repartition topic before the join: currently, we drop `null`-key records 
> before writing into the repartition topic (as we know they would be dropped 
> later anyway). We need to relax this behavior for a left stream-table and 
> left/outer stream-stream join. Users need to be aware (i.e., we might need to 
> put this into the docs and JavaDocs) that records with a `null`-key would be 
> partitioned randomly.
> KIP-962: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams]
>  
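The difference between the two behaviors described in the ticket can be illustrated with a small model. This is only a sketch in plain Java: it does not use the Kafka Streams API, and the names `NullKeyLeftJoinSketch`, `Rec`, `leftJoin`, and `dropNullKeyRecords` are hypothetical. The table is modelled as a `Map`, and the `dropNullKeys` flag switches between the old behavior (drop `null`-key records) and the relaxed one (treat a `null` key as a failed lookup).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Plain-Java model of the semantics discussed in the ticket.
// It deliberately does NOT use the Kafka Streams API; all names are hypothetical.
final class NullKeyLeftJoinSketch {

    // Hypothetical stand-in for a stream record whose key may be null.
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Left join of a stream against a table (modelled as a Map).
    // dropNullKeys == true  : old behavior, null-key records are dropped.
    // dropNullKeys == false : relaxed behavior, a null key counts as a failed
    //                         lookup and the joiner sees null as the other side.
    static List<String> leftJoin(List<Rec> stream,
                                 Map<String, String> table,
                                 BiFunction<String, String, String> joiner,
                                 boolean dropNullKeys) {
        List<String> out = new ArrayList<>();
        for (Rec r : stream) {
            if (r.key == null) {
                if (!dropNullKeys) {
                    out.add(joiner.apply(r.value, null));
                }
                continue;
            }
            // table.get returns null when no matching table entry exists.
            out.add(joiner.apply(r.value, table.get(r.key)));
        }
        return out;
    }

    // Models an explicit filter placed before the join to restore the old behavior.
    static List<Rec> dropNullKeyRecords(List<Rec> stream) {
        List<Rec> kept = new ArrayList<>();
        for (Rec r : stream) {
            if (r.key != null) {
                kept.add(r);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();
        table.put("a", "A");
        List<Rec> stream = Arrays.asList(
                new Rec("a", "1"), new Rec(null, "2"), new Rec("b", "3"));
        BiFunction<String, String, String> joiner = (left, right) -> left + "/" + right;

        System.out.println(leftJoin(stream, table, joiner, true));
        System.out.println(leftJoin(stream, table, joiner, false));
        System.out.println(leftJoin(dropNullKeyRecords(stream), table, joiner, false));
    }
}
```

In this model, filtering out `null`-key records before a relaxed join yields the same output as the old behavior, which mirrors the `filter()` workaround mentioned in the description.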



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2024-04-09 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17835506#comment-17835506
 ] 

Florin Akermann commented on KAFKA-12317:
-

[~mjsax] 
https://github.com/apache/kafka/pull/15689
I pushed fixes for the left/outer stream-stream join documentation.
The documentation for the stream/table left/foreign-key joins has already been 
updated in the original PR.

Note that the table docs still state:
"Input records for the stream with a null value are ignored and do not trigger 
the join."
As far as I can see, this is the correct description of the behavior for the 
stream-table operators in question.

I plan on creating a separate PR for the join semantics table.



[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2024-02-17 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818138#comment-17818138
 ] 

Florin Akermann commented on KAFKA-12317:
-

Thanks for the flag!

> Would you be interested to do a follow up PR to update the docs?

Yes.



[jira] [Comment Edited] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join

2024-02-17 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818136#comment-17818136
 ] 

Florin Akermann edited comment on KAFKA-14049 at 2/17/24 9:14 AM:
--

[~mjsax] 

Yes.
The behavior of the operator changed as part of KIP-962. The following behavior 
has been asserted.
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]


was (Author: aki):
[~mjsax] 

Yes.
The behavior of the operator changed as part of KIP-962. The following behavior 
has been asserted 
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]

> Relax Non Null Requirement for KStreamGlobalKTable Left Join
> 
>
> Key: KAFKA-14049
> URL: https://issues.apache.org/jira/browse/KAFKA-14049
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Saumya Gupta
>Assignee: Florin Akermann
>Priority: Major
>
> Null Values in the Stream for a Left Join would indicate a Tombstone Message 
> that needs to be propagated if not actually joined with the GlobalKTable 
> message; hence these messages should not be ignored.





[jira] [Comment Edited] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join

2024-02-17 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818136#comment-17818136
 ] 

Florin Akermann edited comment on KAFKA-14049 at 2/17/24 9:14 AM:
--

[~mjsax] 

Yes.
The behavior of the operator changed as part of KIP-962. The following behavior 
has been asserted 
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]


was (Author: aki):
[~mjsax] 

Yes.
The following behavior has been asserted as part of KIP-962
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]



[jira] [Comment Edited] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join

2024-02-17 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818136#comment-17818136
 ] 

Florin Akermann edited comment on KAFKA-14049 at 2/17/24 9:13 AM:
--

[~mjsax] 

Yes.
The following behavior has been asserted as part of KIP-962
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]


was (Author: aki):
[~mjsax] 

Yes.
The following behavior has been asserted as part of KIP-962
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]

So I'd say we can close this item.



[jira] [Resolved] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join

2024-02-17 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann resolved KAFKA-14049.
-
Resolution: Fixed



[jira] [Commented] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join

2024-02-17 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17818136#comment-17818136
 ] 

Florin Akermann commented on KAFKA-14049:
-

[~mjsax] 

Yes.
The following behavior has been asserted as part of KIP-962
[https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/RelaxedNullKeyRequirementJoinTest.java#L106]

So I'd say we can close this item.



[jira] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.

2024-02-10 Thread Florin Akermann (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-16123 ]


Florin Akermann deleted comment on KAFKA-16123:
-

was (Author: aki):
I now committed a 'generalized fix' proposal

> KStreamKStreamJoinProcessor does not drop late records.
> ---
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> Issue illustration: [https://github.com/apache/kafka/pull/15314/files]
> Suggested fix: https://github.com/apache/kafka/pull/15189





[jira] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.

2024-02-10 Thread Florin Akermann (Jira)


[ https://issues.apache.org/jira/browse/KAFKA-16123 ]


Florin Akermann deleted comment on KAFKA-16123:
-

was (Author: aki):
This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962 for keyed records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.



[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.

2024-02-10 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann updated KAFKA-16123:

Description: 
Issue illustration: [https://github.com/apache/kafka/pull/15314/files]
Suggested fix: https://github.com/apache/kafka/pull/15189

  was:Issue illustration: [https://github.com/apache/kafka/pull/15314/files]




[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.

2024-02-10 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann updated KAFKA-16123:

Description: Issue illustration: 
[https://github.com/apache/kafka/pull/15314/files]  (was: As part of KIP-962 
the non-null key requirements have been relaxed for left and outer joins.
However, the implementation forwards null-key records for left/outer joins 
unconditionally of the join window.

Old title:
KStreamKStreamJoinProcessor forwards null-key records for left/outer joins 
unconditionally of the join window)

> KStreamKStreamJoinProcessor does not drop late records.
> ---
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> Issue illustration: [https://github.com/apache/kafka/pull/15314/files]





[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor does not drop late records.

2024-02-10 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann updated KAFKA-16123:

Summary: KStreamKStreamJoinProcessor does not drop late records.  (was: 
KStreamKStreamJoinProcessor forwards null-key records for left/outer joins 
unconditionally of the join window.)

> KStreamKStreamJoinProcessor does not drop late records.
> ---
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KIP-962 the non-null key requirements have been relaxed for left 
> and outer joins.
> However, the implementation forwards null-key records for left/outer joins 
> unconditionally of the join window.
> Old title:
> KStreamKStreamJoinProcessor forwards null-key records for left/outer joins 
> unconditionally of the join window
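What "dropping late records" means can be sketched with a small plain-Java model. This is not the `KStreamKStreamJoinProcessor` code: the names `LateRecordSketch`, `Rec`, and `forward` are hypothetical, and the lateness rule used here (a record is late when its timestamp lies more than `retentionMs` behind the highest timestamp observed so far) is an assumption made for illustration; the actual rule in Kafka Streams may differ. The `dropLate` flag switches between forwarding every record, as reported in this ticket, and dropping late ones.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model only. The names and the lateness rule are assumptions
// made for illustration; this is not Kafka Streams code.
final class LateRecordSketch {

    // Hypothetical stand-in for a stream record; the key may be null.
    static final class Rec {
        final String key;
        final String value;
        final long timestamp;

        Rec(String key, String value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Returns the values of the records that would be forwarded.
    // dropLate == false : every record is forwarded regardless of its timestamp.
    // dropLate == true  : records that are late (per the assumed rule) are dropped.
    static List<String> forward(List<Rec> input, long retentionMs, boolean dropLate) {
        List<String> out = new ArrayList<>();
        long streamTime = Long.MIN_VALUE;
        for (Rec r : input) {
            // Stream time is modelled as the highest timestamp seen so far.
            streamTime = Math.max(streamTime, r.timestamp);
            boolean late = r.timestamp < streamTime - retentionMs;
            if (dropLate && late) {
                continue;
            }
            out.add(r.value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> input = new ArrayList<>();
        input.add(new Rec(null, "v1", 100L));
        input.add(new Rec("k", "v2", 1000L));
        input.add(new Rec(null, "v3", 50L));   // far behind stream time
        input.add(new Rec("k", "v4", 950L));   // behind, but within retention

        System.out.println(forward(input, 100L, false));
        System.out.println(forward(input, 100L, true));
    }
}
```

In this model the record with timestamp 50 arrives after stream time has advanced to 1000, so it is dropped only when `dropLate` is set; the lateness check does not depend on whether the key is `null`.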





[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.

2024-02-10 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann updated KAFKA-16123:

Description: 
As part of KIP-962 the non-null key requirements have been relaxed for left and 
outer joins.
However, the implementation forwards null-key records for left/outer joins 
unconditionally of the join window.

Old title:
KStreamKStreamJoinProcessor forwards null-key records for left/outer joins 
unconditionally of the join window

  was:
As part of KIP-962 the non-null key requirements have been relaxed for left and 
outer joins.
However, the implementation forwards null-key records for left/outer joins 
unconditionally of the join window.


> KStreamKStreamJoinProcessor forwards null-key records for left/outer joins 
> unconditionally of the join window.
> --
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KIP-962 the non-null key requirements have been relaxed for left 
> and outer joins.
> However, the implementation forwards null-key records for left/outer joins 
> unconditionally of the join window.
> Old title:
> KStreamKStreamJoinProcessor forwards null-key records for left/outer joins 
> unconditionally of the join window





[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.

2024-02-10 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816355#comment-17816355
 ] 

Florin Akermann edited comment on KAFKA-16123 at 2/10/24 10:18 PM:
---

I now committed a 'generalized fix' proposal


was (Author: aki):
I now committed a 'generalized fix' 

> KStreamKStreamJoinProcessor forwards null-key records for left/outer joins 
> unconditionally of the join window.
> --
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KIP-962 the non-null key requirements have been relaxed for left 
> and outer joins.
> However, the implementation forwards null-key records for left/outer joins 
> unconditionally of the join window.





[jira] [Commented] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.

2024-02-10 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816355#comment-17816355
 ] 

Florin Akermann commented on KAFKA-16123:
-

I now committed a 'generalized fix' 



[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.

2024-02-10 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001
 ] 

Florin Akermann edited comment on KAFKA-16123 at 2/10/24 10:46 AM:
---

This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962 for keyed records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.


was (Author: aki):
This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962 for non keyed 
records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.



[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.

2024-02-10 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001
 ] 

Florin Akermann edited comment on KAFKA-16123 at 2/10/24 10:45 AM:
---

This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962 for non null-key 
records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.


was (Author: aki):
This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962 for non null-key 
records.

See [https://github.com/apache/kafka/pull/15314/files]



[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.

2024-02-10 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001
 ] 

Florin Akermann edited comment on KAFKA-16123 at 2/10/24 10:46 AM:
---

This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962 for non keyed 
records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.


was (Author: aki):
This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962 for non null-key 
records.

See [https://github.com/apache/kafka/pull/15314/files]

I'll generalize the PR to cover the key records as well.



[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.

2024-02-03 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001
 ] 

Florin Akermann edited comment on KAFKA-16123 at 2/3/24 11:21 PM:
--

This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962 for non null-key 
records.

See [https://github.com/apache/kafka/pull/15314/files]


was (Author: aki):
This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]



[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null-key records for left/outer joins unconditionally of the join window.

2024-02-03 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann updated KAFKA-16123:

Summary: KStreamKStreamJoinProcessor forwards null-key records for 
left/outer joins unconditionally of the join window.  (was: 
KStreamKStreamJoinProcessor forwards null records for left/outer joins 
unconditionally of the join window.)



[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.

2024-02-03 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001
 ] 

Florin Akermann edited comment on KAFKA-16123 at 2/3/24 11:19 PM:
--

This might actually be a general issue and not just for null-key records?
In other words, the problem already existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]


was (Author: aki):
This might actually be a general issue and not just for null-key records?
In other words, the problem existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]

> KStreamKStreamJoinProcessor forwards null records for left/outer joins 
> unconditionally of the join window.
> --
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KIP-962 the non-null key requirements have been relaxed for left 
> and outer joins.
> However, the implementation forwards null-key records for left/outer joins 
> unconditionally of the join window.





[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.

2024-02-03 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001
 ] 

Florin Akermann edited comment on KAFKA-16123 at 2/3/24 11:19 PM:
--

This might actually be a general issue and not just for null-key records?
In other words, the problem existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]


was (Author: aki):
This might actually be a general issue and not just for null-key records?
In other words the problem existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]

> KStreamKStreamJoinProcessor forwards null records for left/outer joins 
> unconditionally of the join window.
> --
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KIP-962 the non-null key requirements have been relaxed for left 
> and outer joins.
> However, the implementation forwards null-key records for left/outer joins 
> unconditionally of the join window.





[jira] [Comment Edited] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.

2024-02-03 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001
 ] 

Florin Akermann edited comment on KAFKA-16123 at 2/3/24 11:18 PM:
--

This might actually be a general issue and not just for null-key records?
In other words the problem existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]


was (Author: aki):
This might actually be a general issue and not just for null-key records?
In other words the problem existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]

> KStreamKStreamJoinProcessor forwards null records for left/outer joins 
> unconditionally of the join window.
> --
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KIP-962 the non-null key requirements have been relaxed for left 
> and outer joins.
> However, the implementation forwards null records for left/outer joins 
> unconditionally of the join window.





[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.

2024-02-03 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann updated KAFKA-16123:

Description: 
As part of KIP-962 the non-null key requirements have been relaxed for left and 
outer joins.
However, the implementation forwards null-key records for left/outer joins 
unconditionally of the join window.

  was:
As part of KIP-962 the non-null key requirements have been relaxed for left and 
outer joins.
However, the implementation forwards null records for left/outer joins 
unconditionally of the join window.


> KStreamKStreamJoinProcessor forwards null records for left/outer joins 
> unconditionally of the join window.
> --
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KIP-962 the non-null key requirements have been relaxed for left 
> and outer joins.
> However, the implementation forwards null-key records for left/outer joins 
> unconditionally of the join window.





[jira] [Commented] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.

2024-02-03 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17814001#comment-17814001
 ] 

Florin Akermann commented on KAFKA-16123:
-

This might actually be a general issue and not just for null-key records?
In other words the problem existed prior to KIP-962.

See [https://github.com/apache/kafka/pull/15314/files]

> KStreamKStreamJoinProcessor forwards null records for left/outer joins 
> unconditionally of the join window.
> --
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KIP-962 the non-null key requirements have been relaxed for left 
> and outer joins.
> However, the implementation forwards null records for left/outer joins 
> unconditionally of the join window.





[jira] [Updated] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.

2024-01-13 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann updated KAFKA-16123:

Description: 
As part of KIP-962 the non-null key requirements have been relaxed for left and 
outer joins.
However, the implementation forwards null records for left/outer joins 
unconditionally of the join window.

  was:
As part of KIP-962 the non-null key requirements have been relaxed for left and 
outer joins.
However, the implementation forwards null records for left/outer joins 
unconditionally of
the join window.


> KStreamKStreamJoinProcessor forwards null records for left/outer joins 
> unconditionally of the join window.
> --
>
> Key: KAFKA-16123
> URL: https://issues.apache.org/jira/browse/KAFKA-16123
> Project: Kafka
>  Issue Type: Bug
>Reporter: Florin Akermann
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KIP-962 the non-null key requirements have been relaxed for left 
> and outer joins.
> However, the implementation forwards null records for left/outer joins 
> unconditionally of the join window.





[jira] [Created] (KAFKA-16123) KStreamKStreamJoinProcessor forwards null records for left/outer joins unconditionally of the join window.

2024-01-13 Thread Florin Akermann (Jira)
Florin Akermann created KAFKA-16123:
---

 Summary: KStreamKStreamJoinProcessor forwards null records for 
left/outer joins unconditionally of the join window.
 Key: KAFKA-16123
 URL: https://issues.apache.org/jira/browse/KAFKA-16123
 Project: Kafka
  Issue Type: Bug
Reporter: Florin Akermann
Assignee: Florin Akermann


As part of KIP-962 the non-null key requirements have been relaxed for left and 
outer joins.
However, the implementation forwards null records for left/outer joins 
unconditionally of
the join window.





[jira] [Resolved] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2023-12-09 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann resolved KAFKA-12317.
-
Resolution: Fixed

> Relax non-null key requirement for left/outer KStream joins
> ---
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>  Labels: kip
>
> Currently, for a stream-stream and stream-table/globalTable join 
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key 
> for stream-globalTable), because for a `null`-(join)key the join is 
> undefined: i.e., we don't have an attribute to do the table lookup (we 
> consider the stream-record as malformed). Note that we define the semantics 
> of _left/outer_ join as: keep the stream record if no matching join record 
> was found.
> We could relax the definition of _left_ stream-table/globalTable and 
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream 
> records, and call the ValueJoiner with a `null` "other-side" value instead: 
> if the stream record key (or join-key) is `null`, we could treat it as 
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users that want to keep the current behavior can add 
> a `filter()` before the join to drop `null`-(join)key records from the stream 
> explicitly.
> Note that this change also requires changing the behavior if we insert a 
> repartition topic before the join: currently, we drop `null`-key records 
> before writing into the repartition topic (as we know they would be dropped 
> later anyway). We need to relax this behavior for a left stream-table and 
> left/outer stream-stream join. Users need to be aware (i.e., we might need to 
> put this into the docs and JavaDocs) that records with `null`-key would be 
> partitioned randomly.
> KIP-962: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams]
>  
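The semantics described in the quoted issue can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions: it does not use the Kafka Streams API, and the names `NullKeyLeftJoinSketch`, `Rec`, `leftJoin`, and `filterNonNullKeys` are hypothetical helpers invented for illustration. It models a left join as a table lookup, treats a `null` key as a "failed lookup" in the relaxed mode, and shows how an explicit filter ahead of the join reproduces the old drop behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.stream.Collectors;

// Plain-Java model of the described semantics; NOT the Kafka Streams implementation.
class NullKeyLeftJoinSketch {

    // Hypothetical stand-in for a stream record; the key may be null.
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Left join of stream records against a lookup table.
    // relaxed = false: null-key records are dropped (behavior before the change).
    // relaxed = true:  a null key is treated as a failed lookup, so the joiner
    //                  is called with null as the other-side value.
    static List<String> leftJoin(List<Rec> stream,
                                 Map<String, String> table,
                                 BiFunction<String, String, String> joiner,
                                 boolean relaxed) {
        List<String> out = new ArrayList<>();
        for (Rec r : stream) {
            if (r.key == null && !relaxed) {
                continue; // record considered malformed and dropped
            }
            String other = (r.key == null) ? null : table.get(r.key);
            out.add(joiner.apply(r.value, other));
        }
        return out;
    }

    // Models the explicit filter a user could place before the join
    // to keep the old behavior under the relaxed semantics.
    static List<Rec> filterNonNullKeys(List<Rec> stream) {
        return stream.stream().filter(r -> r.key != null).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Rec> stream = Arrays.asList(
            new Rec("a", "1"), new Rec(null, "2"), new Rec("c", "3"));
        Map<String, String> table = new HashMap<>();
        table.put("a", "A");
        BiFunction<String, String, String> joiner = (left, right) -> left + "/" + right;

        System.out.println(leftJoin(stream, table, joiner, false));
        System.out.println(leftJoin(stream, table, joiner, true));
        System.out.println(leftJoin(filterNonNullKeys(stream), table, joiner, true));
    }
}
```

In this model the first and third calls produce the same result, which is the point of the `filter()` suggestion: the relaxed join only adds output for `null`-key records, so removing them up front restores the previous output.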





[jira] [Resolved] (KAFKA-14748) Relax non-null FK left-join requirement

2023-12-09 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann resolved KAFKA-14748.
-
Resolution: Fixed

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>  Labels: kip
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.
> KIP-962: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams]
>  
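The foreign-key case in the quoted description can be modeled the same way. The following is a minimal plain-Java sketch, not Kafka Streams code: `FkLeftJoinSketch`, `leftJoin`, and the `acceptNullFk` flag are hypothetical names, and the "customer=" value encoding is made up for the example. It shows a left-hand record whose extractor returns `null` being either skipped (strict policy) or emitted with an empty right-hand side (relaxed left join).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Plain-Java model of a foreign-key left join; NOT the Kafka Streams implementation.
class FkLeftJoinSketch {

    // acceptNullFk = false: a null foreign key is treated as invalid and the
    //                       left-hand record produces no result.
    // acceptNullFk = true:  the left-hand record is kept and joined with a
    //                       null right-hand side.
    static Map<String, String> leftJoin(Map<String, String> left,
                                        Map<String, String> right,
                                        Function<String, String> foreignKeyExtractor,
                                        BiFunction<String, String, String> joiner,
                                        boolean acceptNullFk) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : left.entrySet()) {
            String fk = foreignKeyExtractor.apply(e.getValue());
            if (fk == null && !acceptNullFk) {
                continue; // strict policy: skip the record
            }
            String rightValue = (fk == null) ? null : right.get(fk);
            result.put(e.getKey(), joiner.apply(e.getValue(), rightValue));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> orders = new LinkedHashMap<>();
        orders.put("o1", "customer=c1");
        orders.put("o2", "no-customer"); // extractor returns null for this value
        Map<String, String> customers = new LinkedHashMap<>();
        customers.put("c1", "Alice");

        Function<String, String> extractor =
            v -> v.startsWith("customer=") ? v.substring("customer=".length()) : null;
        BiFunction<String, String, String> joiner = (l, r) -> l + " -> " + r;

        System.out.println(leftJoin(orders, customers, extractor, joiner, false));
        System.out.println(leftJoin(orders, customers, extractor, joiner, true));
    }
}
```

With the strict flag only `o1` appears in the result; with the relaxed flag `o2` is also present, paired with `null`.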





[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement

2023-08-13 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746242#comment-17746242
 ] 

Florin Akermann edited comment on KAFKA-14748 at 8/13/23 10:49 AM:
---

Hi, I will have a go at this. I hope that's ok.


was (Author: aki):
Hi, I will have a go at this. I hope that's ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>  Labels: kip
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.
> KIP-962: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams]
>  





[jira] [Comment Edited] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2023-08-13 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746244#comment-17746244
 ] 

Florin Akermann edited comment on KAFKA-12317 at 8/13/23 10:48 AM:
---

Hi, I will have a go at this. I hope that's ok.


was (Author: aki):
Hi, I will have a go at this. I hope that's ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.

> Relax non-null key requirement for left/outer KStream joins
> ---
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>  Labels: kip
>
> Currently, for a stream-stream and stream-table/globalTable join 
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key 
> for stream-globalTable), because for a `null`-(join)key the join is 
> undefined: i.e., we don't have an attribute to do the table lookup (we 
> consider the stream-record as malformed). Note that we define the semantics 
> of _left/outer_ join as: keep the stream record if no matching join record 
> was found.
> We could relax the definition of _left_ stream-table/globalTable and 
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream 
> records, and call the ValueJoiner with a `null` "other-side" value instead: 
> if the stream record key (or join-key) is `null`, we could treat it as 
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users that want to keep the current behavior can add 
> a `filter()` before the join to drop `null`-(join)key records from the stream 
> explicitly.
> Note that this change also requires changing the behavior if we insert a 
> repartition topic before the join: currently, we drop `null`-key records 
> before writing into the repartition topic (as we know they would be dropped 
> later anyway). We need to relax this behavior for a left stream-table and 
> left/outer stream-stream join. Users need to be aware (i.e., we might need to 
> put this into the docs and JavaDocs) that records with `null`-key would be 
> partitioned randomly.
> KIP-962: 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams]
>  





[jira] [Commented] (KAFKA-13197) KStream-GlobalKTable join semantics don't match documentation

2023-08-11 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17753349#comment-17753349
 ] 

Florin Akermann commented on KAFKA-13197:
-

[~twbecker], thanks for raising this. The documentation has been fixed.
KIP-962 will restore the old behavior: records are no longer dropped when the 
KeyValueMapper returns `null`; instead, the ValueJoiner is called with `null` 
as the right value.

> KStream-GlobalKTable join semantics don't match documentation
> -
>
> Key: KAFKA-13197
> URL: https://issues.apache.org/jira/browse/KAFKA-13197
> Project: Kafka
>  Issue Type: Bug
>  Components: documentation, streams
>Affects Versions: 2.7.0
>Reporter: Tommy Becker
>Assignee: Florin Akermann
>Priority: Major
> Fix For: 3.6.0, 3.5.2
>
>
> As part of KAFKA-10277, the behavior of KStream-GlobalKTable joins was 
> changed. It appears the change was intended to merely relax a requirement but 
> it actually broke backwards compatibility. Although it does allow {{null}} 
> keys and values in the KStream to be joined, it now excludes {{null}} results 
> of the {{KeyValueMapper}}. We have an application which can return {{null}} 
> from the {{KeyValueMapper}} for non-null keys in the KStream, and relies on 
> these nulls being passed to the {{ValueJoiner}}. Indeed the javadoc still 
> explicitly says this is done:
> {quote}If a KStream input record key or value is null the record will not be 
> included in the join operation and thus no output record will be added to the 
> resulting KStream.
>  If keyValueMapper returns null implying no match exists, a null value will 
> be provided to ValueJoiner.
> {quote}
> Both these statements are incorrect.
> I think the new behavior is worse than the previous/documented behavior. It 
> feels more reasonable to have a non-null stream record map to a null join key 
> (our use-case is event-enhancement where the incoming record doesn't have the 
> join field), than the reverse.





[jira] [Commented] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join

2023-08-10 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752738#comment-17752738
 ] 

Florin Akermann commented on KAFKA-14049:
-

Same for me. I simply assigned it to myself as it was linked to 
https://issues.apache.org/jira/browse/KAFKA-12317.
[~SAUMYAG] could you elaborate? 
Otherwise, I suggest closing this issue once 
https://issues.apache.org/jira/browse/KAFKA-12317 is resolved.

> Relax Non Null Requirement for KStreamGlobalKTable Left Join
> 
>
> Key: KAFKA-14049
> URL: https://issues.apache.org/jira/browse/KAFKA-14049
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Saumya Gupta
>Assignee: Florin Akermann
>Priority: Major
>
> Null values in the stream for a left join would indicate a tombstone message 
> that needs to be propagated if not actually joined with the GlobalKTable 
> message, hence these messages should not be ignored.





[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2023-08-09 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17752407#comment-17752407
 ] 

Florin Akermann commented on KAFKA-12317:
-

Drafted a PR: [https://github.com/apache/kafka/pull/14174]

> Relax non-null key requirement for left/outer KStream joins
> ---
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Currently, for a stream-stream and stream-table/globalTable join 
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key 
> for stream-globalTable), because for a `null`-(join)key the join is 
> undefined: i.e., we don't have an attribute to do the table lookup (we 
> consider the stream-record as malformed). Note that we define the semantics 
> of _left/outer_ join as: keep the stream record if no matching join record 
> was found.
> We could relax the definition of _left_ stream-table/globalTable and 
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream 
> records, and call the ValueJoiner with a `null` "other-side" value instead: 
> if the stream record key (or join-key) is `null`, we could treat it as 
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users that want to keep the current behavior can add 
> a `filter()` before the join to drop `null`-(join)key records from the stream 
> explicitly.
> Note that this change also requires changing the behavior if we insert a 
> repartition topic before the join: currently, we drop `null`-key records 
> before writing into the repartition topic (as we know they would be dropped 
> later anyway). We need to relax this behavior for a left stream-table and 
> left/outer stream-stream join. Users need to be aware (i.e., we might need to 
> put this into the docs and JavaDocs) that records with `null`-key would be 
> partitioned randomly.





[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2023-07-31 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17749322#comment-17749322
 ] 

Florin Akermann commented on KAFKA-12317:
-

[~guozhang] [~mjsax] 
KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams

> Relax non-null key requirement for left/outer KStream joins
> ---
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Currently, for a stream-stream and stream-table/globalTable join 
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key 
> for stream-globalTable), because for a `null`-(join)key the join is 
> undefined: i.e., we don't have an attribute to do the table lookup (we 
> consider the stream-record as malformed). Note that we define the semantics 
> of _left/outer_ join as: keep the stream record if no matching join record 
> was found.
> We could relax the definition of _left_ stream-table/globalTable and 
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream 
> records, and call the ValueJoiner with a `null` "other-side" value instead: 
> if the stream record key (or join-key) is `null`, we could treat it as 
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users that want to keep the current behavior can add 
> a `filter()` before the join to drop `null`-(join)key records from the stream 
> explicitly.
> Note that this change also requires changing the behavior if we insert a 
> repartition topic before the join: currently, we drop `null`-key records 
> before writing into the repartition topic (as we know they would be dropped 
> later anyway). We need to relax this behavior for a left stream-table and 
> left/outer stream-stream join. Users need to be aware (i.e., we might need to 
> put this into the docs and JavaDocs) that records with `null`-key would be 
> partitioned randomly.





[jira] [Commented] (KAFKA-14748) Relax non-null FK left-join requirement

2023-07-31 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17749325#comment-17749325
 ] 

Florin Akermann commented on KAFKA-14748:
-

Thank you [~guozhang].
Here is the KIP: 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams]

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement

2023-07-29 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747670#comment-17747670
 ] 

Florin Akermann edited comment on KAFKA-14748 at 7/29/23 12:55 PM:
---

I have created a PR for this particular issue: foreignKeyExtractor only.
[https://github.com/apache/kafka/pull/14107]

I am fine with holding off on merging until I have addressed the 3 related 
issues as well.
Nonetheless, it would be great to have some feedback, just so I know whether I 
have understood the scope of this particular item (KAFKA-14748) and what I 
should keep in mind for the other incoming merge requests.


was (Author: aki):
I have created a PR for this particular issue.
[https://github.com/apache/kafka/pull/14107]

I am fine with holding off on merging until I have addressed the 3 related 
issues as well.
Nonetheless, it would be great to have some feedback, just so I know whether I 
have understood the scope of this particular item (KAFKA-14748) and what I 
should keep in mind for the other incoming merge requests.

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement

2023-07-26 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747670#comment-17747670
 ] 

Florin Akermann edited comment on KAFKA-14748 at 7/26/23 9:54 PM:
--

I have created a PR for this particular issue.
[https://github.com/apache/kafka/pull/14107]

I am fine with holding off on merging until I have addressed the 3 related 
issues as well.
Nonetheless, it would be great to have some feedback, just so I know whether I 
have understood the scope of this particular item (KAFKA-14748) and what I 
should keep in mind for the other incoming merge requests.


was (Author: aki):
I have created a PR for this particular issue.
[https://github.com/apache/kafka/pull/14107]

I am fine with holding off on merging until I have addressed the 4 related 
issues as well.
Nonetheless, it would be great to have some feedback, just so I know whether I 
have understood the scope of this particular item (KAFKA-14748) and what I 
should keep in mind for the other incoming merge requests.

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2023-07-26 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747671#comment-17747671
 ] 

Florin Akermann commented on KAFKA-12317:
-

Thanks [~mjsax],

I have assigned myself to the rest of them (KAFKA-13197, KAFKA-14748, 
KAFKA-14049).
KAFKA-12845 is in status resolved so I assume this one is no longer relevant.

> Relax non-null key requirement for left/outer KStream joins
> ---
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Currently, for a stream-stream and stream-table/globalTable join 
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key 
> for stream-globalTable), because for a `null`-(join)key the join is 
> undefined: i.e., we don't have an attribute to do the table lookup (we 
> consider the stream-record as malformed). Note that we define the semantics 
> of _left/outer_ join as: keep the stream record if no matching join record 
> was found.
> We could relax the definition of _left_ stream-table/globalTable and 
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream 
> records, and call the ValueJoiner with a `null` "other-side" value instead: 
> if the stream record key (or join-key) is `null`, we could treat it as 
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users that want to keep the current behavior can add 
> a `filter()` before the join to drop `null`-(join)key records from the stream 
> explicitly.
> Note that this change also requires changing the behavior if we insert a 
> repartition topic before the join: currently, we drop `null`-key records 
> before writing into the repartition topic (as we know they would be dropped 
> later anyway). We need to relax this behavior for a left stream-table and 
> left/outer stream-stream join. Users need to be aware (i.e., we might need to 
> put this into the docs and JavaDocs) that records with `null`-key would be 
> partitioned randomly.





[jira] [Commented] (KAFKA-14748) Relax non-null FK left-join requirement

2023-07-26 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747670#comment-17747670
 ] 

Florin Akermann commented on KAFKA-14748:
-

I have created a PR for this particular issue.
[https://github.com/apache/kafka/pull/14107]

I am fine with holding off on merging until I have addressed the 4 related 
issues as well.
Nonetheless, it would be great to have some feedback, just so I know whether I 
have understood the scope of this particular item (KAFKA-14748) and what I 
should keep in mind for the other incoming merge requests.

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.
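
As a rough illustration of the difference between the strict policy and the proposed relaxation, here is a sketch in plain Java. It models both tables as in-memory maps and is not the Kafka Streams foreign-key join implementation; `fkLeftJoin` and the `acceptNullForeignKey` flag are hypothetical names used only for this example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

final class FkLeftJoinSketch {

    // Left join of a "left" table against a "right" table via an extracted
    // foreign key. With acceptNullForeignKey == false a null foreign key is
    // treated as invalid and the left record is skipped; with true the left
    // record is emitted with a null right-hand side.
    static Map<String, String> fkLeftJoin(Map<String, String> left,
                                          Map<String, String> right,
                                          Function<String, String> foreignKeyExtractor,
                                          boolean acceptNullForeignKey) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : left.entrySet()) {
            String fk = foreignKeyExtractor.apply(entry.getValue());
            if (fk == null && !acceptNullForeignKey) {
                continue; // strict policy: drop the record
            }
            String rightValue = (fk == null) ? null : right.get(fk);
            result.put(entry.getKey(), entry.getValue() + " -> " + rightValue);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> orders = new LinkedHashMap<>();
        orders.put("order1", "customer=c1");
        orders.put("order2", "no-customer");
        Map<String, String> customers = Collections.singletonMap("c1", "Alice");

        // Returns null when the value carries no customer reference.
        Function<String, String> extractor =
                v -> v.startsWith("customer=") ? v.substring("customer=".length()) : null;

        System.out.println(fkLeftJoin(orders, customers, extractor, false));
        System.out.println(fkLeftJoin(orders, customers, extractor, true));
    }
}
```

With the flag off, only `order1` appears in the result; with it on, `order2` is kept and paired with a `null` right-hand value.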





[jira] [Commented] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join

2023-07-25 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17747259#comment-17747259
 ] 

Florin Akermann commented on KAFKA-14049:
-

Hi, [~pnee] 

I am working on https://issues.apache.org/jira/browse/KAFKA-12317 and 
https://issues.apache.org/jira/browse/KAFKA-14748. Do you mind if I take over 
this item as well?

I assume it is ok as there hasn't been any activity since April.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.

> Relax Non Null Requirement for KStreamGlobalKTable Left Join
> 
>
> Key: KAFKA-14049
> URL: https://issues.apache.org/jira/browse/KAFKA-14049
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Saumya Gupta
>Assignee: Philip Nee
>Priority: Major
>  Labels: beginner, newbie
>
> Null Values in the Stream for a Left Join would indicate a Tombstone Message 
> that needs to be propagated if not actually joined with the GlobalKTable 
> message, hence these messages should not be ignored.





[jira] [Assigned] (KAFKA-14049) Relax Non Null Requirement for KStreamGlobalKTable Left Join

2023-07-25 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann reassigned KAFKA-14049:
---

Assignee: Florin Akermann  (was: Philip Nee)

> Relax Non Null Requirement for KStreamGlobalKTable Left Join
> 
>
> Key: KAFKA-14049
> URL: https://issues.apache.org/jira/browse/KAFKA-14049
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Saumya Gupta
>Assignee: Florin Akermann
>Priority: Major
>  Labels: beginner, newbie
>
> Null Values in the Stream for a Left Join would indicate a Tombstone Message 
> that needs to be propagated if not actually joined with the GlobalKTable 
> message, hence these messages should not be ignored.





[jira] [Assigned] (KAFKA-13197) KStream-GlobalKTable join semantics don't match documentation

2023-07-25 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann reassigned KAFKA-13197:
---

Assignee: Florin Akermann

> KStream-GlobalKTable join semantics don't match documentation
> -
>
> Key: KAFKA-13197
> URL: https://issues.apache.org/jira/browse/KAFKA-13197
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Tommy Becker
>Assignee: Florin Akermann
>Priority: Major
>
> As part of KAFKA-10277, the behavior of KStream-GlobalKTable joins was 
> changed. It appears the change was intended to merely relax a requirement but 
> it actually broke backwards compatibility. Although it does allow {{null}} 
> keys and values in the KStream to be joined, it now excludes {{null}} results 
> of the {{KeyValueMapper}}. We have an application which can return {{null}} 
> from the {{KeyValueMapper}} for non-null keys in the KStream, and relies on 
> these nulls being passed to the {{ValueJoiner}}. Indeed the javadoc still 
> explicitly says this is done:
> {quote}If a KStream input record key or value is null the record will not be 
> included in the join operation and thus no output record will be added to the 
> resulting KStream.
>  If keyValueMapper returns null implying no match exists, a null value will 
> be provided to ValueJoiner.
> {quote}
> Both these statements are incorrect.
> I think the new behavior is worse than the previous/documented behavior. It 
> feels more reasonable to have a non-null stream record map to a null join key 
> (our use-case is event-enhancement where the incoming record doesn't have the 
> join field), than the reverse.





[jira] [Comment Edited] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2023-07-24 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746244#comment-17746244
 ] 

Florin Akermann edited comment on KAFKA-12317 at 7/24/23 7:38 AM:
--

Hi, I will have a go at this. I hope that's ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.


was (Author: aki):
Hi, I will have a go a this if ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.

> Relax non-null key requirement for left/outer KStream joins
> ---
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Currently, for a stream-stream and stream-table/globalTable join 
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key for 
> stream-globalTable), because for a `null`-(join)key the join is undefined: 
> i.e., we don't have an attribute to do the table lookup (we consider the 
> stream-record as malformed). Note that we define the semantics of 
> _left/outer_ join as: keep the stream record if no matching join record was 
> found.
> We could relax the definition of _left_ stream-table/globalTable and 
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream 
> records, and call the ValueJoiner with a `null` "other-side" value instead: 
> if the stream record key (or join-key) is `null`, we could treat it as a 
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users who want to keep the current behavior can add 
> a `filter()` before the join to drop `null`-(join)key records from the stream 
> explicitly.
> Note that this change also requires changing the behavior if we insert a 
> repartition topic before the join: currently, we drop `null`-key records 
> before writing into the repartition topic (as we know they would be dropped 
> later anyway). We need to relax this behavior for a left stream-table and 
> left/outer stream-stream join. Users need to be aware (i.e., we might need to 
> put this into the docs and JavaDocs) that records with a `null`-key would be 
> partitioned randomly.





[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement

2023-07-24 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746242#comment-17746242
 ] 

Florin Akermann edited comment on KAFKA-14748 at 7/24/23 7:37 AM:
--

Hi, I will have a go at this. I hope that's ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.


was (Author: aki):
Hi, I will have a go at this if ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.





[jira] [Comment Edited] (KAFKA-14748) Relax non-null FK left-join requirement

2023-07-24 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746242#comment-17746242
 ] 

Florin Akermann edited comment on KAFKA-14748 at 7/24/23 7:37 AM:
--

Hi, I will have a go at this if ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.


was (Author: aki):
Hi, I will have a go a this if ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.





[jira] [Commented] (KAFKA-14748) Relax non-null FK left-join requirement

2023-07-24 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746242#comment-17746242
 ] 

Florin Akermann commented on KAFKA-14748:
-

Hi, I will have a go a this if ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.





[jira] [Assigned] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2023-07-24 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann reassigned KAFKA-12317:
---

Assignee: Florin Akermann

> Relax non-null key requirement for left/outer KStream joins
> ---
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Currently, for a stream-stream and stream-table/globalTable join 
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key for 
> stream-globalTable), because for a `null`-(join)key the join is undefined: 
> i.e., we don't have an attribute to do the table lookup (we consider the 
> stream-record as malformed). Note that we define the semantics of 
> _left/outer_ join as: keep the stream record if no matching join record was 
> found.
> We could relax the definition of _left_ stream-table/globalTable and 
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream 
> records, and call the ValueJoiner with a `null` "other-side" value instead: 
> if the stream record key (or join-key) is `null`, we could treat it as a 
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users who want to keep the current behavior can add 
> a `filter()` before the join to drop `null`-(join)key records from the stream 
> explicitly.
> Note that this change also requires changing the behavior if we insert a 
> repartition topic before the join: currently, we drop `null`-key records 
> before writing into the repartition topic (as we know they would be dropped 
> later anyway). We need to relax this behavior for a left stream-table and 
> left/outer stream-stream join. Users need to be aware (i.e., we might need to 
> put this into the docs and JavaDocs) that records with a `null`-key would be 
> partitioned randomly.





[jira] [Assigned] (KAFKA-14748) Relax non-null FK left-join requirement

2023-07-24 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann reassigned KAFKA-14748:
---

Assignee: Florin Akermann

> Relax non-null FK left-join requirement
> ---
>
> Key: KAFKA-14748
> URL: https://issues.apache.org/jira/browse/KAFKA-14748
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>
> Kafka Streams enforces a strict non-null-key policy in the DSL across all 
> key-dependent operations (like aggregations and joins).
> This also applies to FK-joins, in particular to the ForeignKeyExtractor. If 
> it returns `null`, it's treated as invalid. For left-joins, it might make 
> sense to still accept a `null`, and add the left-hand record with an empty 
> right-hand-side to the result.





[jira] [Commented] (KAFKA-12317) Relax non-null key requirement for left/outer KStream joins

2023-07-24 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-12317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17746244#comment-17746244
 ] 

Florin Akermann commented on KAFKA-12317:
-

Hi, I will have a go a this if ok.
If I haven't drafted any KIP or PR within three weeks then I am ok to be 
removed again.

> Relax non-null key requirement for left/outer KStream joins
> ---
>
> Key: KAFKA-12317
> URL: https://issues.apache.org/jira/browse/KAFKA-12317
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>
> Currently, for a stream-stream and stream-table/globalTable join 
> KafkaStreams drops all stream records with a `null`-key (`null`-join-key for 
> stream-globalTable), because for a `null`-(join)key the join is undefined: 
> i.e., we don't have an attribute to do the table lookup (we consider the 
> stream-record as malformed). Note that we define the semantics of 
> _left/outer_ join as: keep the stream record if no matching join record was 
> found.
> We could relax the definition of _left_ stream-table/globalTable and 
> _left/outer_ stream-stream join though, and not drop `null`-(join)key stream 
> records, and call the ValueJoiner with a `null` "other-side" value instead: 
> if the stream record key (or join-key) is `null`, we could treat it as a 
> "failed lookup" instead of treating the stream record as corrupted.
> If we make this change, users who want to keep the current behavior can add 
> a `filter()` before the join to drop `null`-(join)key records from the stream 
> explicitly.
> Note that this change also requires changing the behavior if we insert a 
> repartition topic before the join: currently, we drop `null`-key records 
> before writing into the repartition topic (as we know they would be dropped 
> later anyway). We need to relax this behavior for a left stream-table and 
> left/outer stream-stream join. Users need to be aware (i.e., we might need to 
> put this into the docs and JavaDocs) that records with a `null`-key would be 
> partitioned randomly.





[jira] [Commented] (KAFKA-13602) Allow to broadcast a result record

2022-06-25 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558760#comment-17558760
 ] 

Florin Akermann commented on KAFKA-13602:
-

Ok cool, thanks for the quick reply.

> Allow to broadcast a result record
> --
>
> Key: KAFKA-13602
> URL: https://issues.apache.org/jira/browse/KAFKA-13602
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Sagar Rao
>Priority: Major
>  Labels: needs-kip, newbie++
>
> From time to time, users ask how they can send a record to more than one 
> partition in a sink topic. Currently, this is only possible by replicating the 
> message N times before the sink and using a custom partitioner to write the N 
> messages into the N different partitions.
> It might be worth making this easier and adding a new feature for it. There are 
> multiple options:
>  * extend `to()` / `addSink()` with a "broadcast" option/config
>  * add `toAllPartitions()` / `addBroadcastSink()` methods
>  * allow `StreamPartitioner` to return `-1` for "all partitions"
>  * extend `StreamPartitioner` to allow returning more than one partition (i.e., 
> a list/collection of integers instead of a single int)
> The first three options imply that only a "full broadcast" is supported, so 
> they are less flexible. On the other hand, they are easier to use (especially 
> the first two options, as they do not require implementing a custom 
> partitioner).
> The last option would be the most flexible and would also allow for a "partial 
> broadcast" (aka multi-cast) pattern. It might also be possible to combine two 
> options, or maybe even pursue a totally different idea.
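
To make the last option more concrete, here is a small sketch in plain Java of a partitioner that returns a set of partitions instead of a single int. The `MultiCastPartitioner` interface below is a hypothetical illustration and is not the Kafka Streams `StreamPartitioner`; the two factory methods show a full broadcast and a partial broadcast (multi-cast) built on the same contract.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class BroadcastPartitionerSketch {

    // Hypothetical interface for illustration only: returns every partition
    // the record should be written to.
    interface MultiCastPartitioner<K, V> {
        Set<Integer> partitions(String topic, K key, V value, int numPartitions);
    }

    // Full broadcast: the record goes to all partitions of the sink topic.
    static <K, V> MultiCastPartitioner<K, V> fullBroadcast() {
        return (topic, key, value, numPartitions) -> {
            Set<Integer> all = new LinkedHashSet<>();
            for (int p = 0; p < numPartitions; p++) {
                all.add(p);
            }
            return all;
        };
    }

    // Partial broadcast (multi-cast): here, only the even-numbered partitions.
    static <K, V> MultiCastPartitioner<K, V> evenPartitions() {
        return (topic, key, value, numPartitions) -> {
            Set<Integer> even = new LinkedHashSet<>();
            for (int p = 0; p < numPartitions; p += 2) {
                even.add(p);
            }
            return even;
        };
    }

    public static void main(String[] args) {
        MultiCastPartitioner<String, String> all = fullBroadcast();
        MultiCastPartitioner<String, String> even = evenPartitions();
        System.out.println(all.partitions("sink", "k", "v", 4));
        System.out.println(even.partitions("sink", "k", "v", 4));
    }
}
```

A set-valued return type covers both cases with one contract, whereas a `-1` sentinel can only express "all partitions".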



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (KAFKA-13602) Allow to broadcast a result record

2022-06-25 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558751#comment-17558751
 ] 

Florin Akermann commented on KAFKA-13602:
-

Hi [~sagarrao] , [~mjsax] 

May I have a go at this?

> Allow to broadcast a result record
> --
>
> Key: KAFKA-13602
> URL: https://issues.apache.org/jira/browse/KAFKA-13602
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Sagar Rao
>Priority: Major
>  Labels: needs-kip, newbie++
>
> From time to time, users ask how they can send a record to more than one 
> partition in a sink topic. Currently, this is only possible by replicating the 
> message N times before the sink and using a custom partitioner to write the N 
> messages into the N different partitions.
> It might be worth making this easier and adding a new feature for it. There are 
> multiple options:
>  * extend `to()` / `addSink()` with a "broadcast" option/config
>  * add `toAllPartitions()` / `addBroadcastSink()` methods
>  * allow `StreamPartitioner` to return `-1` for "all partitions"
>  * extend `StreamPartitioner` to allow returning more than one partition (i.e., 
> a list/collection of integers instead of a single int)
> The first three options imply that only a "full broadcast" is supported, so 
> they are less flexible. On the other hand, they are easier to use (especially 
> the first two options, as they do not require implementing a custom 
> partitioner).
> The last option would be the most flexible and would also allow for a "partial 
> broadcast" (aka multi-cast) pattern. It might also be possible to combine two 
> options, or maybe even pursue a totally different idea.





[jira] [Assigned] (KAFKA-13602) Allow to broadcast a result record

2022-01-22 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann reassigned KAFKA-13602:
---

Assignee: (was: Florin Akermann)

> Allow to broadcast a result record
> --
>
> Key: KAFKA-13602
> URL: https://issues.apache.org/jira/browse/KAFKA-13602
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Matthias J. Sax
>Priority: Major
>  Labels: needs-kip, newbie++
>
> From time to time, users ask how they can send a record to more than one 
> partition in a sink topic. Currently, this is only possible by replicating the 
> message N times before the sink and using a custom partitioner to write the N 
> messages into the N different partitions.
> It might be worth making this easier and adding a new feature for it. There are 
> multiple options:
>  * extend `to()` / `addSink()` with a "broadcast" option/config
>  * add `toAllPartitions()` / `addBroadcastSink()` methods
>  * allow `StreamPartitioner` to return `-1` for "all partitions"
>  * extend `StreamPartitioner` to allow returning more than one partition (i.e., 
> a list/collection of integers instead of a single int)
> The first three options imply that only a "full broadcast" is supported, so 
> they are less flexible. On the other hand, they are easier to use (especially 
> the first two options, as they do not require implementing a custom 
> partitioner).
> The last option would be the most flexible and would also allow for a "partial 
> broadcast" (aka multi-cast) pattern. It might also be possible to combine two 
> options, or maybe even pursue a totally different idea.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (KAFKA-13602) Allow to broadcast a result record

2022-01-22 Thread Florin Akermann (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florin Akermann reassigned KAFKA-13602:
---

Assignee: Florin Akermann

> Allow to broadcast a result record
> --
>
> Key: KAFKA-13602
> URL: https://issues.apache.org/jira/browse/KAFKA-13602
> Project: Kafka
>  Issue Type: New Feature
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Florin Akermann
>Priority: Major
>  Labels: needs-kip, newbie++
>
> From time to time, users ask how they can send a record to more than one 
> partition in a sink topic. Currently, this is only possible by replicating the 
> message N times before the sink and using a custom partitioner to write the N 
> messages into the N different partitions.
> It might be worth making this easier and adding a new feature for it. There are 
> multiple options:
>  * extend `to()` / `addSink()` with a "broadcast" option/config
>  * add `toAllPartitions()` / `addBroadcastSink()` methods
>  * allow `StreamPartitioner` to return `-1` for "all partitions"
>  * extend `StreamPartitioner` to allow returning more than one partition (i.e., 
> a list/collection of integers instead of a single int)
> The first three options imply that only a "full broadcast" is supported, so 
> they are less flexible. On the other hand, they are easier to use (especially 
> the first two options, as they do not require implementing a custom 
> partitioner).
> The last option would be the most flexible and would also allow for a "partial 
> broadcast" (aka multi-cast) pattern. It might also be possible to combine two 
> options, or maybe even pursue a totally different idea.





[jira] [Commented] (KAFKA-13351) Add possibility to write kafka headers in Kafka Console Producer

2021-11-12 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442917#comment-17442917
 ] 

Florin Akermann commented on KAFKA-13351:
-

[~habdank] I created a KIP and started a discussion thread: 
https://lists.apache.org/thread/pw0oqbk855vj0d63gfg3q3h3p2p1zopw

> Add possibility to write kafka headers in Kafka Console Producer
> 
>
> Key: KAFKA-13351
> URL: https://issues.apache.org/jira/browse/KAFKA-13351
> Project: Kafka
>  Issue Type: Wish
>  Components: tools
>Affects Versions: 2.8.1
>Reporter: Seweryn Habdank-Wojewodzki
>Assignee: Florin Akermann
>Priority: Major
>
> Dears,
> Currently there is an asymmetry between Kafka Console Consumer and Producer.
> Kafka Consumer can display headers (KAFKA-6733), but Kafka Producer cannot 
> produce them.
> It would be good to unify this and add the possibility for Kafka Console 
> Producer to produce them.
> A similar ticket is here: KAFKA-6574, but it is very old and does not 
> represent the current state of the software.
> Please consider this.





[jira] [Commented] (KAFKA-13351) Add possibility to write kafka headers in Kafka Console Producer

2021-10-31 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436525#comment-17436525
 ] 

Florin Akermann commented on KAFKA-13351:
-

Hi, I made a PR for it. However, I cannot set the Jira status to "Patch 
Available". I guess it is because I am not assigned to the Jira yet.

[~mjsax] I have seen you comment on many other Jiras. I guess you are a 
committer. Could you assign me to the Jira? Which committers should I tag on 
this PR?

> Add possibility to write kafka headers in Kafka Console Producer
> 
>
> Key: KAFKA-13351
> URL: https://issues.apache.org/jira/browse/KAFKA-13351
> Project: Kafka
>  Issue Type: Wish
>Affects Versions: 2.8.1
>Reporter: Seweryn Habdank-Wojewodzki
>Priority: Major
>
> Dears,
> Currently there is an asymmetry between Kafka Console Consumer and Producer.
> Kafka Consumer can display headers (KAFKA-6733), but Kafka Producer cannot 
> produce them.
> It would be good to unify this and add the possibility for Kafka Console 
> Producer to produce them.
> A similar ticket is here: KAFKA-6574, but it is very old and does not 
> represent the current state of the software.
> Please consider this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KAFKA-13351) Add possibility to write kafka headers in Kafka Console Producer

2021-10-29 Thread Florin Akermann (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-13351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436205#comment-17436205
 ] 

Florin Akermann commented on KAFKA-13351:
-

Hi, I would like to pick this up.

> Add possibility to write kafka headers in Kafka Console Producer
> 
>
> Key: KAFKA-13351
> URL: https://issues.apache.org/jira/browse/KAFKA-13351
> Project: Kafka
>  Issue Type: Wish
>Affects Versions: 2.8.1
>Reporter: Seweryn Habdank-Wojewodzki
>Priority: Major
>
> Dears,
> Currently there is an asymmetry between Kafka Console Consumer and Producer.
> Kafka Consumer can display headers (KAFKA-6733), but Kafka Producer cannot 
> produce them.
> It would be good to unify this and add the possibility for Kafka Console 
> Producer to produce them.
> A similar ticket is here: KAFKA-6574, but it is very old and does not 
> represent the current state of the software.
> Please consider this.


