[jira] [Commented] (KAFKA-13197) KStream-GlobalKTable join semantics don't match documentation
[ https://issues.apache.org/jira/browse/KAFKA-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17753349#comment-17753349 ]

Florin Akermann commented on KAFKA-13197:
-----------------------------------------

[~twbecker], thanks for raising this. The documentation has been fixed.

KIP-962 will allow for the old behavior again: "no longer drop records when KeyValueMapper returns 'null' and call ValueJoiner with 'null' for right value".

> KStream-GlobalKTable join semantics don't match documentation
> -------------------------------------------------------------
>
>                 Key: KAFKA-13197
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13197
>             Project: Kafka
>          Issue Type: Bug
>          Components: documentation, streams
>    Affects Versions: 2.7.0
>            Reporter: Tommy Becker
>            Assignee: Florin Akermann
>            Priority: Major
>             Fix For: 3.6.0, 3.5.2
>
> As part of KAFKA-10277, the behavior of KStream-GlobalKTable joins was
> changed. It appears the change was intended to merely relax a requirement but
> it actually broke backwards compatibility. Although it does allow {{null}}
> keys and values in the KStream to be joined, it now excludes {{null}} results
> of the {{KeyValueMapper}}. We have an application which can return {{null}}
> from the {{KeyValueMapper}} for non-null keys in the KStream, and relies on
> these nulls being passed to the {{ValueJoiner}}. Indeed the javadoc still
> explicitly says this is done:
> {quote}If a KStream input record key or value is null the record will not be
> included in the join operation and thus no output record will be added to the
> resulting KStream.
> If keyValueMapper returns null implying no match exists, a null value will
> be provided to ValueJoiner.
> {quote}
> Both these statements are incorrect.
> I think the new behavior is worse than the previous/documented behavior. It
> feels more reasonable to have a non-null stream record map to a null join key
> (our use-case is event-enhancement where the incoming record doesn't have the
> join field), than the reverse.
-- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (KAFKA-13197) KStream-GlobalKTable join semantics don't match documentation
[ https://issues.apache.org/jira/browse/KAFKA-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402211#comment-17402211 ]

Tommy Becker commented on KAFKA-13197:
--------------------------------------

Hey [~guozhang], thanks for the response. To be clear, I'm looking at {{KStream.leftJoin(GlobalKTable, KeyValueMapper, ValueJoiner)}} as shown [here|https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java#L2964] and similar signatures, which still have the (incorrect) verbiage I quoted.

It seems from the churn in this area that there is a need to allow both null stream key/value records and non-null stream-side records that map to null join keys, though my use-case and this issue are specifically about the latter, which used to work prior to 2.7.
[jira] [Commented] (KAFKA-13197) KStream-GlobalKTable join semantics don't match documentation
[ https://issues.apache.org/jira/browse/KAFKA-13197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17400653#comment-17400653 ]

Guozhang Wang commented on KAFKA-13197:
---------------------------------------

Thanks for filing this [~twbecker]. The current doc says "If {@code keyValueMapper} returns {@code null} implying no match exists, no output record will be added to the resulting {@code KStream}."

But I read the tickets and I think you are right: this property is not very reasonable since users may want to know if a single stream record does not find any matching results as well. I think the reasonable behavior (and the java doc should be updated accordingly) in KAFKA-10277 should be

{code}
If the keyValueMapper returns null implying no matching key found, the ValueJoiner would still be triggered with (null, v, null).
{code}

Does that sound right to you?
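
For readers following the thread, the two behaviors under discussion can be contrasted with a small plain-Java model. This is only a sketch under stated assumptions: it does not use or reproduce Kafka Streams code, a `Map` stands in for the GlobalKTable, and the names `NullJoinKeyModel`, `Rec`, `leftJoin`, `demo`, and the `dropNullJoinKey` flag are all hypothetical. The "order:<customerId>" record format is likewise invented to mimic the event-enhancement use case where some records lack the join field.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

public class NullJoinKeyModel {

    /** Hypothetical stand-in for a stream record; not a Kafka class. */
    static final class Rec {
        final String key;
        final String value;

        Rec(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Models a left join of stream records against a lookup table.
     * dropNullJoinKey = true  models the behavior reported against 2.7.0
     *                         (record dropped when the mapper returns null).
     * dropNullJoinKey = false models the behavior the reporter relied on
     *                         (joiner still called, with null as the right value).
     */
    static List<String> leftJoin(
            List<Rec> stream,
            Map<String, String> table,
            BiFunction<String, String, String> keyMapper,
            BiFunction<String, String, String> joiner,
            boolean dropNullJoinKey) {
        List<String> out = new ArrayList<>();
        for (Rec r : stream) {
            String joinKey = keyMapper.apply(r.key, r.value);
            if (joinKey == null && dropNullJoinKey) {
                continue; // no output record for this input
            }
            // A null join key cannot match anything, so the right side is null.
            String right = (joinKey == null) ? null : table.get(joinKey);
            out.add(joiner.apply(r.value, right));
        }
        return out;
    }

    /** Runs the model on two sample records, one of which lacks the join field. */
    static List<String> demo(boolean dropNullJoinKey) {
        Map<String, String> customers = new HashMap<>();
        customers.put("c1", "Alice");

        List<Rec> orders = new ArrayList<>();
        orders.add(new Rec("k1", "order:c1")); // has a customer id
        orders.add(new Rec("k2", "order:"));   // join field missing

        // Mapper returns null when the record carries no customer id.
        BiFunction<String, String, String> mapper = (k, v) -> {
            String id = v.substring("order:".length());
            return id.isEmpty() ? null : id;
        };
        BiFunction<String, String, String> joiner =
                (left, right) -> left + "|" + (right == null ? "<no match>" : right);

        return leftJoin(orders, customers, mapper, joiner, dropNullJoinKey);
    }

    public static void main(String[] args) {
        System.out.println("null join key passed to joiner: " + demo(false));
        System.out.println("null join key dropped:          " + demo(true));
    }
}
```

In this model the second record survives only when `dropNullJoinKey` is false, which is the difference the reporter describes: an application that enriches events and expects to see unmatched records downstream silently loses them under the dropping behavior.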