Thanks for the KIP.

left join KStream-GlobalTable: no longer drop left records with null-key and 
call KeyValueMapper with 'null' for left  key. The case where KeyValueMapper 
returns null is already handled in the current implementation.

Not sure if this is the right phrasing? In the end, even now, the stream input record key can be null (cf https://issues.apache.org/jira/browse/KAFKA-10277) -- a stream record is only dropped if the `KeyValueMapper` returns `null` (note that the key-extractor has no default implemenation but is a required argument) -- this KIP would relax this case for left-join.


In the pull request all relevant Javadocs will be updated with the information 
on how to keep the old behavior for a given operator / method.

I would remove this from the KIP -- I am also not sure if we should put it into the JavaDoc? -- I agree that it should go into the upgrade docs as well as "join section" in the docs: https://kafka.apache.org/35/documentation/streams/developer-guide/dsl-api.html#joining

We also have https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Streams+Join+Semantics that we should update.


I added a remark about the repartition of null-key records.

I think the way it's phrase is good. In the end, it's an optimization to drop records upstream (instead of piping them through the topic and drop the downstream), and thus we don't have to cover it in the KIP in detail. In general, for aggregations we can still apply the optimization, however, we need to be careful as we could also have two downstream operators with a shared repartition topic: for this case, we can only drop upstream if all downstream operator would drop null-key records anyway. We can cover details on the PR.



-Matthias



On 8/9/23 5:39 AM, Florin Akermann wrote:
Hi All,

I added a remark about the repartition of null-key records.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams#KIP962:RelaxnonnullkeyrequirementinKafkaStreams-Repartitionofnull-keyrecords

Can we relax this constraint for any kind of repartitioning or should it
only be relaxed in the context of left stream-table and left/outer
stream-stream joins?

Florin

On Mon, 7 Aug 2023 at 13:23, Florin Akermann <florin.akerm...@gmail.com>
wrote:

Hi Lucas,

Thanks. I added the point about the upgrade guide as well.

Florin

On Mon, 7 Aug 2023 at 11:06, Lucas Brutschy <lbruts...@confluent.io.invalid>
wrote:

Hi Florin,

thanks for the KIP! This looks good to me. I agree that the precise
Java doc wording doesn't have to be discussed as part of the KIP.

I would also suggest to include an update to
https://kafka.apache.org/documentation/streams/upgrade-guide

Cheers,
Lucas

On Mon, Aug 7, 2023 at 10:51 AM Florin Akermann
<florin.akerm...@gmail.com> wrote:

Hi Both,

Thanks.
I added remarks to account for this.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams#KIP962:RelaxnonnullkeyrequirementinKafkaStreams-Remarks

In short, let's add a note in the Java docs? The exact wording of the
note
can be scrutinized in the pull request?

What do you think?


On Sun, 6 Aug 2023 at 19:41, Guozhang Wang <guozhang.wang...@gmail.com>
wrote:

I'm just thinking we can try to encourage users to migrate from XX to
XXWithKey in the docs, giving this as one good example that the latter
can help you distinguish different scenarios whereas the former
cannot.

On Fri, Aug 4, 2023 at 6:32 PM Matthias J. Sax <mj...@apache.org>
wrote:

Guozhang,

thanks for pointing out ValueJoinerWithKey. In the end, it's just a
documentation change, ie, point out that the passed in key could be
`null` and similar?

-Matthias


On 8/2/23 3:20 PM, Guozhang Wang wrote:
Thanks Florin for the writeup,

One quick thing I'd like to bring up is that in KIP-149
(

https://cwiki.apache.org/confluence/display/KAFKA/KIP-149%3A+Enabling+key+access+in+ValueTransformer%2C+ValueMapper%2C+and+ValueJoiner
)
we introduced ValueJoinerWithKey which is aimed to enhance
ValueJoiner. It would have a benefit for this KIP such that
implementers can distinguish "null-key" v.s. "not-null-key but
null-value" scenarios.

Hence I'd suggest we also include the semantic changes with
ValueJoinerWithKey, which can help distinguish these two
scenarios,
and also document that if users apply ValueJoiner only, they may
not
have this benefit, and hence we suggest users to use the former.


Guozhang

On Mon, Jul 31, 2023 at 12:11 PM Florin Akermann
<florin.akerm...@gmail.com> wrote:



https://cwiki.apache.org/confluence/display/KAFKA/KIP-962%3A+Relax+non-null+key+requirement+in+Kafka+Streams




Reply via email to