[
https://issues.apache.org/jira/browse/KAFKA-16382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18034489#comment-18034489
]
Nikita Shupletsov commented on KAFKA-16382:
-------------------------------------------
I think I figured out what's happening:
if we send "A1:a" to NULL-IN and then immediately send "A1:" to NULL-IN, nothing
is emitted, because null joined with nothing (no stored value in this case) does
not produce null as the result; it produces no output at all.
So after we reset the application, it reads both records first and only then
processes them. Instead of:
Send to NULL-IN "A1:a"
process
Send to NULL-IN-AUX "A1:b"
process
Send to NULL-IN "A1:"
process
it will look something like:
Send to NULL-IN "A1:a"
Send to NULL-IN "A1:"
process
Send to NULL-IN-AUX "A1:b"
It's interesting that null is valid on the left side when we receive it, but if
we store that null and then receive something on the right side, it's no longer
valid (basically, the stored null and no value at all are treated the same).
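To make the two orderings concrete, here is a toy model in plain Java. It is not
Kafka Streams code: ToyLeftJoin, sendLeft, sendRight and process are made-up
names, and the rule that unprocessed records for the same key collapse to the
latest one is only my assumption, based on the note in the issue that the
problem goes away when the cache is disabled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NullJoinSketch {

    /** Toy left join of two keyed tables. Hypothetical model, NOT Kafka Streams code. */
    static final class ToyLeftJoin {
        private final Map<String, String> left = new HashMap<>();
        private final Map<String, String> right = new HashMap<>();
        // Records sent but not yet processed. A later record for the same key
        // replaces the earlier one (assumed cache-like deduplication).
        private final Map<String, String> pendingLeft = new LinkedHashMap<>();
        private final Map<String, String> pendingRight = new LinkedHashMap<>();
        final List<String> output = new ArrayList<>();

        void sendLeft(String key, String value)  { pendingLeft.put(key, value); }
        void sendRight(String key, String value) { pendingRight.put(key, value); }

        /** Applies all pending records, left side first, then clears the buffers. */
        void process() {
            pendingLeft.forEach(this::applyLeft);
            pendingLeft.clear();
            pendingRight.forEach(this::applyRight);
            pendingRight.clear();
        }

        private void applyLeft(String key, String value) {
            if (value == null) {
                // A tombstone is forwarded only if a stored value is actually deleted;
                // "null over nothing" produces no output.
                if (left.remove(key) != null) {
                    output.add(key + ":");
                }
                return;
            }
            left.put(key, value);
            // String concatenation with a missing right value yields e.g. "anull".
            output.add(key + ":" + value + right.get(key));
        }

        private void applyRight(String key, String value) {
            if (value == null) {
                right.remove(key);
            } else {
                right.put(key, value);
            }
            String leftValue = left.get(key);
            if (leftValue != null) {
                // Left join: the right side only produces output when a left value exists.
                output.add(key + ":" + leftValue + value);
            }
        }
    }

    /** First ordering: every record is processed before the next one is sent. */
    static List<String> stepByStep() {
        ToyLeftJoin join = new ToyLeftJoin();
        join.sendLeft("A1", "a");
        join.process();
        join.sendRight("A1", "b");
        join.process();
        join.sendLeft("A1", null);
        join.process();
        return join.output;
    }

    /** Second ordering: both left-side records are read before any processing. */
    static List<String> afterReset() {
        ToyLeftJoin join = new ToyLeftJoin();
        join.sendLeft("A1", "a");
        join.sendLeft("A1", null);
        join.process();
        join.sendRight("A1", "b");
        join.process();
        return join.output;
    }

    public static void main(String[] args) {
        System.out.println("step by step: " + stepByStep());
        System.out.println("after reset:  " + afterReset());
    }
}
```

In this model the first ordering ends with the "A1:" tombstone in the output,
while the second ordering emits nothing for A1, because the tombstone arrives
when no left value is stored.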
I don't know what the expected behavior should be here; maybe [~mjsax] can help.
> Kafka Streams drop NULL values after reset
> ------------------------------------------
>
> Key: KAFKA-16382
> URL: https://issues.apache.org/jira/browse/KAFKA-16382
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 3.6.1
> Reporter: Stanislav Spiridonov
> Priority: Major
> Attachments: 1.patch
>
>
> Kafka Streams (KTable) drops null values after full reset.
> See
> [https://github.com/foal/Null-Issue/blob/main/src/main/java/NullProblemExample.java]
> for sample topology
> Steps to reproduce (requires the NULL-IN, NULL-IN-AUX, and NULL-OUT topics)
> # Start example - 1st round
> # Send to NULL-IN "A1:a" -> NULL-OUT "A1:anull"
> # Send to NULL-IN-AUX "A1:b" -> NULL-OUT "A1:anull, A1:ab"
> # Stop application
> # Run kafka-streams-application-reset
> {code:java}
> call bin/windows/kafka-streams-application-reset --application-id nullproblem-example^
> --input-topics "NULL-IN,NULL-IN-AUX"^
> --bootstrap-server "localhost:9092"
> {code}
> # Send to NULL-IN-AUX "A1:" -> NULL-OUT "A1:anull, A1:ab" - it is Ok (no app
> running yet)
> # Start example - 2nd round
> # After initialization -> NULL-OUT *still contains* 2 messages "A1:anull,
> A1:ab"
> # Expected output *3 messages* "A1:anull, A1:ab, {*}A1:{*}"
> The issue is NOT reproduced if the application is just restarted (skipping step 5).
> The issue is NOT reproduced if the internal cache is disabled.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)