Anish Shrigondekar created SPARK-38787:
------------------------------------------

             Summary: Possible correctness issue on stream-stream join when 
handling edge case
                 Key: SPARK-38787
                 URL: https://issues.apache.org/jira/browse/SPARK-38787
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 3.2.1
            Reporter: Anish Shrigondekar


There was an NPE issue in stream-stream join. SPARK-35659 fixed it only 
“partially”: part of the fix ignores a null value found at the last index 
when swapping elements in the list, so the null value at the last index is 
effectively dropped. If the null is caused by numValues being out of sync 
with the actual number of elements, this effectively works as a correction.
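
For readers unfamiliar with the technique, the swap-with-last removal 
mentioned above can be sketched roughly as follows. This is a simplified 
illustration in Python on a plain list, not the actual Spark state store 
code; the name swap_remove is hypothetical.

```python
def swap_remove(values, index):
    """Remove values[index] by moving the last element into its slot.

    Hypothetical helper for illustration only; it does not preserve order.
    """
    last = len(values) - 1
    if index != last:
        # Overwrite the element being removed with the last element.
        values[index] = values[last]
    # Drop the (now duplicated, or originally targeted) last slot.
    values.pop()
    return values


rows = ["a", "b", "c", "d"]
swap_remove(rows, 1)  # "b" is removed, "d" takes its place
```

The point of the swap is that the element at the current index disappears 
without shifting every later element.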

Unfortunately, this opens up the possibility of another “correctness” issue. 
The reason we swap the value with the one at the last index is to remove the 
value at the current index. Doing nothing in such a case means “we don’t 
remove the value at the current index”, whereas the caller expects the value 
to have been dropped. For outer joins, such a value may even be emitted as 
left/right null join output and then be re-evaluated and emitted again.
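
The edge case can be modelled with a small sketch. It assumes a dict keyed 
by index standing in for the per-key value store and a separate num_values 
counter; remove_at and all other names are hypothetical and only mimic the 
behaviour described in this ticket, not the real implementation.

```python
def remove_at(values, num_values, index):
    """Model of removal that ignores a null value at the last index.

    values: dict mapping index -> value (stand-in for the state store).
    num_values: counter that may be out of sync with the dict contents.
    Returns the decremented counter.
    """
    last = num_values - 1
    if index != last:
        last_value = values.get(last)
        if last_value is not None:
            values[index] = last_value
        # else: the null at the last index is ignored, so values[index]
        # is left untouched even though the caller asked to remove it.
    values.pop(last, None)
    return num_values - 1


# Out of sync: the counter says 3, but only indices 0 and 1 hold values.
values = {0: "a", 1: "b"}
num_values = remove_at(values, 3, 0)
still_present = values.get(0)  # the value the caller believes was removed
```

In this model the counter is corrected to 2, but "a" stays at index 0, so a 
later pass could evaluate and emit it again.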



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

