Jose Torres created SPARK-22017:
-----------------------------------

             Summary: watermark evaluation with multi-input stream operators is unspecified
                 Key: SPARK-22017
                 URL: https://issues.apache.org/jira/browse/SPARK-22017
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 2.2.0
            Reporter: Jose Torres


Watermarks are stored as a single value in StreamExecution. If a query has 
multiple watermark nodes (which can generally only happen with multi-input 
operators like union), a headOption call arbitrarily picks one to use as 
the query's watermark. The pick is made independently in each batch, so the 
node that determines the watermark can differ from batch to batch, leading to 
undefined behavior.

We should instead choose the minimum from all watermark exec nodes as the 
query-wide watermark.
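
A minimal sketch of the difference, in plain Python rather than Spark code. 
The function names (first_watermark, min_watermark) and the millisecond 
values are hypothetical and only illustrate the idea; they are not Spark APIs. 
first_watermark mimics the headOption behavior described above, and 
min_watermark mimics the proposed rule.

```python
from typing import Optional, Sequence


def first_watermark(node_watermarks_ms: Sequence[int]) -> Optional[int]:
    """Mimic headOption: use whichever watermark node happens to come first."""
    return node_watermarks_ms[0] if node_watermarks_ms else None


def min_watermark(node_watermarks_ms: Sequence[int]) -> Optional[int]:
    """Proposed rule: the query-wide watermark is the minimum over all nodes."""
    return min(node_watermarks_ms, default=None)


# Hypothetical watermarks (ms) reported by the two inputs of a union.
# The same two nodes are visited in a different order in each batch.
batch_a = [10_000, 4_000]
batch_b = [4_000, 10_000]

# Picking the first node depends on traversal order.
print(first_watermark(batch_a), first_watermark(batch_b))

# Taking the minimum gives the same answer for either order.
print(min_watermark(batch_a), min_watermark(batch_b))
```

Taking the minimum is the conservative choice: the query-wide watermark never 
runs ahead of the slowest input, so no input's data is treated as late because 
another input has progressed further.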



