Jose Torres created SPARK-22017:
-----------------------------------

             Summary: watermark evaluation with multi-input stream operators is unspecified
                 Key: SPARK-22017
                 URL: https://issues.apache.org/jira/browse/SPARK-22017
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 2.2.0
            Reporter: Jose Torres

Watermarks are stored as a single value in StreamExecution. If a query has multiple watermark nodes (which can generally only happen with multi-input operators like union), a headOption call arbitrarily picks one of them to use as the query's watermark. This choice is made independently in each batch, so the effective watermark can differ from batch to batch, possibly leading to strange and undefined behavior. We should instead choose the minimum across all watermark exec nodes as the query-wide watermark.
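
To illustrate the difference between the two policies described above, here is a minimal sketch in Python. It is not Spark code: the function names and the list of per-node watermarks (in milliseconds) are hypothetical stand-ins for the values reported by each watermark exec node, and the "first element" policy only mimics the effect of a headOption call.

```python
from typing import List, Optional


def pick_first(node_watermarks_ms: List[int]) -> Optional[int]:
    """Mimic a headOption-style choice: use whichever node happens to come first."""
    return node_watermarks_ms[0] if node_watermarks_ms else None


def query_wide_watermark(node_watermarks_ms: List[int]) -> Optional[int]:
    """Proposed policy: the minimum across all watermark nodes (None if there are none)."""
    return min(node_watermarks_ms, default=None)


# Two inputs of a union, with hypothetical watermarks of 10000 ms and 4000 ms.
batch_a = [10000, 4000]  # nodes seen in one order
batch_b = [4000, 10000]  # same nodes, seen in the opposite order

# The first-element policy depends on ordering; the minimum does not.
first_a, first_b = pick_first(batch_a), pick_first(batch_b)
min_a, min_b = query_wide_watermark(batch_a), query_wide_watermark(batch_b)
```

Under these assumptions, the first-element policy yields a different watermark for the two orderings, while the minimum is the same for both. The minimum is also the conservative choice, since it never treats data from the slower input as late.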