Hi, I've been wondering what the "proper" physical plan should be when more than one withWatermark operator is used in a query (as below).
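For reference, here is a minimal sketch of how I imagine such a query looks (my guess only; the actual source behind the plan below is not shown and appears as an existing RDD). It assumes a spark-shell session where `spark` is available and uses the built-in "rate" source, which yields (timestamp, value) columns matching the plan:

  // Sketch: two withWatermark calls on the same event-time column,
  // which shows up as two EventTimeWatermark operators in the plans.
  val sq = spark.readStream
    .format("rate")
    .load()
    .withWatermark("timestamp", "10 seconds")
    .withWatermark("timestamp", "40 seconds")

  sq.explain(true)   // prints parsed, analyzed, optimized and physical plans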
I think a SparkPlan should have only one EventTimeWatermarkExec physical operator in a single query, but I wonder if I'm missing something important (a sort of edge case perhaps). I also think the last withWatermark, and hence the last EventTimeWatermark, is the one in effect. Correct?

BTW, since joins of streaming queries are not supported, the only way two or more EventTimeWatermarkExec operators could be used and apply separately is when two or more streaming Datasets are unioned. Correct?

scala> sq.explain(true)
== Parsed Logical Plan ==
EventTimeWatermark timestamp#773: timestamp, interval 40 seconds
+- EventTimeWatermark timestamp#773: timestamp, interval 10 seconds
   +- LogicalRDD [timestamp#773, value#774L]

== Analyzed Logical Plan ==
timestamp: timestamp, value: bigint
EventTimeWatermark timestamp#773: timestamp, interval 40 seconds
+- EventTimeWatermark timestamp#773: timestamp, interval 10 seconds
   +- LogicalRDD [timestamp#773, value#774L]

== Optimized Logical Plan ==
EventTimeWatermark timestamp#773: timestamp, interval 40 seconds
+- EventTimeWatermark timestamp#773: timestamp, interval 10 seconds
   +- LogicalRDD [timestamp#773, value#774L]

== Physical Plan ==
EventTimeWatermark timestamp#773: timestamp, interval 40 seconds
+- EventTimeWatermark timestamp#773: timestamp, interval 10 seconds
   +- Scan ExistingRDD[timestamp#773,value#774L]

Best regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski