[ https://issues.apache.org/jira/browse/SPARK-50046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim reassigned SPARK-50046:
------------------------------------

    Assignee: Jungtaek Lim

> [Possible bug] Incorrect watermark advancement if watermark node is lost/pruned
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-50046
>                 URL: https://issues.apache.org/jira/browse/SPARK-50046
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 4.0.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Major
>              Labels: pull-request-available
>
> This does not happen with the current optimization rules, but that is mostly
> luck, and we were silently dropping the CollectMetrics node, so it would be
> ideal to address the issue in advance.
> WatermarkTracker only looks at the physical plan when calculating the new
> watermark value. It identifies each watermark node by index, so several
> issues arise when a watermark node is lost during optimization:
> 1) The watermark advances even when one node has been dropped (the dropped
> node should be treated as having produced no data).
> 2) The watermark tracker incorrectly updates its in-memory map of the
> previous value of each watermark node (the index is not a stable key).
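
For illustration only, the two failure modes described in the issue can be reproduced in a small toy model. This is a sketch in Python, not Spark's WatermarkTracker code: the function names, the node labels "left" and "right", and the numeric values are all hypothetical, and it assumes the global watermark is the minimum across per-node watermarks.

```python
from typing import Dict, List, Tuple

# Toy model only: NOT Spark's implementation. Each observed entry is
# (node_id, candidate_watermark) for a watermark node that survived in the plan.
Observed = List[Tuple[str, int]]


def advance_by_index(prev: Dict[int, int], observed: Observed) -> Tuple[Dict[int, int], int]:
    """Key per-node watermarks by position in the plan (unstable key)."""
    updated = dict(prev)
    for index, (_node_id, wm) in enumerate(observed):
        # A pruned node shifts every later node to a lower index.
        updated[index] = max(updated.get(index, 0), wm)
    return updated, min(updated.values())


def advance_by_id(prev: Dict[str, int], expected_ids: List[str],
                  observed: Observed) -> Tuple[Dict[str, int], int]:
    """Key per-node watermarks by a stable id; a missing node means 'no data'."""
    updated = dict(prev)
    seen = dict(observed)
    for node_id in expected_ids:
        # Missing nodes keep their previous value (or 0 if never seen).
        updated[node_id] = max(updated.get(node_id, 0), seen.get(node_id, 0))
    return updated, min(updated.values())


# Case 1: "left" was pruned before the first batch; only "right" reports.
case1_index = advance_by_index({}, [("right", 100)])
case1_id = advance_by_id({}, ["left", "right"], [("right", 100)])

# Case 2: both nodes reported earlier (left=100, right=50); then "left" is pruned.
case2_index = advance_by_index({0: 100, 1: 50}, [("right", 120)])
case2_id = advance_by_id({"left": 100, "right": 50}, ["left", "right"], [("right", 120)])

print("case 1 by index:", case1_index)
print("case 1 by id:   ", case1_id)
print("case 2 by index:", case2_index)
print("case 2 by id:   ", case2_id)
```

In case 1 the index-keyed variant advances the global watermark even though "left" contributed nothing. In case 2 it writes the value from "right" into the slot that previously belonged to "left" and leaves the old "right" slot stale. Keying by a stable identifier avoids both in this toy setup; whether the actual fix takes that shape is not stated in the issue.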