[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mukul Murthy updated SPARK-25399:
---------------------------------
    Priority: Major  (was: Blocker)

> Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-25399
>                 URL: https://issues.apache.org/jira/browse/SPARK-25399
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Mukul Murthy
>            Priority: Major
>
> Continuous processing sets some thread-local variables. When a thread running
> a microbatch stream later reads them, it may read incorrect previous state, or
> none at all, and produce wrong answers. This was caught by a job running the
> StreamSuite tests, and it reproduces only occasionally, when the same threads
> are reused.
> The issue is in StateStoreRDD.compute: when we compute currentVersion, we
> read from a thread-local variable that is set by continuous processing
> threads. If this value is set, we then think we're on the wrong state version.
> I imagine very few people, if any, would run into this bug, because you'd
> have to use continuous processing and then microbatch processing in the same
> cluster. However, it can result in silent correctness issues, and it would be
> very difficult for someone to tell whether they were impacted by it.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
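
The failure mode described in the issue (a thread-local value set by one kind of task and later read by an unrelated task on the same reused thread) can be illustrated outside Spark. The following is a minimal sketch in plain Java, not Spark's code: the class ThreadLocalLeakDemo, the field CONTINUOUS_VERSION, and the methods currentVersion and versionSeenByMicrobatch are hypothetical names chosen for illustration, and the version numbers 42 and 7 are arbitrary. Clearing the thread-local in a finally block is shown as one general way to avoid such leaks; it is not claimed to be the fix applied in Spark.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLeakDemo {
    // Hypothetical stand-in for the per-thread value that continuous processing sets.
    private static final ThreadLocal<Long> CONTINUOUS_VERSION = new ThreadLocal<>();

    // Mimics the lookup described in the issue: if the thread-local is set,
    // it wins over the version the microbatch task was actually given.
    public static long currentVersion(long microbatchVersion) {
        Long leaked = CONTINUOUS_VERSION.get();
        return leaked != null ? leaked : microbatchVersion;
    }

    // Runs a "continuous" task and then a "microbatch" task on the same pool thread,
    // and returns the version the microbatch task ends up using.
    public static long versionSeenByMicrobatch(boolean clearAfterContinuous) {
        // A single-thread pool guarantees that both tasks run on the same thread.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable continuousTask = () -> {
                CONTINUOUS_VERSION.set(42L);
                try {
                    // ... continuous processing work would happen here ...
                } finally {
                    if (clearAfterContinuous) {
                        CONTINUOUS_VERSION.remove(); // do not leave state on the pooled thread
                    }
                }
            };
            Callable<Long> microbatchTask = () -> currentVersion(7L);

            pool.submit(continuousTask).get();       // wait for the first task to finish
            return pool.submit(microbatchTask).get(); // same thread, sees whatever was left behind
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + versionSeenByMicrobatch(false)); // prints 42
        System.out.println("with cleanup:    " + versionSeenByMicrobatch(true));  // prints 7
    }
}
```

Without cleanup the second task silently uses the stale value 42 instead of its own 7, which mirrors why the issue calls this a silent correctness problem: nothing fails, the task simply reads the wrong version.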