[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Murthy updated SPARK-25399:
---------------------------------
    Priority: Major  (was: Blocker)

> Reusing execution threads from continuous processing for microbatch streaming 
> can result in correctness issues
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-25399
>                 URL: https://issues.apache.org/jira/browse/SPARK-25399
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Mukul Murthy
>            Priority: Major
>
> Continuous processing sets some thread-local variables. If a thread running 
> a microbatch stream later reads them, it may read incorrect previous state, 
> or none at all, and produce wrong answers. This was caught by a job running 
> the StreamSuite tests, and it reproduces only occasionally, when the same 
> threads are reused.
> The issue is in StateStoreRDD.compute: when we compute currentVersion, we 
> read from a thread-local variable that is set by continuous processing 
> threads. If this value is set, we end up on the wrong state version.
> I imagine very few people, if any, would run into this bug, because you'd 
> have to use continuous processing and then microbatch processing in the same 
> cluster. However, it can result in silent correctness issues, and it would be 
> very difficult for someone to tell if they were impacted by this or not.
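
The mechanism described above can be sketched outside Spark. The following is a minimal illustration in Python, not Spark's code: every name in it (_epoch_context, continuous_task, microbatch_task, run_on_shared_thread) is hypothetical, and the version lookup is only assumed to resemble what StateStoreRDD.compute does. It shows how a thread-local value set by one task can leak into a later task when a pool reuses the same thread, and how clearing the value when the first task finishes avoids the leak.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the thread-local that continuous processing sets.
_epoch_context = threading.local()


def continuous_task(epoch, cleanup):
    """Simulate a continuous-processing task that records its epoch per thread."""
    _epoch_context.epoch = epoch
    try:
        return epoch
    finally:
        if cleanup:
            # Clearing the value stops it from leaking to the next task
            # that happens to run on this pooled thread.
            del _epoch_context.epoch


def microbatch_task(batch_version):
    """Simulate the version lookup: use the thread-local if set, else the batch version."""
    leaked = getattr(_epoch_context, "epoch", None)
    return leaked if leaked is not None else batch_version


def run_on_shared_thread(cleanup):
    """Run a continuous task and then a microbatch task on the same pooled thread."""
    # A single worker guarantees that both tasks run on the same thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(continuous_task, 7, cleanup).result()
        return pool.submit(microbatch_task, 3).result()
```

With cleanup=False the microbatch task reports version 7 (the stale epoch) instead of its own version 3. With cleanup=True it reports 3. In a real pool with many threads the stale value is seen only when a task lands on a previously used thread, which is consistent with the occasional repro described above.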



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
