[jira] [Updated] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Murthy updated SPARK-25399:
---------------------------------
    Priority: Major  (was: Blocker)

> Reusing execution threads from continuous processing for microbatch streaming
> can result in correctness issues
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-25399
>                 URL: https://issues.apache.org/jira/browse/SPARK-25399
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Mukul Murthy
>            Priority: Major
>
> Continuous processing sets some thread-local variables that, when read by a
> thread running a microbatch stream, may cause incorrect or no previous state
> to be read, resulting in wrong answers. This was caught by a job running the
> StreamSuite tests, and it reproduces only occasionally, when the same threads
> are reused.
> The issue is in StateStoreRDD.compute: when we compute currentVersion, we
> read from a thread-local variable that is set by continuous processing
> threads. If this value is set, we then think we're on the wrong state version.
> I imagine very few people, if any, would run into this bug, because you'd
> have to use continuous processing and then microbatch processing in the same
> cluster. However, it can result in silent correctness issues, and it would be
> very difficult for someone to tell whether they were impacted.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
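The failure mode described in the issue can be illustrated outside Spark. The following is a minimal sketch, not Spark's code: `_task_local`, `continuous_task`, `microbatch_version`, and `microbatch_version_fixed` are hypothetical names, and the "fixed" variant is only one possible mitigation, not necessarily the change that was made for this issue. It shows how a thread-local value set by one kind of task can leak into a later task that happens to run on the same pooled thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the thread-local value set by continuous processing.
_task_local = threading.local()


def continuous_task(epoch):
    """Simulate a continuous-processing task that sets a thread-local
    value and never clears it before the thread returns to the pool."""
    _task_local.epoch = epoch
    return epoch


def microbatch_version(batch_version):
    """Simulate the buggy lookup: if the thread-local is set, trust it
    over the version that the microbatch itself was given."""
    leaked = getattr(_task_local, "epoch", None)
    return leaked if leaked is not None else batch_version


def microbatch_version_fixed(batch_version):
    """One possible mitigation (an assumption, not Spark's actual fix):
    a microbatch task ignores the thread-local entirely."""
    return batch_version


def run_demo():
    # A single worker guarantees that every task reuses the same thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(continuous_task, 7).result()
        buggy = pool.submit(microbatch_version, 3).result()
        fixed = pool.submit(microbatch_version_fixed, 3).result()
    return buggy, fixed


if __name__ == "__main__":
    print(run_demo())
```

With a single worker thread, the microbatch lookup sees the value left behind by the earlier task and returns the wrong version without raising any error, which matches the "silent correctness issue" the reporter describes. On a thread that never ran the continuous task, the same lookup returns the expected version, which is why the problem only shows up when threads are reused.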
[jira] [Updated] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-25399:
    Priority: Blocker  (was: Major)

> Reusing execution threads from continuous processing for microbatch streaming
> can result in correctness issues
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-25399
>                 URL: https://issues.apache.org/jira/browse/SPARK-25399
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Mukul Murthy
>            Priority: Blocker
>              Labels: correctness
[jira] [Updated] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-25399:
    Labels: correctness  (was: )

> Reusing execution threads from continuous processing for microbatch streaming
> can result in correctness issues
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-25399
>                 URL: https://issues.apache.org/jira/browse/SPARK-25399
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Mukul Murthy
>            Priority: Major
>              Labels: correctness
[jira] [Updated] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-25399:
    Priority: Critical  (was: Blocker)

> Reusing execution threads from continuous processing for microbatch streaming
> can result in correctness issues
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-25399
>                 URL: https://issues.apache.org/jira/browse/SPARK-25399
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Mukul Murthy
>            Priority: Critical
>              Labels: correctness
[jira] [Updated] (SPARK-25399) Reusing execution threads from continuous processing for microbatch streaming can result in correctness issues
[ https://issues.apache.org/jira/browse/SPARK-25399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim updated SPARK-25399:
---------------------------------
    Fix Version/s:     (was: 3.0.0)

> Reusing execution threads from continuous processing for microbatch streaming
> can result in correctness issues
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-25399
>                 URL: https://issues.apache.org/jira/browse/SPARK-25399
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Mukul Murthy
>            Assignee: Mukul Murthy
>            Priority: Critical
>              Labels: correctness
>             Fix For: 2.4.0