[jira] [Updated] (SPARK-47941) Propagate ForeachBatch worker initialization errors to users for PySpark
[ https://issues.apache.org/jira/browse/SPARK-47941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Marnadi updated SPARK-47941:
---------------------------------
    Summary: Propagate ForeachBatch worker initialization errors to users for PySpark  (was: Propagate ForeachBatch initialization errors to users)

> Propagate ForeachBatch worker initialization errors to users for PySpark
> ------------------------------------------------------------------------
>
>                 Key: SPARK-47941
>                 URL: https://issues.apache.org/jira/browse/SPARK-47941
>             Project: Spark
>          Issue Type: Improvement
>          Components: Connect, Structured Streaming
>    Affects Versions: 4.0.0
>            Reporter: Eric Marnadi
>            Priority: Major
>
> Ensure that errors and exceptions thrown during foreachBatch worker
> initialization are propagated to the user, instead of just stderr.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
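The goal described above (surfacing worker initialization failures to the caller rather than leaving them on stderr) can be illustrated with a minimal, generic Python sketch. It does not use any Spark or Spark Connect API; `start_worker`, `WorkerInitError`, and `broken_init` are hypothetical names introduced only for this illustration, and a thread stands in for the worker process.

```python
import queue
import threading


class WorkerInitError(RuntimeError):
    """Hypothetical error type raised to the caller when worker setup fails."""


def start_worker(init_fn):
    """Run init_fn on a worker thread; re-raise any init failure in the caller."""
    status = queue.Queue(maxsize=1)

    def run():
        try:
            init_fn()
        except Exception as exc:
            # Hand the failure back to the caller instead of only logging it.
            status.put(exc)
            return
        status.put(None)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    outcome = status.get()  # block until initialization has finished
    if outcome is not None:
        raise WorkerInitError(f"worker initialization failed: {outcome!r}") from outcome
    return thread


def broken_init():
    # Hypothetical failure: a dependency missing on the worker side.
    raise ImportError("No module named 'my_missing_dependency'")
```

Calling `start_worker(broken_init)` raises `WorkerInitError` in the calling thread, with the original `ImportError` attached as `__cause__`, so the user sees the root cause directly.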
[jira] [Updated] (SPARK-47805) [Arbitrary State Support] State TTL support - MapState
[ https://issues.apache.org/jira/browse/SPARK-47805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Marnadi updated SPARK-47805:
---------------------------------
    Description: Add support for expiring state value based on ttl for Map State in transformWithState operator.  (was: Add support for expiring state value based on ttl for List State in transformWithState operator.)

> [Arbitrary State Support] State TTL support - MapState
> ------------------------------------------------------
>
>                 Key: SPARK-47805
>                 URL: https://issues.apache.org/jira/browse/SPARK-47805
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 4.0.0
>            Reporter: Eric Marnadi
>            Priority: Major
>              Labels: pull-request-available
>
> Add support for expiring state value based on ttl for Map State in
> transformWithState operator.
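As a rough illustration of what TTL-based expiry means for a map-shaped state, here is a minimal in-memory Python sketch. It is not the transformWithState API or its implementation; `TTLMap`, its methods, and the injectable `clock` parameter are assumptions made for this example only.

```python
import time


class TTLMap:
    """Hypothetical in-memory map whose entries expire ttl_seconds after being written."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # user key -> (value, expiration timestamp)

    def put(self, key, value):
        # Each write sets the entry's expiration to now + TTL.
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily evict the expired entry on read
            return default
        return value

    def evict_expired(self):
        """Drop every expired entry and return how many were removed."""
        now = self._clock()
        expired = [k for k, (_, exp) in self._entries.items() if now >= exp]
        for k in expired:
            del self._entries[k]
        return len(expired)
```

The sketch combines two eviction paths: expired entries are hidden and removed on read, and `evict_expired` can be called periodically to reclaim entries that are never read again.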
[jira] [Created] (SPARK-47805) [Arbitrary State Support] State TTL support - MapState
Eric Marnadi created SPARK-47805:
------------------------------------

             Summary: [Arbitrary State Support] State TTL support - MapState
                 Key: SPARK-47805
                 URL: https://issues.apache.org/jira/browse/SPARK-47805
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0
            Reporter: Eric Marnadi

Add support for expiring state value based on ttl for List State in transformWithState operator.
[jira] [Created] (SPARK-47673) [Arbitrary State Support] State TTL support - ListState
Eric Marnadi created SPARK-47673:
------------------------------------

             Summary: [Arbitrary State Support] State TTL support - ListState
                 Key: SPARK-47673
                 URL: https://issues.apache.org/jira/browse/SPARK-47673
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0
            Reporter: Eric Marnadi

Add support for expiring state value based on ttl for List State in transformWithState operator.
[jira] [Created] (SPARK-46961) Adding processorHandle as a Context Variable
Eric Marnadi created SPARK-46961:
------------------------------------

             Summary: Adding processorHandle as a Context Variable
                 Key: SPARK-46961
                 URL: https://issues.apache.org/jira/browse/SPARK-46961
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0
            Reporter: Eric Marnadi

Adding unit tests to ensure multiple input streams are supported for the TransformWithState operator.
[jira] [Created] (SPARK-46960) Testing Multiple Input Streams for TransformWithState operator
Eric Marnadi created SPARK-46960:
------------------------------------

             Summary: Testing Multiple Input Streams for TransformWithState operator
                 Key: SPARK-46960
                 URL: https://issues.apache.org/jira/browse/SPARK-46960
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0
            Reporter: Eric Marnadi

Adding unit tests to ensure multiple input streams are supported for the TransformWithState operator.
[jira] [Created] (SPARK-46911) Add deleteIfExists operator to StatefulProcessorHandle
Eric Marnadi created SPARK-46911:
------------------------------------

             Summary: Add deleteIfExists operator to StatefulProcessorHandle
                 Key: SPARK-46911
                 URL: https://issues.apache.org/jira/browse/SPARK-46911
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0
            Reporter: Eric Marnadi

Adding the {{deleteIfExists}} method to the {{StatefulProcessorHandle}} in order to remove state variables from the State Store.
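The intended semantics of a "delete if exists" operation on a handle that owns named state variables might look roughly like the Python sketch below. This is a plain in-memory stand-in, not the {{StatefulProcessorHandle}} interface itself; the class name `SketchProcessorHandle` and its methods are hypothetical.

```python
class SketchProcessorHandle:
    """Hypothetical stand-in for a handle that owns named state variables."""

    def __init__(self):
        self._state = {}  # state variable name -> stored values

    def get_value_state(self, name):
        # Create the variable on first use; a plain dict is the placeholder store.
        return self._state.setdefault(name, {})

    def delete_if_exists(self, name):
        """Remove the named state variable; do nothing if it was never created."""
        self._state.pop(name, None)

    def exists(self, name):
        return name in self._state
```

The point of the "if exists" form is idempotence: callers can remove a state variable without first checking whether it was created, and a repeated call is a no-op rather than an error.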
[jira] [Updated] (SPARK-46865) Add Batch Support for TransformWithState Operator
[ https://issues.apache.org/jira/browse/SPARK-46865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Marnadi updated SPARK-46865:
---------------------------------
    Description: Add Batch support for the TransformWithState operator to maintain parity between the batch and streaming APIs  (was: Creating the files for the Arbtirary State V2 Project to use the new Error Class Framework)

> Add Batch Support for TransformWithState Operator
> -------------------------------------------------
>
>                 Key: SPARK-46865
>                 URL: https://issues.apache.org/jira/browse/SPARK-46865
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 4.0.0
>            Reporter: Eric Marnadi
>            Priority: Major
>              Labels: pull-request-available
>
> Add Batch support for the TransformWithState operator to maintain parity
> between the batch and streaming APIs
[jira] [Created] (SPARK-46865) Add Batch Support for TransformWithState Operator
Eric Marnadi created SPARK-46865:
------------------------------------

             Summary: Add Batch Support for TransformWithState Operator
                 Key: SPARK-46865
                 URL: https://issues.apache.org/jira/browse/SPARK-46865
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0
            Reporter: Eric Marnadi

Creating the files for the Arbtirary State V2 Project to use the new Error Class Framework
[jira] [Created] (SPARK-46864) Onboard Arbtirary State V2 onto New Error Class Framework
Eric Marnadi created SPARK-46864:
------------------------------------

             Summary: Onboard Arbtirary State V2 onto New Error Class Framework
                 Key: SPARK-46864
                 URL: https://issues.apache.org/jira/browse/SPARK-46864
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 4.0.0
            Reporter: Eric Marnadi

Creating the files for the Arbtirary State V2 Project to use the new Error Class Framework
[jira] [Created] (SPARK-44480) Add option for thread pool to perform maintenance for RocksDB/HDFS State Store Providers
Eric Marnadi created SPARK-44480:
------------------------------------

             Summary: Add option for thread pool to perform maintenance for RocksDB/HDFS State Store Providers
                 Key: SPARK-44480
                 URL: https://issues.apache.org/jira/browse/SPARK-44480
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 3.5.0
            Reporter: Eric Marnadi

Maintenance tasks on StateStore were being done by a single background thread, which is prone to straggling. With this change, the single background thread instead schedules maintenance tasks on a thread pool.

Introduce the {{spark.sql.streaming.stateStore.enableStateStoreMaintenanceThreadPool}} config so that the user can enable a thread pool for maintenance manually.

Introduce the {{spark.sql.streaming.stateStore.numStateStoreMaintenanceThreads}} config so that the thread pool size is configurable.
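The scheduling idea described above, where one coordinating thread hands each provider's maintenance to a pool so that a slow store does not delay the others, can be sketched in plain Python. This is not Spark's StateStore code; `run_maintenance_round`, `make_task`, and the `providers` mapping are hypothetical names, and `num_threads` only plays the role that the proposed pool-size config would play.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_maintenance_round(providers, num_threads):
    """Submit one maintenance task per provider to a pool and wait for the round."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = {pool.submit(task): name for name, task in providers.items()}
        wait(futures)
    # Report failures per provider so one bad store does not hide the others.
    return {name: fut.exception() for fut, name in futures.items()}


def make_task(seconds):
    """Build a placeholder maintenance task that just sleeps."""
    def task():
        time.sleep(seconds)  # stands in for snapshotting or cleaning up old files
    return task


providers = {
    "store-0": make_task(0.2),   # a straggler
    "store-1": make_task(0.01),
    "store-2": make_task(0.01),
}
results = run_maintenance_round(providers, num_threads=3)
```

With a single worker the round takes roughly the sum of the task durations; with enough workers it is bounded by the slowest task, which is the straggler effect the issue aims to reduce.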
[jira] [Updated] (SPARK-44440) Use thread pool to perform maintenance activity for hdfs/rocksdb state store providers
[ https://issues.apache.org/jira/browse/SPARK-44440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Marnadi updated SPARK-44440:
---------------------------------
    Shepherd: Jungtaek Lim

> Use thread pool to perform maintenance activity for hdfs/rocksdb state store providers
> --------------------------------------------------------------------------------------
>
>                 Key: SPARK-44440
>                 URL: https://issues.apache.org/jira/browse/SPARK-44440
>             Project: Spark
>          Issue Type: Improvement
>          Components: Structured Streaming
>    Affects Versions: 3.5.0
>            Reporter: Eric Marnadi
>            Priority: Major
>
> Use thread pool to perform maintenance activity for hdfs/rocksdb state store
> providers
[jira] [Created] (SPARK-44440) Use thread pool to perform maintenance activity for hdfs/rocksdb state store providers
Eric Marnadi created SPARK-44440:
------------------------------------

             Summary: Use thread pool to perform maintenance activity for hdfs/rocksdb state store providers
                 Key: SPARK-44440
                 URL: https://issues.apache.org/jira/browse/SPARK-44440
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 3.5.0
            Reporter: Eric Marnadi

Use thread pool to perform maintenance activity for hdfs/rocksdb state store providers
[jira] [Created] (SPARK-43542) Define a new error class and apply for the case where streaming query fails due to concurrent run of streaming query with same checkpoint
Eric Marnadi created SPARK-43542:
------------------------------------

             Summary: Define a new error class and apply for the case where streaming query fails due to concurrent run of streaming query with same checkpoint
                 Key: SPARK-43542
                 URL: https://issues.apache.org/jira/browse/SPARK-43542
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 1.6.3
            Reporter: Eric Marnadi

We are migrating to a new error framework in order to surface errors in a friendlier way to customers. This PR defines a new error class specifically for when there are concurrent updates to the log for the same batch ID
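To make the idea of a dedicated error class for concurrent log updates concrete, here is a small Python sketch. It does not reproduce Spark's error framework; the `StreamingError` type, the error class name, the message template, and `commit_batch` are all placeholders chosen for illustration.

```python
class StreamingError(Exception):
    """Hypothetical exception carrying an error class name and message parameters."""

    # Placeholder template table; the name and wording are assumptions.
    TEMPLATES = {
        "CONCURRENT_STREAM_LOG_UPDATE": (
            "Concurrent update to the log. "
            "Multiple streaming jobs detected for batch {batch_id}."
        ),
    }

    def __init__(self, error_class, **params):
        self.error_class = error_class
        self.params = params
        message = self.TEMPLATES[error_class].format(**params)
        super().__init__(f"[{error_class}] {message}")


def commit_batch(log, batch_id, metadata):
    """Write metadata for batch_id, failing if another writer got there first."""
    if batch_id in log:
        raise StreamingError("CONCURRENT_STREAM_LOG_UPDATE", batch_id=batch_id)
    log[batch_id] = metadata
```

Because the exception carries a stable class name and structured parameters, callers can branch on `error_class` instead of parsing message text, while users still see a readable message.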