[jira] [Created] (SPARK-45014) Clean up fileserver when cleaning up files, jars and archives in SparkContext

2023-08-29 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-45014:


 Summary: Clean up fileserver when cleaning up files, jars and 
archives in SparkContext
 Key: SPARK-45014
 URL: https://issues.apache.org/jira/browse/SPARK-45014
 Project: Spark
  Issue Type: Bug
  Components: Connect
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon


In SPARK-44348, we clean up SparkContext's added files, but we don't clean up 
the corresponding entries in the file server.






[jira] [Resolved] (SPARK-42304) Assign name to _LEGACY_ERROR_TEMP_2189

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-42304.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42706
[https://github.com/apache/spark/pull/42706]

> Assign name to _LEGACY_ERROR_TEMP_2189
> --
>
> Key: SPARK-42304
> URL: https://issues.apache.org/jira/browse/SPARK-42304
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Haejoon Lee
>Priority: Major
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-45008) Improve branch suggestion for backporting

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-45008:
-

Assignee: Kent Yao

> Improve branch suggestion for backporting
> -
>
> Key: SPARK-45008
> URL: https://issues.apache.org/jira/browse/SPARK-45008
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>







[jira] [Resolved] (SPARK-45008) Improve branch suggestion for backporting

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-45008.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42723
[https://github.com/apache/spark/pull/42723]

> Improve branch suggestion for backporting
> -
>
> Key: SPARK-45008
> URL: https://issues.apache.org/jira/browse/SPARK-45008
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-45010) Limit GHA job execution time to up to 5 hours in build_and_test.yml

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-45010.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42727
[https://github.com/apache/spark/pull/42727]

> Limit GHA job execution time to up to 5 hours in build_and_test.yml
> ---
>
> Key: SPARK-45010
> URL: https://issues.apache.org/jira/browse/SPARK-45010
> Project: Spark
>  Issue Type: Task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-45010) Limit GHA job execution time to up to 5 hours in build_and_test.yml

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-45010:
-

Assignee: Dongjoon Hyun

> Limit GHA job execution time to up to 5 hours in build_and_test.yml
> ---
>
> Key: SPARK-45010
> URL: https://issues.apache.org/jira/browse/SPARK-45010
> Project: Spark
>  Issue Type: Task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
>







[jira] [Comment Edited] (SPARK-44728) Improve PySpark documentations

2023-08-29 Thread Ruifeng Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757209#comment-17757209
 ] 

Ruifeng Zheng edited comment on SPARK-44728 at 8/30/23 2:57 AM:


A docstring should contain the following sections:
 # Brief Description: A concise summary explaining the function's purpose.
 # Version Annotations: Annotations like versionadded and versionchanged to 
indicate in which versions of the software the function was added or modified.
 # Parameters: This section should list and describe all input parameters. If 
the function doesn't accept any parameters, this section can be omitted.
 # Returns: Detail what the function returns. If the function doesn't return 
anything, this section can be omitted.
 # See Also: A list of related API functions or methods. This section can be 
omitted if no related APIs exist.
 # Notes: Include additional information or warnings about the function's usage 
here.
 # Examples:
 ## A docstring should contain 3~5 examples if possible.
 ## Every example should begin with a brief description, followed by the 
example code, and conclude with the expected output.
 ## Every example should be copy-pasteable;
 ## Any necessary import statements should be included at the beginning of each 
example. Specifically, the Python functions should be imported as *`import 
pyspark.sql.functions as sf`*
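
For illustration, a docstring following the sections above might look like the 
sketch below (an illustrative skeleton, not a prescribed template; the 
parameter descriptions and doctest output are examples, and the doctest 
assumes an active `spark` session as in PySpark doctests):

{code:python}
def repeat(col, n):
    """
    Returns a column with the string value repeated `n` times.

    .. versionadded:: 4.0.0

    Parameters
    ----------
    col : :class:`~pyspark.sql.Column` or str
        Input column or column name.
    n : int
        Number of times to repeat the value.

    Returns
    -------
    :class:`~pyspark.sql.Column`
        A column of repeated strings.

    See Also
    --------
    :meth:`pyspark.sql.functions.concat`

    Notes
    -----
    `n` must be non-negative.

    Examples
    --------
    Repeat a literal string three times:

    >>> import pyspark.sql.functions as sf
    >>> spark.range(1).select(sf.repeat(sf.lit("ab"), 3)).first()[0]
    'ababab'
    """
{code}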


was (Author: podongfeng):
A docstring should contain the following sections:
 # Brief Description: A concise summary explaining the function's purpose.
 # Version Annotations: Annotations like versionadded and versionchanged to 
indicate in which versions of the software the function was added or modified.
 # Parameters: This section should list and describe all input parameters. If 
the function doesn't accept any parameters, this section can be omitted.
 # Returns: Detail what the function returns. If the function doesn't return 
anything, this section can be omitted.
 # See Also: A list of related API functions or methods. This section can be 
omitted if no related APIs exist.
 # Notes: Include additional information or warnings about the function's usage 
here.
 # Examples:
 ## A docstring should contain 3~5 examples if possible.
 ## Every example should begin with a brief description, followed by the 
example code, and conclude with the expected output.
 ## Every example should be copy-pasteable;
 ## Any necessary import statements should be included at the beginning of each 
example. Specifically, the Python functions should be imported as `import 
pyspark.sql.functions as sf`

> Improve PySpark documentations
> --
>
> Key: SPARK-44728
> URL: https://issues.apache.org/jira/browse/SPARK-44728
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.5.0, 4.0.0
>Reporter: Allison Wang
>Priority: Major
> Fix For: 4.0.0
>
>
> An umbrella Jira ticket to improve the PySpark documentation.
>  
>  






[jira] [Comment Edited] (SPARK-44728) Improve PySpark documentations

2023-08-29 Thread Ruifeng Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757209#comment-17757209
 ] 

Ruifeng Zheng edited comment on SPARK-44728 at 8/30/23 2:57 AM:


A docstring should contain the following sections:
 # Brief Description: A concise summary explaining the function's purpose.
 # Version Annotations: Annotations like versionadded and versionchanged to 
indicate in which versions of the software the function was added or modified.
 # Parameters: This section should list and describe all input parameters. If 
the function doesn't accept any parameters, this section can be omitted.
 # Returns: Detail what the function returns. If the function doesn't return 
anything, this section can be omitted.
 # See Also: A list of related API functions or methods. This section can be 
omitted if no related APIs exist.
 # Notes: Include additional information or warnings about the function's usage 
here.
 # Examples:
 ## A docstring should contain 3~5 examples if possible.
 ## Every example should begin with a brief description, followed by the 
example code, and conclude with the expected output.
 ## Every example should be copy-pasteable;
 ## Any necessary import statements should be included at the beginning of each 
example. Specifically, the Python functions should be imported as `import 
pyspark.sql.functions as sf`


was (Author: podongfeng):
A docstring should contain the following sections:
 # Brief Description: A concise summary explaining the function's purpose.
 # Version Annotations: Annotations like versionadded and versionchanged to 
indicate in which versions of the software the function was added or modified.
 # Parameters: This section should list and describe all input parameters. If 
the function doesn't accept any parameters, this section can be omitted.
 # Returns: Detail what the function returns. If the function doesn't return 
anything, this section can be omitted.
 # See Also: A list of related API functions or methods. This section can be 
omitted if no related APIs exist.
 # Notes: Include additional information or warnings about the function's usage 
here.
 # Examples:
 ## A docstring should contain 3~5 examples if possible.
 ## Every example should begin with a brief description, followed by the 
example code, and conclude with the expected output.
 ## Every example should be copy-pasteable;
 ## Any necessary import statements should be included at the beginning of each 
example.

> Improve PySpark documentations
> --
>
> Key: SPARK-44728
> URL: https://issues.apache.org/jira/browse/SPARK-44728
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.5.0, 4.0.0
>Reporter: Allison Wang
>Priority: Major
> Fix For: 4.0.0
>
>
> An umbrella Jira ticket to improve the PySpark documentation.
>  
>  






[jira] (SPARK-44728) Improve PySpark documentations

2023-08-29 Thread Ruifeng Zheng (Jira)


[ https://issues.apache.org/jira/browse/SPARK-44728 ]


Ruifeng Zheng deleted comment on SPARK-44728:
---

was (Author: podongfeng):
Issue resolved by pull request 42637
[https://github.com/apache/spark/pull/42637]

> Improve PySpark documentations
> --
>
> Key: SPARK-44728
> URL: https://issues.apache.org/jira/browse/SPARK-44728
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.5.0, 4.0.0
>Reporter: Allison Wang
>Priority: Major
> Fix For: 4.0.0
>
>
> An umbrella Jira ticket to improve the PySpark documentation.
>  
>  






[jira] [Commented] (SPARK-44949) Refine docstring for DataFrame.approxQuantile

2023-08-29 Thread Ruifeng Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760210#comment-17760210
 ] 

Ruifeng Zheng commented on SPARK-44949:
---

resolved in https://github.com/apache/spark/pull/42637

> Refine docstring for DataFrame.approxQuantile
> -
>
> Key: SPARK-44949
> URL: https://issues.apache.org/jira/browse/SPARK-44949
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 3.5.0, 4.0.0
>Reporter: Michael Zhang
>Priority: Major
>
> The docstring has no examples
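
For reference, a minimal usage sketch of DataFrame.approxQuantile (the data is 
made up; `spark` is assumed to be an active SparkSession; a relativeError of 
0.0 requests exact quantiles at the cost of a full pass over the data):

{code:python}
df = spark.createDataFrame([(1.0,), (2.0,), (3.0,), (4.0,)], ["x"])
# approxQuantile(col, probabilities, relativeError) returns a list of floats.
df.approxQuantile("x", [0.25, 0.5, 0.75], 0.0)
# -> e.g. [1.0, 2.0, 3.0]
{code}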






[jira] [Resolved] (SPARK-44949) Refine docstring for DataFrame.approxQuantile

2023-08-29 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-44949.
---
  Assignee: Michael Zhang
Resolution: Fixed

> Refine docstring for DataFrame.approxQuantile
> -
>
> Key: SPARK-44949
> URL: https://issues.apache.org/jira/browse/SPARK-44949
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 3.5.0, 4.0.0
>Reporter: Michael Zhang
>Assignee: Michael Zhang
>Priority: Major
>
> The docstring has no examples






[jira] [Reopened] (SPARK-44728) Improve PySpark documentations

2023-08-29 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reopened SPARK-44728:
---

> Improve PySpark documentations
> --
>
> Key: SPARK-44728
> URL: https://issues.apache.org/jira/browse/SPARK-44728
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.5.0, 4.0.0
>Reporter: Allison Wang
>Priority: Major
> Fix For: 4.0.0
>
>
> An umbrella Jira ticket to improve the PySpark documentation.
>  
>  






[jira] [Resolved] (SPARK-45011) Refine docstring of `Column.between`

2023-08-29 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-45011.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42728
[https://github.com/apache/spark/pull/42728]

> Refine docstring of `Column.between`
> 
>
> Key: SPARK-45011
> URL: https://issues.apache.org/jira/browse/SPARK-45011
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
> Fix For: 4.0.0
>
>
> Refine the docstring of Column.between






[jira] [Assigned] (SPARK-45011) Refine docstring of `Column.between`

2023-08-29 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng reassigned SPARK-45011:
-

Assignee: Allison Wang

> Refine docstring of `Column.between`
> 
>
> Key: SPARK-45011
> URL: https://issues.apache.org/jira/browse/SPARK-45011
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
>
> Refine the docstring of Column.between






[jira] [Resolved] (SPARK-44728) Improve PySpark documentations

2023-08-29 Thread Ruifeng Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruifeng Zheng resolved SPARK-44728.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42637
[https://github.com/apache/spark/pull/42637]

> Improve PySpark documentations
> --
>
> Key: SPARK-44728
> URL: https://issues.apache.org/jira/browse/SPARK-44728
> Project: Spark
>  Issue Type: Umbrella
>  Components: PySpark
>Affects Versions: 3.5.0, 4.0.0
>Reporter: Allison Wang
>Priority: Major
> Fix For: 4.0.0
>
>
> An umbrella Jira ticket to improve the PySpark documentation.
>  
>  






[jira] [Created] (SPARK-45013) Flaky Test with NPE: track allocated resources by taskId

2023-08-29 Thread Kent Yao (Jira)
Kent Yao created SPARK-45013:


 Summary: Flaky Test with NPE: track allocated resources by taskId
 Key: SPARK-45013
 URL: https://issues.apache.org/jira/browse/SPARK-45013
 Project: Spark
  Issue Type: Test
  Components: Tests
Affects Versions: 4.0.0
Reporter: Kent Yao


{code:java}
- track allocated resources by taskId *** FAILED *** (76 milliseconds)
28782[info]   java.lang.NullPointerException:
28783[info]   at 
org.apache.spark.executor.CoarseGrainedExecutorBackend.statusUpdate(CoarseGrainedExecutorBackend.scala:267)
28784[info]   at 
org.apache.spark.executor.CoarseGrainedExecutorBackendSuite.$anonfun$new$22(CoarseGrainedExecutorBackendSuite.scala:347)
28785[info]   at 
org.scalatest.enablers.Timed$$anon$1.timeoutAfter(Timed.scala:127)
28786[info]   at 
org.scalatest.concurrent.TimeLimits$.failAfterImpl(TimeLimits.scala:282)
28787[info]   at 
org.scalatest.concurrent.TimeLimits.failAfter(TimeLimits.scala:231)
28788[info]   at 
org.scalatest.concurrent.TimeLimits.failAfter$(TimeLimits.scala:230)
28789[info]   at 
org.apache.spark.SparkFunSuite.failAfter(SparkFunSuite.scala:69)
28790[info]   at 
org.apache.spark.SparkFunSuite.$anonfun$test$2(SparkFunSuite.scala:155)
28791[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
28792[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
28793[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
28794[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
28795[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
28796[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:226)
28797[info]   at 
org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:227)
28798[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:224)
28799[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:236)
28800[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
28801[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:236)
28802[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:218)
28803[info]   at 
org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:69)
28804[info]   at 
org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
28805[info]   at 
org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
28806[info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:69)
28807[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:269)
28808[info]   at 
org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413) {code}






[jira] [Created] (SPARK-45012) CheckAnalysis should throw inlined plan in AnalysisException

2023-08-29 Thread Rui Wang (Jira)
Rui Wang created SPARK-45012:


 Summary: CheckAnalysis should throw inlined plan in 
AnalysisException
 Key: SPARK-45012
 URL: https://issues.apache.org/jira/browse/SPARK-45012
 Project: Spark
  Issue Type: Task
  Components: SQL
Affects Versions: 3.5.0
Reporter: Rui Wang
Assignee: Rui Wang









[jira] [Commented] (SPARK-44450) Make direct Arrow encoding work with SQL/API

2023-08-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/SPARK-44450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760186#comment-17760186
 ] 

Herman van Hövell commented on SPARK-44450:
---

It is not really part of the public API, and we don't write user-facing 
documentation for that. The use case is mainly that we can encode directly 
between user objects and Arrow batches. We needed this to get rid of the 
Catalyst dependency. You can check how it is integrated with the Spark Connect 
Scala client: look at SparkResult for deserialization, and at 
SparkSession.createDataset for serialization. Alternatively, you can look at 
the ArrowEncoder suite.

> Make direct Arrow encoding work with SQL/API
> 
>
> Key: SPARK-44450
> URL: https://issues.apache.org/jira/browse/SPARK-44450
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect
>Affects Versions: 3.4.1
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.5.0
>
>







[jira] [Resolved] (SPARK-45002) Avoid uncaught exception from state store maintenance task thread on error

2023-08-29 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved SPARK-45002.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42716
[https://github.com/apache/spark/pull/42716]

> Avoid uncaught exception from state store maintenance task thread on error
> --
>
> Key: SPARK-45002
> URL: https://issues.apache.org/jira/browse/SPARK-45002
> Project: Spark
>  Issue Type: Task
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: Anish Shrigondekar
>Assignee: Anish Shrigondekar
>Priority: Major
> Fix For: 4.0.0
>
>
> Avoid uncaught exception from state store maintenance task thread on error






[jira] [Assigned] (SPARK-45002) Avoid uncaught exception from state store maintenance task thread on error

2023-08-29 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim reassigned SPARK-45002:


Assignee: Anish Shrigondekar

> Avoid uncaught exception from state store maintenance task thread on error
> --
>
> Key: SPARK-45002
> URL: https://issues.apache.org/jira/browse/SPARK-45002
> Project: Spark
>  Issue Type: Task
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: Anish Shrigondekar
>Assignee: Anish Shrigondekar
>Priority: Major
>
> Avoid uncaught exception from state store maintenance task thread on error






[jira] [Commented] (SPARK-43299) JVM Client throw StreamingQueryException when error handling is implemented

2023-08-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/SPARK-43299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760170#comment-17760170
 ] 

Herman van Hövell commented on SPARK-43299:
---

[~heyihong] eyes on this please. Let's make sure this also works with your code.

> JVM Client throw StreamingQueryException when error handling is implemented
> ---
>
> Key: SPARK-43299
> URL: https://issues.apache.org/jira/browse/SPARK-43299
> Project: Spark
>  Issue Type: Task
>  Components: Connect, Structured Streaming
>Affects Versions: 3.5.0
>Reporter: Wei Liu
>Priority: Major
>
> Currently, the awaitTermination() method of the Connect JVM client's 
> StreamingQuery won't throw an error when there is an exception. 
>  
> In Python Connect this is handled directly by the Python client's 
> error-handling framework, but no such framework exists in the JVM client 
> yet.
>  
> We should verify that this works once the JVM client adds one.
>  
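
For context, the Python-side behavior described above looks roughly like the 
sketch below (illustrative; `query` stands for a running streaming query, and 
the import assumes PySpark >= 3.4, where the pyspark.errors module was 
introduced):

{code:python}
from pyspark.errors import StreamingQueryException

try:
    # Blocks until the query stops; the Python client's error-handling
    # framework surfaces a server-side failure as a Python exception.
    query.awaitTermination()
except StreamingQueryException as e:
    print(e)
{code}

The JVM client should raise the equivalent StreamingQueryException once its 
error-handling framework is in place.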






[jira] [Commented] (SPARK-44450) Make direct Arrow encoding work with SQL/API

2023-08-29 Thread H. Vetinari (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760163#comment-17760163
 ] 

H. Vetinari commented on SPARK-44450:
-

Thanks for the quick response! Are there any docs for using this? I've checked 
the [rc3 
docs|https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc3-docs/_site/api/scala/org/apache/spark/sql/]
 (not easily searchable, so I only checked the API docs for now), and there's 
no section/page for {{apache.spark.sql.connect.client}}.

> Make direct Arrow encoding work with SQL/API
> 
>
> Key: SPARK-44450
> URL: https://issues.apache.org/jira/browse/SPARK-44450
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect
>Affects Versions: 3.4.1
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.5.0
>
>







[jira] [Comment Edited] (SPARK-44450) Make direct Arrow encoding work with SQL/API

2023-08-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/SPARK-44450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760155#comment-17760155
 ] 

Herman van Hövell edited comment on SPARK-44450 at 8/29/23 10:22 PM:
-

[~h-vetinari] this was mostly about making sure we moved all needed classes to 
the sql/api module, and that was done about a month ago in SPARK-44532 
(https://github.com/apache/spark/pull/42156).

I assume you are mostly interested in the actual encoders. Those can be found 
here: 
https://github.com/apache/spark/tree/master/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow


was (Author: hvanhovell):
[~h-vetinari] this was mostly about making sure we moved all needed classes to 
the sql/api module, and that was done about a month ago in SPARK-44532 
(https://github.com/apache/spark/pull/42156).

I assume you are mostly interested in the actual encoders. Those can be found 
here:https://github.com/apache/spark/tree/master/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow

> Make direct Arrow encoding work with SQL/API
> 
>
> Key: SPARK-44450
> URL: https://issues.apache.org/jira/browse/SPARK-44450
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect
>Affects Versions: 3.4.1
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.5.0
>
>







[jira] [Commented] (SPARK-44450) Make direct Arrow encoding work with SQL/API

2023-08-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/SPARK-44450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760155#comment-17760155
 ] 

Herman van Hövell commented on SPARK-44450:
---

[~h-vetinari] this was mostly about making sure we moved all needed classes to 
the sql/api module, and that was done about a month ago in SPARK-44532 
(https://github.com/apache/spark/pull/42156).

I assume you are mostly interested in the actual encoders. Those can be found 
here:https://github.com/apache/spark/tree/master/connector/connect/common/src/main/scala/org/apache/spark/sql/connect/client/arrow

> Make direct Arrow encoding work with SQL/API
> 
>
> Key: SPARK-44450
> URL: https://issues.apache.org/jira/browse/SPARK-44450
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect
>Affects Versions: 3.4.1
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.5.0
>
>







[jira] [Created] (SPARK-45011) Refine docstring of `Column.between`

2023-08-29 Thread Allison Wang (Jira)
Allison Wang created SPARK-45011:


 Summary: Refine docstring of `Column.between`
 Key: SPARK-45011
 URL: https://issues.apache.org/jira/browse/SPARK-45011
 Project: Spark
  Issue Type: Sub-task
  Components: Documentation, PySpark
Affects Versions: 4.0.0
Reporter: Allison Wang


Refine the docstring of Column.between
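
As a point of reference for the refinement (an illustrative sketch; `spark` is 
assumed to be an active SparkSession, and the data is made up), Column.between 
tests whether a value lies within a closed interval, inclusive on both ends:

{code:python}
import pyspark.sql.functions as sf

df = spark.createDataFrame([(1,), (3,), (5,)], ["v"])
# between(lowerBound, upperBound) includes both boundary values.
df.select(sf.col("v").between(2, 4).alias("in_range")).collect()
# -> [Row(in_range=False), Row(in_range=True), Row(in_range=False)]
{code}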






[jira] [Created] (SPARK-45010) Limit GHA job execution time to up to 5 hours in build_and_test.yml

2023-08-29 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-45010:
-

 Summary: Limit GHA job execution time to up to 5 hours in 
build_and_test.yml
 Key: SPARK-45010
 URL: https://issues.apache.org/jira/browse/SPARK-45010
 Project: Spark
  Issue Type: Task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun









[jira] [Created] (SPARK-45009) Correlated EXISTS subqueries in join ON condition unsupported and fail with internal error

2023-08-29 Thread Jack Chen (Jira)
Jack Chen created SPARK-45009:
-

 Summary: Correlated EXISTS subqueries in join ON condition 
unsupported and fail with internal error
 Key: SPARK-45009
 URL: https://issues.apache.org/jira/browse/SPARK-45009
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: Jack Chen


They are not handled in decorrelation, and we also don't have any checks to 
block them, so these queries leave outer references in the query plan, leading 
to internal errors:
{code:java}
CREATE TEMP VIEW x(x1, x2) AS VALUES (0, 1), (1, 2);
CREATE TEMP VIEW y(y1, y2) AS VALUES (0, 2), (0, 3);
CREATE TEMP VIEW z(z1, z2) AS VALUES (0, 2), (0, 3);
select * from x left join y on x1 = y1 and exists (select * from z where z1 = 
x1)

Error occurred during query planning: 
org.apache.spark.sql.catalyst.plans.logical.Filter cannot be cast to 
org.apache.spark.sql.execution.SparkPlan {code}
PullupCorrelatedPredicates#apply and RewritePredicateSubquery only handle 
subqueries in UnaryNode; they seem to assume that subqueries cannot occur 
elsewhere, such as in a join ON condition.

We will need to decide whether to block them properly in analysis (i.e. give a 
proper error for them), or see if we can add support for them.

Also note that scalar subqueries in the ON condition are unsupported too, but 
they return a proper error.






[jira] [Assigned] (SPARK-43438) Fix mismatched column list error on INSERT

2023-08-29 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-43438:


Assignee: Max Gekk

> Fix mismatched column list error on INSERT
> --
>
> Key: SPARK-43438
> URL: https://issues.apache.org/jira/browse/SPARK-43438
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Serge Rielau
>Assignee: Max Gekk
>Priority: Major
>
> This error message is pretty bad, and common
> "_LEGACY_ERROR_TEMP_1038" : {
> "message" : [
> "Cannot write to table due to mismatched user specified column 
> size() and data column size()."
> ]
> },
> It can perhaps be merged with this one - after giving it an ERROR_CLASS
> "_LEGACY_ERROR_TEMP_1168" : {
> "message" : [
> " requires that the data to be inserted have the same number of 
> columns as the target table: target table has  column(s) but 
> the inserted data has  column(s), including  
> partition column(s) having constant value(s)."
> ]
> },
> Repro:
> CREATE TABLE tabtest(c1 INT, c2 INT);
> INSERT INTO tabtest SELECT 1;
> `spark_catalog`.`default`.`tabtest` requires that the data to be inserted 
> have the same number of columns as the target table: target table has 2 
> column(s) but the inserted data has 1 column(s), including 0 partition 
> column(s) having constant value(s).
> INSERT INTO tabtest(c1) SELECT 1, 2, 3;
> Cannot write to table due to mismatched user specified column size(1) and 
> data column size(3).; line 1 pos 24
>  






[jira] [Resolved] (SPARK-43438) Fix mismatched column list error on INSERT

2023-08-29 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-43438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-43438.
--
Fix Version/s: 3.5.0
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 42393
[https://github.com/apache/spark/pull/42393]

> Fix mismatched column list error on INSERT
> --
>
> Key: SPARK-43438
> URL: https://issues.apache.org/jira/browse/SPARK-43438
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.4.0
>Reporter: Serge Rielau
>Assignee: Max Gekk
>Priority: Major
> Fix For: 3.5.0, 4.0.0
>
>
> This error message is pretty bad, and common
> "_LEGACY_ERROR_TEMP_1038" : {
> "message" : [
> "Cannot write to table due to mismatched user specified column 
> size() and data column size()."
> ]
> },
> It can perhaps be merged with this one - after giving it an ERROR_CLASS
> "_LEGACY_ERROR_TEMP_1168" : {
> "message" : [
> " requires that the data to be inserted have the same number of 
> columns as the target table: target table has  column(s) but 
> the inserted data has  column(s), including  
> partition column(s) having constant value(s)."
> ]
> },
> Repro:
> CREATE TABLE tabtest(c1 INT, c2 INT);
> INSERT INTO tabtest SELECT 1;
> `spark_catalog`.`default`.`tabtest` requires that the data to be inserted 
> have the same number of columns as the target table: target table has 2 
> column(s) but the inserted data has 1 column(s), including 0 partition 
> column(s) having constant value(s).
> INSERT INTO tabtest(c1) SELECT 1, 2, 3;
> Cannot write to table due to mismatched user specified column size(1) and 
> data column size(3).; line 1 pos 24
>  






[jira] [Resolved] (SPARK-34412) RemoveNoopOperators can remove non-trivial projects

2023-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SPARK-34412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell resolved SPARK-34412.
---
Resolution: Fixed

> RemoveNoopOperators can remove non-trivial projects
> ---
>
> Key: SPARK-34412
> URL: https://issues.apache.org/jira/browse/SPARK-34412
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.1
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
>
> RemoveNoopOperators can remove non-trivial projects. This can happen when the 
> top project has a non-trivial expression that reuses an expression ID of the 
> child project.






[jira] [Assigned] (SPARK-26131) Remove sqlContext.conf from Spark SQL physical operators

2023-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SPARK-26131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell reassigned SPARK-26131:
-

Assignee: (was: Herman van Hövell)

> Remove sqlContext.conf from Spark SQL physical operators
> 
>
> Key: SPARK-26131
> URL: https://issues.apache.org/jira/browse/SPARK-26131
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Herman van Hövell
>Priority: Major
>







[jira] [Resolved] (SPARK-42512) Scala Client: Add 3rd batch of functions

2023-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SPARK-42512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell resolved SPARK-42512.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

> Scala Client: Add 3rd batch of functions
> 
>
> Key: SPARK-42512
> URL: https://issues.apache.org/jira/browse/SPARK-42512
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Resolved] (SPARK-44450) Make direct Arrow encoding work with SQL/API

2023-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SPARK-44450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell resolved SPARK-44450.
---
Fix Version/s: 3.5.0
   Resolution: Fixed

> Make direct Arrow encoding work with SQL/API
> 
>
> Key: SPARK-44450
> URL: https://issues.apache.org/jira/browse/SPARK-44450
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect
>Affects Versions: 3.4.1
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.5.0
>
>







[jira] [Resolved] (SPARK-44344) Prepare RowEncoder for the move to sql/api

2023-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SPARK-44344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell resolved SPARK-44344.
---
Fix Version/s: 3.5.0
   Resolution: Fixed

> Prepare RowEncoder for the move to sql/api
> --
>
> Key: SPARK-44344
> URL: https://issues.apache.org/jira/browse/SPARK-44344
> Project: Spark
>  Issue Type: New Feature
>  Components: Connect, SQL
>Affects Versions: 3.4.1
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.5.0
>
>







[jira] [Resolved] (SPARK-44576) Session Artifact update breaks XXWithState methods in KVGDS

2023-08-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SPARK-44576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell resolved SPARK-44576.
---
Fix Version/s: 3.5.0
   Resolution: Fixed

> Session Artifact update breaks XXWithState methods in KVGDS
> ---
>
> Key: SPARK-44576
> URL: https://issues.apache.org/jira/browse/SPARK-44576
> Project: Spark
>  Issue Type: Bug
>  Components: Connect
>Affects Versions: 3.5.0
>Reporter: Zhen Li
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.5.0
>
>
> When changing the client test jar from the system classloader to the session 
> classloader 
> (https://github.com/apache/spark/compare/master...zhenlineo:spark:streaming-artifacts?expand=1),
>  all XXWithState test suites failed with class loader errors, e.g.:
> ```
> 23/07/25 16:13:14 WARN TaskSetManager: Lost task 1.0 in stage 2.0 (TID 16) 
> (10.8.132.125 executor driver): TaskKilled (Stage cancelled: Job aborted due 
> to stage failure: Task 170 in stage 2.0 failed 1 times, most recent failure: 
> Lost task 170.0 in stage 2.0 (TID 14) (10.8.132.125 executor driver): 
> java.lang.ClassCastException: class org.apache.spark.sql.streaming.ClickState 
> cannot be cast to class org.apache.spark.sql.streaming.ClickState 
> (org.apache.spark.sql.streaming.ClickState is in unnamed module of loader 
> org.apache.spark.util.MutableURLClassLoader @2c604965; 
> org.apache.spark.sql.streaming.ClickState is in unnamed module of loader 
> java.net.URLClassLoader @57751f4)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage4.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
>   at 
> org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$1(WriteToDataSourceV2Exec.scala:441)
>   at 
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1514)
>   at 
> org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run(WriteToDataSourceV2Exec.scala:486)
>   at 
> org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run$(WriteToDataSourceV2Exec.scala:425)
>   at 
> org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:491)
>   at 
> org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:388)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
>   at 
> org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
>   at org.apache.spark.scheduler.Task.run(Task.scala:141)
>   at 
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:592)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1480)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:595)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> Driver stacktrace:)
> 23/07/25 16:13:14 ERROR Utils: Aborting task
> java.lang.IllegalStateException: Error committing version 1 into 
> HDFSStateStore[id=(op=0,part=5),dir=file:/private/var/folders/b0/f9jmmrrx5js7xsswxyf58nwrgp/T/temporary-02cca002-e189-4e32-afd8-964d6f8d5056/state/0/5]
>   at 
> org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$HDFSBackedStateStore.commit(HDFSBackedStateStoreProvider.scala:148)
>   at 
> org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$processDataWithPartition$4(FlatMapGroupsWithStateExec.scala:183)
>   at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>   at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:611)
>   at 
> org.apache.spark.sql.execution.streaming.StateStoreWriter.timeTakenMs(statefulOperators.scala:179)
>   at 
> org.apache.spark.sql.execution.streaming.StateStoreWriter.timeTakenMs$(statefulOperators.scala:179)
>   at 
> org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExec.timeTakenMs(FlatMapGroupsWithStateExec.scala:374)
>   at 
> org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$processDataWithPartition$3(FlatMapGroupsWithStateExec.scala:183)
>   at 
> org.apache.spark.util.CompletionIterator$$anon$1.completion(CompletionIterator.scala:47)
>   at 
> org.apa

[jira] [Commented] (SPARK-44576) Session Artifact update breaks XXWithState methods in KVGDS

2023-08-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/SPARK-44576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760062#comment-17760062
 ] 

Herman van Hövell commented on SPARK-44576:
---

This has been fixed by making the SBT build more hermetic (and fixing the class 
sync).

> Session Artifact update breaks XXWithState methods in KVGDS
> ---
>
> Key: SPARK-44576
> URL: https://issues.apache.org/jira/browse/SPARK-44576
> Project: Spark
>  Issue Type: Bug
>  Components: Connect
>Affects Versions: 3.5.0
>Reporter: Zhen Li
>Assignee: Herman van Hövell
>Priority: Major
>
> When changing the client test jar from the system classloader to the session 
> classloader 
> (https://github.com/apache/spark/compare/master...zhenlineo:spark:streaming-artifacts?expand=1),
>  all XXWithState test suites failed with class loader errors, e.g.:
> ```
> 23/07/25 16:13:14 WARN TaskSetManager: Lost task 1.0 in stage 2.0 (TID 16) 
> (10.8.132.125 executor driver): TaskKilled (Stage cancelled: Job aborted due 
> to stage failure: Task 170 in stage 2.0 failed 1 times, most recent failure: 
> Lost task 170.0 in stage 2.0 (TID 14) (10.8.132.125 executor driver): 
> java.lang.ClassCastException: class org.apache.spark.sql.streaming.ClickState 
> cannot be cast to class org.apache.spark.sql.streaming.ClickState 
> (org.apache.spark.sql.streaming.ClickState is in unnamed module of loader 
> org.apache.spark.util.MutableURLClassLoader @2c604965; 
> org.apache.spark.sql.streaming.ClickState is in unnamed module of loader 
> java.net.URLClassLoader @57751f4)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage4.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
>   at 
> org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$1(WriteToDataSourceV2Exec.scala:441)
>   at 
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1514)
>   at 
> org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run(WriteToDataSourceV2Exec.scala:486)
>   at 
> org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run$(WriteToDataSourceV2Exec.scala:425)
>   at 
> org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:491)
>   at 
> org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:388)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
>   at 
> org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
>   at org.apache.spark.scheduler.Task.run(Task.scala:141)
>   at 
> org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:592)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1480)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:595)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:829)
> Driver stacktrace:)
> 23/07/25 16:13:14 ERROR Utils: Aborting task
> java.lang.IllegalStateException: Error committing version 1 into 
> HDFSStateStore[id=(op=0,part=5),dir=file:/private/var/folders/b0/f9jmmrrx5js7xsswxyf58nwrgp/T/temporary-02cca002-e189-4e32-afd8-964d6f8d5056/state/0/5]
>   at 
> org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$HDFSBackedStateStore.commit(HDFSBackedStateStoreProvider.scala:148)
>   at 
> org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$processDataWithPartition$4(FlatMapGroupsWithStateExec.scala:183)
>   at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>   at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:611)
>   at 
> org.apache.spark.sql.execution.streaming.StateStoreWriter.timeTakenMs(statefulOperators.scala:179)
>   at 
> org.apache.spark.sql.execution.streaming.StateStoreWriter.timeTakenMs$(statefulOperators.scala:179)
>   at 
> org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExec.timeTakenMs(FlatMapGroupsWithStateExec.scala:374)
>   at 
> org.apache.spark.sql.execution.streaming.FlatMapGroupsWithStateExecBase.$anonfun$processDataWithPartition$3(FlatMapGroupsWithStateExec.scala:183)
>   at 
> org.apache.spark.util.CompletionIterator$$anon$

[jira] [Created] (SPARK-45008) Improve branch suggestion for backporting

2023-08-29 Thread Kent Yao (Jira)
Kent Yao created SPARK-45008:


 Summary: Improve branch suggestion for backporting
 Key: SPARK-45008
 URL: https://issues.apache.org/jira/browse/SPARK-45008
 Project: Spark
  Issue Type: Improvement
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Kent Yao









[jira] [Resolved] (SPARK-45003) Refine docstring of `asc/desc`

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-45003.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42717
[https://github.com/apache/spark/pull/42717]

> Refine docstring of `asc/desc`
> --
>
> Key: SPARK-45003
> URL: https://issues.apache.org/jira/browse/SPARK-45003
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-45003) Refine docstring of `asc/desc`

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-45003:
-

Assignee: Yang Jie

> Refine docstring of `asc/desc`
> --
>
> Key: SPARK-45003
> URL: https://issues.apache.org/jira/browse/SPARK-45003
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation, PySpark
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
>







[jira] [Assigned] (SPARK-44999) Refactor ExternalSorter to reduce checks on shouldPartition when calling getPartition

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-44999:
-

Assignee: Yang Jie

> Refactor ExternalSorter to reduce checks on shouldPartition when calling 
> getPartition
> -
>
> Key: SPARK-44999
> URL: https://issues.apache.org/jira/browse/SPARK-44999
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
>
> {code:java}
>   private def getPartition(key: K): Int = {
>     if (shouldPartition) partitioner.get.getPartition(key) else 0
>   } {code}
>  
> The {{getPartition}} method checks {{shouldPartition}} every time it is 
> called. However, {{shouldPartition}} cannot change after the 
> {{ExternalSorter}} is instantiated, so the method can be refactored to avoid 
> the repeated checks on {{shouldPartition}}.
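
The same idea outside Spark (an illustrative Python sketch, not the actual 
Scala change): resolve the branch once at construction time so the hot path 
makes a single call:

{code:python}
class Sorter:
    def __init__(self, partitioner=None):
        # Whether to partition is fixed at construction, so choose the
        # partition function once here instead of re-checking the flag
        # on every get_partition call.
        if partitioner is not None:
            self._get_partition = partitioner.get_partition
        else:
            self._get_partition = lambda key: 0

    def get_partition(self, key):
        return self._get_partition(key)
{code}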






[jira] [Resolved] (SPARK-44999) Refactor ExternalSorter to reduce checks on shouldPartition when calling getPartition

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-44999.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42713
[https://github.com/apache/spark/pull/42713]

> Refactor ExternalSorter to reduce checks on shouldPartition when calling 
> getPartition
> -
>
> Key: SPARK-44999
> URL: https://issues.apache.org/jira/browse/SPARK-44999
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
> Fix For: 4.0.0
>
>
> {code:java}
>   private def getPartition(key: K): Int = {
>     if (shouldPartition) partitioner.get.getPartition(key) else 0
>   } {code}
>  
> The {{getPartition}} method checks {{shouldPartition}} every time it is 
> called. However, {{shouldPartition}} cannot change after the 
> {{ExternalSorter}} is instantiated, so the method can be refactored to avoid 
> the repeated checks on {{shouldPartition}}.






[jira] [Resolved] (SPARK-45007) fix merged pull requests resolution

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-45007.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42722
[https://github.com/apache/spark/pull/42722]

> fix merged pull requests resolution
> ---
>
> Key: SPARK-45007
> URL: https://issues.apache.org/jira/browse/SPARK-45007
> Project: Spark
>  Issue Type: Bug
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-45007) fix merged pull requests resolution

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-45007:
-

Assignee: Kent Yao

> fix merged pull requests resolution
> ---
>
> Key: SPARK-45007
> URL: https://issues.apache.org/jira/browse/SPARK-45007
> Project: Spark
>  Issue Type: Bug
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45006) Use the same date format of other UI date elements for the x-axis of timelines

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-45006:
--
Issue Type: Improvement  (was: Bug)

> Use the same date format of other UI date elements  for the x-axis of 
> timelines
> ---
>
> Key: SPARK-45006
> URL: https://issues.apache.org/jira/browse/SPARK-45006
> Project: Spark
>  Issue Type: Improvement
>  Components: Web UI
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45006) Use the same date format of other UI date elements for the x-axis of timelines

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-45006.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42720
[https://github.com/apache/spark/pull/42720]

> Use the same date format of other UI date elements  for the x-axis of 
> timelines
> ---
>
> Key: SPARK-45006
> URL: https://issues.apache.org/jira/browse/SPARK-45006
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45006) Use the same date format of other UI date elements for the x-axis of timelines

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-45006:
-

Assignee: Kent Yao

> Use the same date format of other UI date elements  for the x-axis of 
> timelines
> ---
>
> Key: SPARK-45006
> URL: https://issues.apache.org/jira/browse/SPARK-45006
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-44979) Cache results of simple udfs on executors if same arguments are passed.

2023-08-29 Thread Dinesh Dharme (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Dharme updated SPARK-44979:
--
Shepherd: Deepak Goyal

> Cache results of simple udfs on executors if same arguments are passed.
> ---
>
> Key: SPARK-44979
> URL: https://issues.apache.org/jira/browse/SPARK-44979
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.1
>Reporter: Dinesh Dharme
>Priority: Minor
>
> Consider two dataframes:
> {code:python}
> keyword_given = [
>     ["green pstr"],
>     ["greenpstr"],
>     ["wlmrt"],
>     ["walmart"],
>     ["walmart super"],
> ]
>
> variations = [
>     ("type green pstr", "ABC", 100),
>     ("type green pstr", "PQR", 200),
>     ("type green pstr", "NZSD", 2999),
>     ("wlmrt payment", "walmart", 200),
>     ("wlmrt solutions", "walmart", 200),
>     ("nppssdwlmrt", "walmart", 2000),
> ]
> {code}
> Imagine the task is fuzzy substring matching between keyword and 
> variation[0] using the built-in levenshtein function. This can already be 
> optimized in user code by extracting the unique values, fuzzy-matching only 
> those, and joining back with the original tables.
> But as a further optimization, Spark could cache the results of 
> already-computed UDF calls and do a lookup on each executor separately (a 
> rough sketch follows below).
> Just a thought; not sure if it makes sense. This behaviour could be behind 
> a config.
>  
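As a rough, hypothetical sketch of the idea (this is not an existing Spark feature, and all names below are illustrative): a deterministic UDF memoized per executor JVM through a singleton cache.

{code:java}
import scala.collection.concurrent.TrieMap
import org.apache.spark.sql.functions.udf

// One cache per executor JVM: the object is initialized independently
// on every executor, so lookups never cross the network.
object LevenshteinCache {
  private val cache = new TrieMap[(String, String), Int]()

  def distance(a: String, b: String): Int =
    cache.getOrElseUpdate((a, b), compute(a, b))

  // Plain dynamic-programming Levenshtein distance.
  private def compute(a: String, b: String): Int = {
    val dp = Array.tabulate(a.length + 1, b.length + 1) { (i, j) =>
      if (i == 0) j else if (j == 0) i else 0
    }
    for (i <- 1 to a.length; j <- 1 to b.length) {
      val cost = if (a(i - 1) == b(j - 1)) 0 else 1
      dp(i)(j) = math.min(
        math.min(dp(i - 1)(j) + 1, dp(i)(j - 1) + 1),
        dp(i - 1)(j - 1) + cost)
    }
    dp(a.length)(b.length)
  }
}

// Repeated (keyword, text) pairs hit the executor-local cache instead
// of recomputing the distance.
val cachedLevenshtein = udf((a: String, b: String) => LevenshteinCache.distance(a, b))
{code}

A production-grade version would also need to bound the cache size and apply only to deterministic UDFs, which is presumably why the reporter suggests putting it behind a config.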



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45001) Implement DataFrame.foreachPartition

2023-08-29 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-45001.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42715
[https://github.com/apache/spark/pull/42715]

> Implement DataFrame.foreachPartition
> 
>
> Key: SPARK-45001
> URL: https://issues.apache.org/jira/browse/SPARK-45001
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45001) Implement DataFrame.foreachPartition

2023-08-29 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-45001:


Assignee: Hyukjin Kwon

> Implement DataFrame.foreachPartition
> 
>
> Key: SPARK-45001
> URL: https://issues.apache.org/jira/browse/SPARK-45001
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-44967) Unit should be considered first before using Boolean for TreeNodeTag

2023-08-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-44967.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42683
[https://github.com/apache/spark/pull/42683]

> Unit should be considered first before using Boolean for TreeNodeTag
> 
>
> Key: SPARK-44967
> URL: https://issues.apache.org/jira/browse/SPARK-44967
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: jiaan.geng
>Assignee: jiaan.geng
>Priority: Major
> Fix For: 4.0.0
>
>
> Currently, a lot of TreeNodeTag[Boolean] tags are defined.
> In fact, we don't need the boolean value boxed into the TreeNodeTag; we just 
> use its presence as a flag.
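For illustration, a minimal before/after sketch against Catalyst's tag API, given some plan {{node}} (e.g. a LogicalPlan); the tag name here is made up:

{code:java}
import org.apache.spark.sql.catalyst.trees.TreeNodeTag

// Before: a Boolean is boxed and stored, but only its presence matters.
val plannedBool = TreeNodeTag[Boolean]("planned")
node.setTagValue(plannedBool, true)
val isPlanned = node.getTagValue(plannedBool).contains(true)

// After: a Unit tag carries no value; its mere presence is the flag.
val plannedUnit = TreeNodeTag[Unit]("planned")
node.setTagValue(plannedUnit, ())
val isFlagged = node.getTagValue(plannedUnit).isDefined
{code}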



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-44967) Unit should be considered first before using Boolean for TreeNodeTag

2023-08-29 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-44967:
---

Assignee: jiaan.geng

> Unit should be considered first before using Boolean for TreeNodeTag
> 
>
> Key: SPARK-44967
> URL: https://issues.apache.org/jira/browse/SPARK-44967
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: jiaan.geng
>Assignee: jiaan.geng
>Priority: Major
>
> Currently, a lot of TreeNodeTag[Boolean] tags are defined.
> In fact, we don't need the boolean value boxed into the TreeNodeTag; we just 
> use its presence as a flag.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45007) fix merged pull requests resolution

2023-08-29 Thread Kent Yao (Jira)
Kent Yao created SPARK-45007:


 Summary: fix merged pull requests resolution
 Key: SPARK-45007
 URL: https://issues.apache.org/jira/browse/SPARK-45007
 Project: Spark
  Issue Type: Bug
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Kent Yao






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42304) Assign name to _LEGACY_ERROR_TEMP_2189

2023-08-29 Thread Ignite TC Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759898#comment-17759898
 ] 

Ignite TC Bot commented on SPARK-42304:
---

User 'valentinp17' has created a pull request for this issue:
https://github.com/apache/spark/pull/42706

> Assign name to _LEGACY_ERROR_TEMP_2189
> --
>
> Key: SPARK-42304
> URL: https://issues.apache.org/jira/browse/SPARK-42304
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Haejoon Lee
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45006) Use the same date format of other UI date elements for the x-axis of timelines

2023-08-29 Thread Kent Yao (Jira)
Kent Yao created SPARK-45006:


 Summary: Use the same date format of other UI date elements  for 
the x-axis of timelines
 Key: SPARK-45006
 URL: https://issues.apache.org/jira/browse/SPARK-45006
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 4.0.0
Reporter: Kent Yao






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45005) Reducing the CI time for slow pyspark-pandas-connect tests

2023-08-29 Thread Haejoon Lee (Jira)
Haejoon Lee created SPARK-45005:
---

 Summary: Reducing the CI time for slow pyspark-pandas-connect tests
 Key: SPARK-45005
 URL: https://issues.apache.org/jira/browse/SPARK-45005
 Project: Spark
  Issue Type: Bug
  Components: Connect, Pandas API on Spark, Tests
Affects Versions: 4.0.0
Reporter: Haejoon Lee


The pyspark-pandas-connect tests take more than 3 hours in GitHub Actions, so we 
might need to reduce the execution time. See 
https://github.com/apache/spark/actions/runs/5989124806/job/16245001034



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-43203) Fix DROP table behavior in session catalog

2023-08-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-43203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759882#comment-17759882
 ] 

ASF GitHub Bot commented on SPARK-43203:


User 'Hisoka-X' has created a pull request for this issue:
https://github.com/apache/spark/pull/41765

> Fix DROP table behavior in session catalog
> --
>
> Key: SPARK-43203
> URL: https://issues.apache.org/jira/browse/SPARK-43203
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Anton Okolnychyi
>Assignee: Jia Fan
>Priority: Major
> Fix For: 3.5.0
>
>
> DROP table behavior is not working correctly in 3.4.0 because we always 
> invoke V1 drop logic if the identifier looks like a V1 identifier. This is a 
> big blocker for external data sources that provide custom session catalogs.
> See [here|https://github.com/apache/spark/pull/37879/files#r1170501180] for 
> details.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-44981) Filter out static configurations used in local mode

2023-08-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759883#comment-17759883
 ] 

ASF GitHub Bot commented on SPARK-44981:


User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/42718

> Filter out static configurations used in local mode
> ---
>
> Key: SPARK-44981
> URL: https://issues.apache.org/jira/browse/SPARK-44981
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Minor
> Fix For: 3.5.0, 4.0.0
>
>
> If you set a static configuration in `--remote local` mode, it logs a bunch 
> of errors like the one below:
> {code}
> 23/08/28 11:39:42 ERROR ErrorUtils: Spark Connect RPC error during: config. 
> UserId: hyukjin.kwon. SessionId: 424674ef-af95-4b12-b10e-86479413f9fd.
> org.apache.spark.sql.AnalysisException: Cannot modify the value of a static 
> config: spark.connect.copyFromLocalToFs.allowDestLocal.
>   at 
> org.apache.spark.sql.errors.QueryCompilationErrors$.cannotModifyValueOfStaticConfigError(QueryCompilationErrors.scala:3227)
>   at 
> org.apache.spark.sql.RuntimeConfig.requireNonStaticConf(RuntimeConfig.scala:162)
>   at org.apache.spark.sql.RuntimeConfig.set(RuntimeConfig.scala:42)
>   at 
> org.apache.spark.sql.connect.service.SparkConnectConfigHandler.$anonfun$handleSet$1(SparkConnectConfigHandler.scala:67)
>   at 
> org.apache.spark.sql.connect.service.SparkConnectConfigHandler.$anonfun$handleSet$1$adapted(SparkConnectConfigHandler.scala:65)
>   at scala.collection.Iterator.foreach(Iterator.scala:943)
>   at scala.collection.Iterator.foreach$(Iterator.scala:943)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>   at 
> org.apache.spark.sql.connect.service.SparkConnectConfigHandler.handleSet(SparkConnectConfigHandler.scala:65)
>   at 
> org.apache.spark.sql.connect.service.SparkConnectConfigHandler.handle(SparkConnectConfigHandler.scala:40)
>   at 
> org.apache.spark.sql.connect.service.SparkConnectService.config(SparkConnectService.scala:120)
>   at 
> org.apache.spark.connect.proto.SparkConnectServiceGrpc$MethodHandlers.invoke(SparkConnectServiceGrpc.java:751)
>   at 
> org.sparkproject.connect.grpc.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
>   at 
> org.sparkproject.connect.grpc.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:346)
>   at 
> org.sparkproject.connect.grpc.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:860)
>   at 
> org.sparkproject.connect.grpc.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at 
> org.sparkproject.connect.grpc.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45000) Implement DataFrame.foreach

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-45000.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 42714
[https://github.com/apache/spark/pull/42714]

> Implement DataFrame.foreach
> ---
>
> Key: SPARK-45000
> URL: https://issues.apache.org/jira/browse/SPARK-45000
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-45000) Implement DataFrame.foreach

2023-08-29 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-45000:
-

Assignee: Hyukjin Kwon

> Implement DataFrame.foreach
> ---
>
> Key: SPARK-45000
> URL: https://issues.apache.org/jira/browse/SPARK-45000
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45004) Adding extension for Spark SQL authorization with Ranger-Hive policies

2023-08-29 Thread Hasan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasan updated SPARK-45004:
--
Description: 
Apache Ranger has no plugin for Spark SQL authorization, and due to this 
limitation some extra work and extensions are required to manage Spark 
authorization.

Spark HWC has some performance issues, and the Cloudera (CDP) solution is 
complex and requires an extra copy of the data (big data!).

Natively adding a plugin like the "Submarine Spark Security Plugin" to Spark 
would help provide a standard, high-performance solution.

Reference: 
[https://submarine.apache.org/zh-cn/docs/0.6.0/userDocs/submarine-security/spark-security/]

This plugin works fine with table/column/row-level and masking options.

  was:
Apache Ranger has no plugin for Spark SQL authorization, and due to this 
limitation some extra work and extensions are required to manage Spark 
authorization.

Spark HWC has some performance issues, and the Cloudera (CDP) solution is 
complex and requires an extra copy of the data (big data!).

Natively adding a plugin like the "Submarine Spark Security Plugin" to Spark 
would help provide a standard, high-performance solution.

Reference: 
[https://submarine.apache.org/zh-cn/docs/0.6.0/userDocs/submarine-security/spark-security/]

This plugin works fine with table-level, column-level, row-level, and masking 
options.


> Adding extension for Spark SQL authorization with Ranger-Hive policies
> --
>
> Key: SPARK-45004
> URL: https://issues.apache.org/jira/browse/SPARK-45004
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, SQL
>Affects Versions: 3.4.2
>Reporter: Hasan
>Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Apache Ranger has no plugin for Spark SQL authorization, and due to this 
> limitation some extra work and extensions are required to manage Spark 
> authorization.
>  
> Spark HWC has some performance issues, and the Cloudera (CDP) solution is 
> complex and requires an extra copy of the data (big data!).
>  
> Natively adding a plugin like the "Submarine Spark Security Plugin" to Spark 
> would help provide a standard, high-performance solution.
> Reference:  
> [https://submarine.apache.org/zh-cn/docs/0.6.0/userDocs/submarine-security/spark-security/]
>  
> This plugin works fine with table/column/row-level and masking options.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45004) Adding extension for Spark SQL authorization with Ranger-Hive policies

2023-08-29 Thread Hasan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasan updated SPARK-45004:
--
Description: 
Apache Ranger has no plugin for Spark SQL authorization, and due to this 
limitation some extra work and extensions are required to manage Spark 
authorization.

Spark HWC has some performance issues, and the Cloudera (CDP) solution is 
complex and requires an extra copy of the data (big data!).

Natively adding a plugin like the "Submarine Spark Security Plugin" to Spark 
would help provide a standard, high-performance solution.

Reference: 
[https://submarine.apache.org/zh-cn/docs/0.6.0/userDocs/submarine-security/spark-security/]

This plugin works fine with table-level, column-level, row-level, and masking 
options.

  was:
Apache Ranger has no plugin for Spark SQL authorization, and due to this 
limitation some extra work and extensions are required to manage Spark 
authorization.

Spark HWC has some performance issues, and the Cloudera (CDP) solution is 
complex and requires an extra copy of the data (big data!).

Natively adding a plugin like the "Submarine Spark Security Plugin" to Spark 
would help provide a standard, high-performance solution.

Reference: 
[https://submarine.apache.org/zh-cn/docs/0.6.0/userDocs/submarine-security/spark-security/]
 


> Adding extension for Spark SQL authorization with Ranger-Hive policies
> --
>
> Key: SPARK-45004
> URL: https://issues.apache.org/jira/browse/SPARK-45004
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, SQL
>Affects Versions: 3.4.2
>Reporter: Hasan
>Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Apache Ranger has no plugin for Spark SQL authorization, and due to this 
> limitation some extra work and extensions are required to manage Spark 
> authorization.
>  
> Spark HWC has some performance issues, and the Cloudera (CDP) solution is 
> complex and requires an extra copy of the data (big data!).
>  
> Natively adding a plugin like the "Submarine Spark Security Plugin" to Spark 
> would help provide a standard, high-performance solution.
> Reference:  
> [https://submarine.apache.org/zh-cn/docs/0.6.0/userDocs/submarine-security/spark-security/]
>  
> This plugin works fine with table-level, column-level, row-level, and masking 
> options.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45004) Adding extension for Spark SQL authorization with Ranger-Hive policies

2023-08-29 Thread Hasan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hasan updated SPARK-45004:
--
Summary: Adding extension for Spark SQL authorization with Ranger-Hive 
policies  (was: adding extension for Spark SQL authorization with Ranger-Hive 
policies)

> Adding extension for Spark SQL authorization with Ranger-Hive policies
> --
>
> Key: SPARK-45004
> URL: https://issues.apache.org/jira/browse/SPARK-45004
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, SQL
>Affects Versions: 3.4.2
>Reporter: Hasan
>Priority: Major
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Apache Ranger has no plugin for Spark SQL authorization, and due to this 
> limitation some extra work and extensions are required to manage Spark 
> authorization.
>  
> Spark HWC has some performance issues, and the Cloudera (CDP) solution is 
> complex and requires an extra copy of the data (big data!).
>  
> Natively adding a plugin like the "Submarine Spark Security Plugin" to Spark 
> would help provide a standard, high-performance solution.
> Reference:  
> [https://submarine.apache.org/zh-cn/docs/0.6.0/userDocs/submarine-security/spark-security/]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-45004) adding extension for Spark SQL authorization with Ranger-Hive policies

2023-08-29 Thread Hasan (Jira)
Hasan created SPARK-45004:
-

 Summary: adding extension for Spark SQL authorization with 
Ranger-Hive policies
 Key: SPARK-45004
 URL: https://issues.apache.org/jira/browse/SPARK-45004
 Project: Spark
  Issue Type: New Feature
  Components: Spark Core, SQL
Affects Versions: 3.4.2
Reporter: Hasan


Apache Ranger has no plugin for Spark SQL authorization, and due to this 
limitation some extra work and extensions are required to manage Spark 
authorization.

Spark HWC has some performance issues, and the Cloudera (CDP) solution is 
complex and requires an extra copy of the data (big data!).

Natively adding a plugin like the "Submarine Spark Security Plugin" to Spark 
would help provide a standard, high-performance solution.

Reference: 
[https://submarine.apache.org/zh-cn/docs/0.6.0/userDocs/submarine-security/spark-security/]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-44902) The precision of LongDecimal is inconsistent with Hive.

2023-08-29 Thread Zhen Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17759854#comment-17759854
 ] 

Zhen Wang commented on SPARK-44902:
---

I tried changing it to 19 but some test cases failed.

https://github.com/wForget/spark/actions/runs/5934710599

> The precision of LongDecimal is inconsistent with Hive.
> ---
>
> Key: SPARK-44902
> URL: https://issues.apache.org/jira/browse/SPARK-44902
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Zhen Wang
>Priority: Major
>
> The precision of LongDecimal in Hive is 19 but it is 20 in Spark. This leads 
> to type conversion errors in some cases.
>  
> Relevant code:
> [https://github.com/apache/spark/blob/4646991abd7f4a47a1b8712e2017a2fae98f7c5a/sql/api/src/main/scala/org/apache/spark/sql/types/DecimalType.scala#L129]
> [https://github.com/apache/hive/blob/3d3acc7a19399d749a39818573a76a0dbbaf2598/serde/src/java/org/apache/hadoop/hive/serde2/typeinfo/HiveDecimalUtils.java#L76]
>  
> Reproduce:
> create table and view in hive:
> {code:java}
> create table t (value bigint);
> create view v as select value * 0.1 from t; {code}
> read in spark:
> {code:java}
> select * from v; {code}
> error occurred:
> {code:java}
> org.apache.spark.sql.AnalysisException: [CANNOT_UP_CAST_DATATYPE] Cannot up 
> cast `(value * 0.1)` from "DECIMAL(22,1)" to "DECIMAL(21,1)".The type path of 
> the target object is:
> You can either add an explicit cast to the input data or choose a higher 
> precision type of the field in the target object at 
> org.apache.spark.sql.errors.QueryCompilationErrors$.upCastFailureError(QueryCompilationErrors.scala:285)
>  at 
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveUpCast$$fail(Analyzer.scala:3627)
>   at 
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast$$anonfun$apply$57$$anonfun$applyOrElse$235.applyOrElse(Analyzer.scala:3658)
> at 
> org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveUpCast$$anonfun$apply$57$$anonfun$applyOrElse$235.applyOrElse(Analyzer.scala:3635)
>  {code}
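For context on the numbers above: with the usual decimal multiplication rule (result precision = p1 + p2 + 1, result scale = s1 + s2), Hive types the view column as DECIMAL(19,0) * DECIMAL(1,1) = DECIMAL(21,1), while Spark, which maps bigint to DECIMAL(20,0), derives DECIMAL(22,1); the up-cast from DECIMAL(22,1) to DECIMAL(21,1) then fails as shown.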



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45002) Avoid uncaught exception from state store maintenance task thread on error

2023-08-29 Thread Anish Shrigondekar (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anish Shrigondekar updated SPARK-45002:
---
Affects Version/s: 4.0.0
   (was: 3.5.1)

> Avoid uncaught exception from state store maintenance task thread on error
> --
>
> Key: SPARK-45002
> URL: https://issues.apache.org/jira/browse/SPARK-45002
> Project: Spark
>  Issue Type: Task
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: Anish Shrigondekar
>Priority: Major
>
> Avoid uncaught exception from state store maintenance task thread on error



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org