[jira] [Assigned] (SPARK-52540) Support the time type by make_timestamp_ntz()

2025-06-23 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-52540:


Assignee: Max Gekk

> Support the time type by make_timestamp_ntz()
> -
>
> Key: SPARK-52540
> URL: https://issues.apache.org/jira/browse/SPARK-52540
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Max Gekk
>Assignee: Max Gekk
>Priority: Major
>
> Modify the make_timestamp_ntz() function to create a timestamp from date and 
> time.
> h4. Syntax
> {code:sql}
> make_timestamp_ntz(date [, time])
> {code}
> h4. Arguments
> # date: A date expression
> # time: A time expression
> h4. Returns
> A TIMESTAMP_NTZ (a timestamp without time zone).
> h4. Examples
> {code:sql}
> > SELECT make_timestamp_ntz(DATE'2014-12-28', TIME'6:30:45.887');
>  2014-12-28 06:30:45.887
> {code}
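> A possible additional example (not in the ticket), assuming the optional 
> time argument defaults to midnight when omitted:
> {code:sql}
> > SELECT make_timestamp_ntz(DATE'2014-12-28');
>  2014-12-28 00:00:00
> {code}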






[jira] [Created] (SPARK-52549) Disable Recursive CTE self-references from window functions

2025-06-23 Thread Pavle Martinović (Jira)
Pavle Martinović created SPARK-52549:


 Summary: Disable Recursive CTE self-references from window 
functions
 Key: SPARK-52549
 URL: https://issues.apache.org/jira/browse/SPARK-52549
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.1.0
Reporter: Pavle Martinović









[jira] [Created] (SPARK-52547) Build dry runs against master branch

2025-06-23 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-52547:


 Summary: Build dry runs against master branch
 Key: SPARK-52547
 URL: https://issues.apache.org/jira/browse/SPARK-52547
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Hyukjin Kwon









[jira] [Created] (SPARK-52548) Add a test case for when shuffle manager is overridden by a SparkPlugin

2025-06-23 Thread Hongze Zhang (Jira)
Hongze Zhang created SPARK-52548:


 Summary: Add a test case for when shuffle manager is overridden by 
a SparkPlugin
 Key: SPARK-52548
 URL: https://issues.apache.org/jira/browse/SPARK-52548
 Project: Spark
  Issue Type: Improvement
  Components: SQL, Tests
Affects Versions: 4.0.0
Reporter: Hongze Zhang


The PR [https://github.com/apache/spark/pull/43627] for SPARK-45762 introduced 
a change that allows the shuffle manager specified in the Spark configuration 
to be overridden by a SparkPlugin; however, this change was not covered by 
tests. I suggest adding a test case for it.






[jira] [Resolved] (SPARK-52528) Enable divide-by-zero for numeric mod with ANSI enabled

2025-06-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-52528.
--
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51219
[https://github.com/apache/spark/pull/51219]

>  Enable divide-by-zero for numeric mod with ANSI enabled
> 
>
> Key: SPARK-52528
> URL: https://issues.apache.org/jira/browse/SPARK-52528
> Project: Spark
>  Issue Type: Sub-task
>  Components: PS
>Affects Versions: 4.1.0
>Reporter: Xinrong Meng
>Assignee: Xinrong Meng
>Priority: Major
> Fix For: 4.1.0
>
>
>  Enable divide-by-zero for numeric mod with ANSI enabled
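> For illustration (not part of the ticket): under ANSI mode, the remainder 
> operator raises an error on a zero divisor instead of returning NULL, which 
> is the behavior these tests exercise:
> {code:sql}
> SET spark.sql.ansi.enabled=true;
> SELECT 7 % 0;  -- fails with a DIVIDE_BY_ZERO error; try_mod(7, 0) returns NULL
> {code}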






[jira] [Commented] (SPARK-52408) SPIP: Upgrade Apache Hive to 4.x

2025-06-23 Thread Kousuke Saruta (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-52408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985476#comment-17985476
 ] 

Kousuke Saruta commented on SPARK-52408:


[~dongjoon], [~srowen], I don't think we need an SPIP for this kind of change, 
but what do you think?
In the past, we upgraded Scala to 2.13, which involved lots of tasks including 
user-facing ones, but it was done without an SPIP.
https://issues.apache.org/jira/browse/SPARK-25075
https://issues.apache.org/jira/browse/SPARK-39786

[~sunchao] I'd like to hear from you if you have any concerns about this 
upgrade, because you are very familiar with Hive.

> SPIP: Upgrade Apache Hive to 4.x
> 
>
> Key: SPARK-52408
> URL: https://issues.apache.org/jira/browse/SPARK-52408
> Project: Spark
>  Issue Type: Umbrella
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Vlad Rozov
>Priority: Major
>  Labels: SPIP
>
> The 
> [SPIP|https://docs.google.com/document/d/1ejaGpuBvwBz2cD3Xj-QysShauBrdgYSh5yTxfAGvS1c/edit?usp=sharing]
>  proposes upgrading the Apache Hive version used in Apache Spark builds from 
> 2.3.10 to *version 4.x* (either 4.0.1 or the upcoming 4.1.0). It also 
> proposes discontinuing support for Apache Hive 2.x and 3.x, as these versions 
> are no longer maintained by the Apache Hive community and have reached 
> end-of-life (EOL).
> The *key objectives* of this proposal are to:
>  # *Maintain all existing functionality* currently supported in Apache Hive 
> 2.x that *is compatible* with Apache Hive 4.x
>  # Ensure *no functional or performance regressions* occur
>  # Provide *the best upgrade path* for current Apache Spark users, minimizing 
> prerequisites and manual steps for those using Hive 2.x or 3.x
> SPIP 
> [doc|https://docs.google.com/document/d/1ejaGpuBvwBz2cD3Xj-QysShauBrdgYSh5yTxfAGvS1c/edit?usp=sharing]






[jira] [Resolved] (SPARK-52462) Enforce type coercion before children output deduplication in Union

2025-06-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-52462.
-
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51172
[https://github.com/apache/spark/pull/51172]

> Enforce type coercion before children output deduplication in Union
> ---
>
> Key: SPARK-52462
> URL: https://issues.apache.org/jira/browse/SPARK-52462
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Mihailo Aleksic
>Assignee: Mihailo Aleksic
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> Right now, the following query produces plans that are not consistent across 
> different underlying table providers. Query:
> SELECT col1, col2, col3, NULLIF('','') AS col4
> FROM table
> UNION ALL
> SELECT col2, col2, null AS col3, col4
> FROM table;
> This happens because of rule ordering:
>  - Sometimes: ... -> WidenSetOperationTypes -> ... -> ResolveReferences 
> (deduplication of Union children outputs) -> ...
>  - Sometimes: ... -> ResolveReferences (deduplication of Union children 
> outputs) -> ... -> WidenSetOperationTypes -> ...
> In this issue I propose that we align those two by enforcing type coercion to 
> happen before deduplication.
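> As an illustration (not from the ticket), enforcing coercion first makes the 
> plan equivalent to widening the branches explicitly before the union, e.g. 
> assuming STRING is the common type for col3:
> {code:sql}
> SELECT col1, col2, CAST(col3 AS STRING) AS col3, NULLIF('','') AS col4
> FROM table
> UNION ALL
> SELECT col2, col2, CAST(NULL AS STRING) AS col3, col4
> FROM table;
> {code}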






[jira] [Assigned] (SPARK-52462) Enforce type coercion before children output deduplication in Union

2025-06-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-52462:
---

Assignee: Mihailo Aleksic

> Enforce type coercion before children output deduplication in Union
> ---
>
> Key: SPARK-52462
> URL: https://issues.apache.org/jira/browse/SPARK-52462
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Mihailo Aleksic
>Assignee: Mihailo Aleksic
>Priority: Major
>  Labels: pull-request-available
>
> Right now, the following query produces plans that are not consistent across 
> different underlying table providers. Query:
> SELECT col1, col2, col3, NULLIF('','') AS col4
> FROM table
> UNION ALL
> SELECT col2, col2, null AS col3, col4
> FROM table;
> This happens because of rule ordering:
>  - Sometimes: ... -> WidenSetOperationTypes -> ... -> ResolveReferences 
> (deduplication of Union children outputs) -> ...
>  - Sometimes: ... -> ResolveReferences (deduplication of Union children 
> outputs) -> ... -> WidenSetOperationTypes -> ...
> In this issue I propose that we align those two by enforcing type coercion to 
> happen before deduplication.






[jira] [Commented] (SPARK-52408) SPIP: Upgrade Apache Hive to 4.x

2025-06-23 Thread Sean R. Owen (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-52408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985640#comment-17985640
 ] 

Sean R. Owen commented on SPARK-52408:
--

Maybe, maybe not, but if there is already a document detailing some of the work 
and tradeoffs, that seems fine and not a problem. I don't see that it makes any 
particular difference from here.

> SPIP: Upgrade Apache Hive to 4.x
> 
>
> Key: SPARK-52408
> URL: https://issues.apache.org/jira/browse/SPARK-52408
> Project: Spark
>  Issue Type: Umbrella
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Vlad Rozov
>Priority: Major
>  Labels: SPIP
>
> The 
> [SPIP|https://docs.google.com/document/d/1ejaGpuBvwBz2cD3Xj-QysShauBrdgYSh5yTxfAGvS1c/edit?usp=sharing]
>  proposes upgrading the Apache Hive version used in Apache Spark builds from 
> 2.3.10 to *version 4.x* (either 4.0.1 or the upcoming 4.1.0). It also 
> proposes discontinuing support for Apache Hive 2.x and 3.x, as these versions 
> are no longer maintained by the Apache Hive community and have reached 
> end-of-life (EOL).
> The *key objectives* of this proposal are to:
>  # *Maintain all existing functionality* currently supported in Apache Hive 
> 2.x that *is compatible* with Apache Hive 4.x
>  # Ensure *no functional or performance regressions* occur
>  # Provide *the best upgrade path* for current Apache Spark users, minimizing 
> prerequisites and manual steps for those using Hive 2.x or 3.x
> SPIP 
> [doc|https://docs.google.com/document/d/1ejaGpuBvwBz2cD3Xj-QysShauBrdgYSh5yTxfAGvS1c/edit?usp=sharing]






[jira] [Updated] (SPARK-52548) Add a test case for when shuffle manager is overridden by a SparkPlugin

2025-06-23 Thread Hongze Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongze Zhang updated SPARK-52548:
-
Component/s: Spark Core
 (was: SQL)

> Add a test case for when shuffle manager is overridden by a SparkPlugin
> ---
>
> Key: SPARK-52548
> URL: https://issues.apache.org/jira/browse/SPARK-52548
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core, Tests
>Affects Versions: 4.0.0
>Reporter: Hongze Zhang
>Priority: Major
>
> The PR [https://github.com/apache/spark/pull/43627] for SPARK-45762 
> introduced a change that allows the shuffle manager specified in the Spark 
> configuration to be overridden by a SparkPlugin; however, this change was 
> not covered by tests. I suggest adding a test case for it.






[jira] [Created] (SPARK-52550) SparkSessionExtensions requires support for DSV2 based extensions which require the classic SparkSession

2025-06-23 Thread Jack (Jira)
Jack created SPARK-52550:


 Summary: SparkSessionExtensions requires support for DSV2 based 
extensions which require the classic SparkSession
 Key: SPARK-52550
 URL: https://issues.apache.org/jira/browse/SPARK-52550
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 4.0.0
Reporter: Jack


Extensions providing connector capabilities such as 
[https://github.com/apache/cassandra-spark-connector] register custom V2 
strategies as part of the extension. However, in order to implement the data 
source v2 strategy, we require a session type to be provided as 
classic.SparkSession [due to API changes in 
DSV2|https://github.com/apache/spark/commit/5db31aec33c53aaa7c814f33ec84e6ba66fc193b#diff-7aeb491d44e183c8c8cf86d90b57701dba009fc19983c2a5c09449c768b047ceR36].

It appears it is no longer possible to implement a custom strategy from an 
extension through this mechanism, since SparkSessionExtensions only provides a 
handle to the SparkSession class, which cannot be used for DSV2 strategy 
planners.

This item of work is to enable a user to provision extensions when they know 
the session will be scoped to classic, allowing registration of a DSV2 
strategy, e.g.:
{code:java}
import org.apache.spark.sql.classic.{SparkSession => ClassicSparkSession}

// ...

case class MyCustomStrategy(spark: ClassicSparkSession) extends Strategy with 
Serializable {
  // ...
  override def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
    // ...
    val dataSourceOptimizedPlan = new DataSourceV2Strategy(spark)...{code}

Where this is registered via:
{code:java}
class CoolSparkExtensions extends (SparkSessionExtensions => Unit) with Logging {
  override def apply(extensions: SparkSessionExtensions): Unit = {
    extensions.injectPlannerStrategy(MyCustomStrategy.apply)
    // ...
  }
}{code}

It is worth noting that the existing API is marked both Experimental and 
@Unstable, meaning the proposed changes could be considered if a better 
solution to this issue is not devised.

 






[jira] [Created] (SPARK-52551) Add a new v2 Predicate BOOLEAN_EXPRESSION

2025-06-23 Thread Wenchen Fan (Jira)
Wenchen Fan created SPARK-52551:
---

 Summary: Add a new v2 Predicate BOOLEAN_EXPRESSION
 Key: SPARK-52551
 URL: https://issues.apache.org/jira/browse/SPARK-52551
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 4.1.0
Reporter: Wenchen Fan









[jira] [Updated] (SPARK-52549) Disable Recursive CTE self-references from window functions and inside sorts

2025-06-23 Thread Pavle Martinović (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavle Martinović updated SPARK-52549:
-
Summary: Disable Recursive CTE self-references from window functions and 
inside sorts  (was: Disable Recursive CTE self-references from window functions)

> Disable Recursive CTE self-references from window functions and inside sorts
> 
>
> Key: SPARK-52549
> URL: https://issues.apache.org/jira/browse/SPARK-52549
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Pavle Martinović
>Priority: Major
>







[jira] [Commented] (SPARK-52538) Add new method to check if the value is fully extractable

2025-06-23 Thread Anastasia Filippova (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-52538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985253#comment-17985253
 ] 

Anastasia Filippova commented on SPARK-52538:
-

User 'vladimirg-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/51231

> Add new method to check if the value is fully extractable
> -
>
> Key: SPARK-52538
> URL: https://issues.apache.org/jira/browse/SPARK-52538
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Vladimir Golubev
>Priority: Major
>
> To be later used in the single-pass Analyzer.






[jira] [Resolved] (SPARK-52542) Use `/nonexistent` instead of nonexistent `/opt/spark`

2025-06-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-52542.
---
Fix Version/s: 4.0.1
 Assignee: Dongjoon Hyun
   Resolution: Fixed

This is resolved via https://github.com/apache/spark-docker/pull/87

> Use `/nonexistent` instead of nonexistent `/opt/spark`
> --
>
> Key: SPARK-52542
> URL: https://issues.apache.org/jira/browse/SPARK-52542
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.4, 3.4.4, 3.5.6, 4.0.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 4.0.1
>
>







[jira] [Commented] (SPARK-51717) Possible SST mismatch error for the second snapshot created for a new query

2025-06-23 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-51717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985730#comment-17985730
 ] 

Hudson commented on SPARK-51717:


User 'micheal-o' has created a pull request for this issue:
https://github.com/apache/spark/pull/51255

> Possible SST mismatch error for the second snapshot created for a new query
> ---
>
> Key: SPARK-51717
> URL: https://issues.apache.org/jira/browse/SPARK-51717
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 4.1.0, 4.0.0
>Reporter: B. Micheal Okutubo
>Assignee: B. Micheal Okutubo
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Fix this error: Sst file size mismatch ... MANIFEST-05 may be corrupted
> This is an edge case in SST file reuse that can only happen for the first 
> ever RocksDB checkpoint, if:
>  # The first ever RocksDB checkpoint (e.g. for version 10) was created with 
> x.sst, but not yet uploaded by maintenance
>  # The next batch using RocksDB at v10 fails and rolls back the store to -1 
> (invalidating RocksDB)
>  # A new request to load RocksDB at v10 comes in, but the v10 checkpoint is 
> still not uploaded, hence we have to start replaying the changelog from 
> checkpoint v0.
>  # We create a new v11 and a new checkpoint with a new x*.sst. v10 is now 
> uploaded by maintenance. Then, during the upload of x*.sst for v11, we reuse 
> the x.sst DFS file, thinking it is the same as x*.sst.
> The problem stems from step 3: the way the file manager loads v0 is 
> different from how it loads other versions. During the load of other 
> versions, when we delete an existing local file we also delete it from the 
> file mapping. But for v0, the file manager just deletes the local dir, and 
> we missed clearing the file mapping in this case. Hence the old x.sst was 
> still showing in the file mapping at step 4. We need to fix this and also 
> add an additional size check.
>  
> This only happens when changelog checkpointing is used.






[jira] [Commented] (SPARK-52515) Approx_top_k SQL function based on Apache DataSketches

2025-06-23 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-52515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17985731#comment-17985731
 ] 

Hudson commented on SPARK-52515:


User 'yhuang-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/51236

> Approx_top_k SQL function based on Apache DataSketches
> --
>
> Key: SPARK-52515
> URL: https://issues.apache.org/jira/browse/SPARK-52515
> Project: Spark
>  Issue Type: New Feature
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Yuchuan Huang
>Priority: Major
>
> Apache DataSketches is an open-source library of sketch algorithms, and it 
> is widely used and adopted in industry. This ticket aims to introduce a new 
> function "approx_top_k", which uses the _frequent items sketch_ from Apache 
> DataSketches to find the approximate k most frequent items in a dataset.
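> A hypothetical usage sketch (the final signature may differ; the column and 
> table names are made up):
> {code:sql}
> -- approximate top-3 most frequent items in column `item`
> SELECT approx_top_k(item, 3) FROM events;
> {code}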






[jira] [Resolved] (SPARK-52549) Disable Recursive CTE self-references from window functions and inside sorts

2025-06-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-52549.
-
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51178
[https://github.com/apache/spark/pull/51178]

> Disable Recursive CTE self-references from window functions and inside sorts
> 
>
> Key: SPARK-52549
> URL: https://issues.apache.org/jira/browse/SPARK-52549
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Pavle Martinović
>Assignee: Pavle Martinović
>Priority: Major
> Fix For: 4.1.0
>
>







[jira] [Assigned] (SPARK-52549) Disable Recursive CTE self-references from window functions and inside sorts

2025-06-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-52549:
---

Assignee: Pavle Martinović

> Disable Recursive CTE self-references from window functions and inside sorts
> 
>
> Key: SPARK-52549
> URL: https://issues.apache.org/jira/browse/SPARK-52549
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Pavle Martinović
>Assignee: Pavle Martinović
>Priority: Major
>







[jira] [Resolved] (SPARK-52555) Enforce `UnusedLocalVariable` rule

2025-06-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-52555.
---
Fix Version/s: kubernetes-operator-0.4.0
   Resolution: Fixed

Issue resolved by pull request 254
[https://github.com/apache/spark-kubernetes-operator/pull/254]

> Enforce `UnusedLocalVariable` rule
> --
>
> Key: SPARK-52555
> URL: https://issues.apache.org/jira/browse/SPARK-52555
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: kubernetes-operator-0.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
> Fix For: kubernetes-operator-0.4.0
>
>







[jira] [Assigned] (SPARK-52555) Enforce `UnusedLocalVariable` rule

2025-06-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-52555:
-

Assignee: Dongjoon Hyun

> Enforce `UnusedLocalVariable` rule
> --
>
> Key: SPARK-52555
> URL: https://issues.apache.org/jira/browse/SPARK-52555
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: kubernetes-operator-0.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
>







[jira] [Resolved] (SPARK-52558) Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException`

2025-06-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-52558.
---
Fix Version/s: kubernetes-operator-0.4.0
   Resolution: Fixed

Issue resolved by pull request 255
[https://github.com/apache/spark-kubernetes-operator/pull/255]

> Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException`
> --
>
> Key: SPARK-52558
> URL: https://issues.apache.org/jira/browse/SPARK-52558
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: kubernetes-operator-0.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
> Fix For: kubernetes-operator-0.4.0
>
>







[jira] [Assigned] (SPARK-52558) Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException`

2025-06-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-52558:
-

Assignee: Dongjoon Hyun

> Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException`
> --
>
> Key: SPARK-52558
> URL: https://issues.apache.org/jira/browse/SPARK-52558
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: kubernetes-operator-0.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Minor
>







[jira] [Assigned] (SPARK-52559) Synchronize `SparkOperatorConfManager.getValue`

2025-06-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-52559:
-

Assignee: Dongjoon Hyun

> Synchronize `SparkOperatorConfManager.getValue`
> ---
>
> Key: SPARK-52559
> URL: https://issues.apache.org/jira/browse/SPARK-52559
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: kubernetes-operator-0.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Blocker
>







[jira] [Resolved] (SPARK-52559) Synchronize `SparkOperatorConfManager.getValue`

2025-06-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-52559.
---
Fix Version/s: kubernetes-operator-0.4.0
   Resolution: Fixed

Issue resolved by pull request 256
[https://github.com/apache/spark-kubernetes-operator/pull/256]

> Synchronize `SparkOperatorConfManager.getValue`
> ---
>
> Key: SPARK-52559
> URL: https://issues.apache.org/jira/browse/SPARK-52559
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: kubernetes-operator-0.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Blocker
> Fix For: kubernetes-operator-0.4.0
>
>







[jira] [Created] (SPARK-52559) Synchronize `SparkOperatorConfManager.getValue`

2025-06-23 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-52559:
-

 Summary: Synchronize `SparkOperatorConfManager.getValue`
 Key: SPARK-52559
 URL: https://issues.apache.org/jira/browse/SPARK-52559
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
Affects Versions: kubernetes-operator-0.4.0
Reporter: Dongjoon Hyun









[jira] [Resolved] (SPARK-52547) Build dry runs against master branch

2025-06-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-52547.
--
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51245
[https://github.com/apache/spark/pull/51245]

> Build dry runs against master branch
> 
>
> Key: SPARK-52547
> URL: https://issues.apache.org/jira/browse/SPARK-52547
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 4.1.0
>
>







[jira] [Assigned] (SPARK-52547) Build dry runs against master branch

2025-06-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-52547:


Assignee: Hyukjin Kwon

> Build dry runs against master branch
> 
>
> Key: SPARK-52547
> URL: https://issues.apache.org/jira/browse/SPARK-52547
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>







[jira] [Resolved] (SPARK-52554) Avoid multiple roundtrips for config check in connect

2025-06-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-52554.
--
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51252
[https://github.com/apache/spark/pull/51252]

> Avoid multiple roundtrips for config check in connect
> -
>
> Key: SPARK-52554
> URL: https://issues.apache.org/jira/browse/SPARK-52554
> Project: Spark
>  Issue Type: Sub-task
>  Components: Pandas API on Spark
>Affects Versions: 4.1.0
>Reporter: Takuya Ueshin
>Assignee: Takuya Ueshin
>Priority: Major
> Fix For: 4.1.0
>
>







[jira] [Assigned] (SPARK-52554) Avoid multiple roundtrips for config check in connect

2025-06-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-52554:


Assignee: Takuya Ueshin

> Avoid multiple roundtrips for config check in connect
> -
>
> Key: SPARK-52554
> URL: https://issues.apache.org/jira/browse/SPARK-52554
> Project: Spark
>  Issue Type: Sub-task
>  Components: Pandas API on Spark
>Affects Versions: 4.1.0
>Reporter: Takuya Ueshin
>Assignee: Takuya Ueshin
>Priority: Major
>







[jira] [Assigned] (SPARK-52536) Specify AsyncProfilerLoader.extractionDir to spark local dir

2025-06-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-52536:


Assignee: Zhen Wang

> Specify AsyncProfilerLoader.extractionDir to spark local dir
> 
>
> Key: SPARK-52536
> URL: https://issues.apache.org/jira/browse/SPARK-52536
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Zhen Wang
>Assignee: Zhen Wang
>Priority: Major
>
> AsyncProfilerLoader uses `user.home` by default to store the extracted 
> libraries: 
> [https://github.com/jvm-profiling-tools/ap-loader/blob/main/src/main/java/one/profiler/AsyncProfilerLoader.java#L139-L152]
> The `user.home` directory of the datanodes in our yarn cluster was not 
> initialized, causing the executor startup to fail:
> {code:java}
> 25/06/20 11:54:26 ERROR YarnCoarseGrainedExecutorBackend: Executor 
> self-exiting due to : Unable to create executor due to /home/pilot
> java.nio.file.AccessDeniedException: /home/pilot
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
>   at java.nio.file.Files.createDirectory(Files.java:674)
>   at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
>   at java.nio.file.Files.createDirectories(Files.java:767)
>   at 
> one.profiler.AsyncProfilerLoader.getExtractionDirectory(AsyncProfilerLoader.java:133)
>   at 
> one.profiler.AsyncProfilerLoader.getAsyncProfilerPath(AsyncProfilerLoader.java:562)
>   at one.profiler.AsyncProfilerLoader.load(AsyncProfilerLoader.java:861)
>   at 
> org.apache.spark.profiler.SparkAsyncProfiler.(SparkAsyncProfiler.scala:70)
>   at 
> org.apache.spark.profiler.ProfilerExecutorPlugin.init(ProfilerPlugin.scala:82)
>   at 
> org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:125)
>   at 
> scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
>   at 
> scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at 
> scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
>   at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
>   at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
>   at 
> org.apache.spark.internal.plugin.ExecutorPluginContainer.(PluginContainer.scala:113)
>   at 
> org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:211)
>   at 
> org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:199)
>   at 
> org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:337)
>   at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:178)
>   at org.apache.spark.executor.Executor.(Executor.scala:337)
>   at 
> org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:181)
>   at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
>   at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
>   at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
>   at 
> org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
>   at 
> org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 25/06/20 11:54:26 INFO YarnCoarseGrainedExecutorBackend: Driver commanded a 
> shutdown {code}
>  
> We can set `AsyncProfilerLoader.extractionDir` to the Spark local dir to 
> avoid this issue.






[jira] [Resolved] (SPARK-52536) Specify AsyncProfilerLoader.extractionDir to spark local dir

2025-06-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-52536.
--
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51229
[https://github.com/apache/spark/pull/51229]

> Specify AsyncProfilerLoader.extractionDir to spark local dir
> 
>
> Key: SPARK-52536
> URL: https://issues.apache.org/jira/browse/SPARK-52536
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Zhen Wang
>Assignee: Zhen Wang
>Priority: Major
> Fix For: 4.1.0
>
>
> AsyncProfilerLoader uses `user.home` by default to store the extracted 
> libraries: 
> [https://github.com/jvm-profiling-tools/ap-loader/blob/main/src/main/java/one/profiler/AsyncProfilerLoader.java#L139-L152]
> The `user.home` directory of the datanodes in our yarn cluster was not 
> initialized, causing the executor startup to fail:
> {code:java}
> 25/06/20 11:54:26 ERROR YarnCoarseGrainedExecutorBackend: Executor 
> self-exiting due to : Unable to create executor due to /home/pilot
> java.nio.file.AccessDeniedException: /home/pilot
>   at 
> sun.nio.fs.UnixException.translateToIOException(UnixException.java:84)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at 
> sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:384)
>   at java.nio.file.Files.createDirectory(Files.java:674)
>   at java.nio.file.Files.createAndCheckIsDirectory(Files.java:781)
>   at java.nio.file.Files.createDirectories(Files.java:767)
>   at 
> one.profiler.AsyncProfilerLoader.getExtractionDirectory(AsyncProfilerLoader.java:133)
>   at 
> one.profiler.AsyncProfilerLoader.getAsyncProfilerPath(AsyncProfilerLoader.java:562)
>   at one.profiler.AsyncProfilerLoader.load(AsyncProfilerLoader.java:861)
>   at 
> org.apache.spark.profiler.SparkAsyncProfiler.(SparkAsyncProfiler.scala:70)
>   at 
> org.apache.spark.profiler.ProfilerExecutorPlugin.init(ProfilerPlugin.scala:82)
>   at 
> org.apache.spark.internal.plugin.ExecutorPluginContainer.$anonfun$executorPlugins$1(PluginContainer.scala:125)
>   at 
> scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
>   at 
> scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at 
> scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
>   at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
>   at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
>   at 
> org.apache.spark.internal.plugin.ExecutorPluginContainer.(PluginContainer.scala:113)
>   at 
> org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:211)
>   at 
> org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:199)
>   at 
> org.apache.spark.executor.Executor.$anonfun$plugins$1(Executor.scala:337)
>   at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:178)
>   at org.apache.spark.executor.Executor.(Executor.scala:337)
>   at 
> org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$receive$1.applyOrElse(CoarseGrainedExecutorBackend.scala:181)
>   at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:115)
>   at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:213)
>   at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
>   at 
> org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75)
>   at 
> org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 25/06/20 11:54:26 INFO YarnCoarseGrainedExecutorBackend: Driver commanded a 
> shutdown {code}
>  
> We can set `AsyncProfilerLoader.extractionDir` to the Spark local dir to 
> avoid this issue.






[jira] [Resolved] (SPARK-52499) Add more tests for data types

2025-06-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-52499.
-
Fix Version/s: 4.1.0
   4.0.1
   Resolution: Fixed

Issue resolved by pull request 51193
[https://github.com/apache/spark/pull/51193]

> Add more tests for data types
> -
>
> Key: SPARK-52499
> URL: https://issues.apache.org/jira/browse/SPARK-52499
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
> Fix For: 4.1.0, 4.0.1
>
>







[jira] [Assigned] (SPARK-52499) Add more tests for data types

2025-06-23 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-52499:
---

Assignee: Allison Wang

> Add more tests for data types
> -
>
> Key: SPARK-52499
> URL: https://issues.apache.org/jira/browse/SPARK-52499
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Allison Wang
>Assignee: Allison Wang
>Priority: Major
>







[jira] [Resolved] (SPARK-52534) Make MLCache and MLHandler thread-safe

2025-06-23 Thread Weichen Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weichen Xu resolved SPARK-52534.

Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51226
[https://github.com/apache/spark/pull/51226]

> Make MLCache and MLHandler thread-safe
> --
>
> Key: SPARK-52534
> URL: https://issues.apache.org/jira/browse/SPARK-52534
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, ML
>Affects Versions: 4.1.0
>Reporter: Weichen Xu
>Assignee: Weichen Xu
>Priority: Major
> Fix For: 4.1.0
>
>
> Make MLCache and MLHandler thread-safe
>  
> The current implementation might cause race conditions.






[jira] [Resolved] (SPARK-52546) when sparkcontext crashes in executing sql, final state should be "error", but eventually return "finished".

2025-06-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-52546.
--
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51243
[https://github.com/apache/spark/pull/51243]

> when sparkcontext crashes in executing sql, final state should be "error", 
> but eventually return "finished".
> 
>
> Key: SPARK-52546
> URL: https://issues.apache.org/jira/browse/SPARK-52546
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.5.0, 4.0.0
>Reporter: xuyu
>Assignee: xuyu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>
> When SparkContext crashes while running execute() in 
> SparkExecuteStatementOperation.scala, execution falls into the catch block; 
> the state in Operation.scala should be set to "error", but "finished" is 
> eventually returned. The final state is wrong because the catch block in 
> execute() is missing a judgment branch.






[jira] [Assigned] (SPARK-52546) when sparkcontext crashes in executing sql, final state should be "error", but eventually return "finished".

2025-06-23 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-52546:


Assignee: xuyu

> when sparkcontext crashes in executing sql, final state should be "error", 
> but eventually return "finished".
> 
>
> Key: SPARK-52546
> URL: https://issues.apache.org/jira/browse/SPARK-52546
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.5.0, 4.0.0
>Reporter: xuyu
>Assignee: xuyu
>Priority: Major
>  Labels: pull-request-available
>
> When SparkContext crashes while running execute() in 
> SparkExecuteStatementOperation.scala, execution falls into the catch block; 
> the state in Operation.scala should be set to "error", but "finished" is 
> eventually returned. The final state is wrong because the catch block in 
> execute() is missing a judgment branch.






[jira] [Resolved] (SPARK-52540) Support the time type by make_timestamp_ntz()

2025-06-23 Thread Max Gekk (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk resolved SPARK-52540.
--
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51232
[https://github.com/apache/spark/pull/51232]

> Support the time type by make_timestamp_ntz()
> -
>
> Key: SPARK-52540
> URL: https://issues.apache.org/jira/browse/SPARK-52540
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.1.0
>Reporter: Max Gekk
>Assignee: Max Gekk
>Priority: Major
> Fix For: 4.1.0
>
>
> Modify the make_timestamp_ntz() function to create a timestamp from date and 
> time.
> h4. Syntax
> {code:sql}
> make_timestamp_ntz(date [, time])
> {code}
> h4. Arguments
> # date: A date expression
> # time: A time expression
> h4. Returns
> A TIMESTAMP_NTZ (a timestamp without time zone).
> h4. Examples
> {code:sql}
> > SELECT make_timestamp_ntz(DATE'2014-12-28', TIME'6:30:45.887');
>  2014-12-28 06:30:45.887
> {code}






[jira] [Updated] (SPARK-52349) Enable boolean division tests with ANSI enabled

2025-06-23 Thread Xinrong Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinrong Meng updated SPARK-52349:
-
Summary: Enable boolean division tests with ANSI enabled  (was: Enable 
divide-by-zero for boolean division with ANSI enabled)

> Enable boolean division tests with ANSI enabled
> ---
>
> Key: SPARK-52349
> URL: https://issues.apache.org/jira/browse/SPARK-52349
> Project: Spark
>  Issue Type: Sub-task
>  Components: PS
>Affects Versions: 4.1.0
>Reporter: Xinrong Meng
>Priority: Major
>







[jira] [Created] (SPARK-52554) Avoid multiple roundtrips for config check in connect

2025-06-23 Thread Takuya Ueshin (Jira)
Takuya Ueshin created SPARK-52554:
-

 Summary: Avoid multiple roundtrips for config check in connect
 Key: SPARK-52554
 URL: https://issues.apache.org/jira/browse/SPARK-52554
 Project: Spark
  Issue Type: Sub-task
  Components: Pandas API on Spark
Affects Versions: 4.1.0
Reporter: Takuya Ueshin









[jira] [Created] (SPARK-52555) Enforce `UnusedLocalVariable` rule

2025-06-23 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-52555:
-

 Summary: Enforce `UnusedLocalVariable` rule
 Key: SPARK-52555
 URL: https://issues.apache.org/jira/browse/SPARK-52555
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
Affects Versions: kubernetes-operator-0.4.0
Reporter: Dongjoon Hyun









[jira] [Created] (SPARK-52556) CAST_INVALID_INPUT for Pandas on Spark in ANSI mode

2025-06-23 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-52556:


 Summary: CAST_INVALID_INPUT for Pandas on Spark in ANSI mode
 Key: SPARK-52556
 URL: https://issues.apache.org/jira/browse/SPARK-52556
 Project: Spark
  Issue Type: Umbrella
  Components: PS
Affects Versions: 4.1.0
Reporter: Xinrong Meng









[jira] [Created] (SPARK-52557) Avoid CAST_INVALID_INPUT of to_numeric(errors='coerce') in ANSI mode

2025-06-23 Thread Xinrong Meng (Jira)
Xinrong Meng created SPARK-52557:


 Summary: Avoid CAST_INVALID_INPUT of to_numeric(errors='coerce') 
in ANSI mode
 Key: SPARK-52557
 URL: https://issues.apache.org/jira/browse/SPARK-52557
 Project: Spark
  Issue Type: Sub-task
  Components: PS
Affects Versions: 4.1.0
Reporter: Xinrong Meng









[jira] [Created] (SPARK-52558) Lower `SparkOperatorConfManager` log level to WARN for `FileNotFoundException`

2025-06-23 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-52558:
-

 Summary: Lower `SparkOperatorConfManager` log level to WARN for 
`FileNotFoundException`
 Key: SPARK-52558
 URL: https://issues.apache.org/jira/browse/SPARK-52558
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
Affects Versions: kubernetes-operator-0.4.0
Reporter: Dongjoon Hyun









[jira] [Resolved] (SPARK-52553) Fix NumberFormatException when reading v1 changelog

2025-06-23 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim resolved SPARK-52553.
--
Fix Version/s: 4.1.0
   4.0.1
   Resolution: Fixed

Issue resolved by pull request 51255
[https://github.com/apache/spark/pull/51255]

> Fix NumberFormatException when reading v1 changelog
> ---
>
> Key: SPARK-52553
> URL: https://issues.apache.org/jira/browse/SPARK-52553
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: B. Micheal Okutubo
>Assignee: B. Micheal Okutubo
>Priority: Major
> Fix For: 4.1.0, 4.0.1
>
>
> When trying to read the changelog version, the reader factory throws 
> NumberFormatException for a v1 changelog if it decodes the first few bytes 
> of the file as a UTF string, e.g. "v)".






[jira] [Assigned] (SPARK-52553) Fix NumberFormatException when reading v1 changelog

2025-06-23 Thread Jungtaek Lim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jungtaek Lim reassigned SPARK-52553:


Assignee: B. Micheal Okutubo

> Fix NumberFormatException when reading v1 changelog
> ---
>
> Key: SPARK-52553
> URL: https://issues.apache.org/jira/browse/SPARK-52553
> Project: Spark
>  Issue Type: Bug
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: B. Micheal Okutubo
>Assignee: B. Micheal Okutubo
>Priority: Major
>
> When trying to read the changelog version, the reader factory throws 
> NumberFormatException for a v1 changelog if it decodes the first few bytes 
> of the file as a UTF string, e.g. "v)".






[jira] [Resolved] (SPARK-48231) Remove unused CodeHaus Jackson dependencies

2025-06-23 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie resolved SPARK-48231.
--
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 46521
[https://github.com/apache/spark/pull/46521]

> Remove unused CodeHaus Jackson dependencies
> ---
>
> Key: SPARK-48231
> URL: https://issues.apache.org/jira/browse/SPARK-48231
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.1.0
>Reporter: Cheng Pan
>Assignee: Cheng Pan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.1.0
>
>







[jira] [Assigned] (SPARK-52349) Enable boolean division tests with ANSI enabled

2025-06-23 Thread Takuya Ueshin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Ueshin reassigned SPARK-52349:
-

Assignee: Xinrong Meng

> Enable boolean division tests with ANSI enabled
> ---
>
> Key: SPARK-52349
> URL: https://issues.apache.org/jira/browse/SPARK-52349
> Project: Spark
>  Issue Type: Sub-task
>  Components: PS
>Affects Versions: 4.1.0
>Reporter: Xinrong Meng
>Assignee: Xinrong Meng
>Priority: Major
>







[jira] [Created] (SPARK-52552) Skip CHECK constraint enforcement for deletion vector deletes

2025-06-23 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-52552:
--

 Summary: Skip CHECK constraint enforcement for deletion vector 
deletes
 Key: SPARK-52552
 URL: https://issues.apache.org/jira/browse/SPARK-52552
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 4.1.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang


Writing a delta of rows to an existing table doesn't produce any new rows, so 
enforcing CHECK constraints is unnecessary.
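For illustration, a sketch (not from the ticket) assuming the CHECK constraint 
syntax proposed by the parent constraints work:
{code:sql}
CREATE TABLE t (x INT);
ALTER TABLE t ADD CONSTRAINT x_positive CHECK (x > 0);
-- A delete only removes existing rows, so the constraint cannot be newly
-- violated and need not be enforced during the deletion-vector write.
DELETE FROM t WHERE x > 5;
{code}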






[jira] [Updated] (SPARK-51068) CTEs are not canonicalized and resulting in cached result not being used and recomputed

2025-06-23 Thread Nimesh Khandelwal (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-51068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nimesh Khandelwal updated SPARK-51068:
--
Target Version/s: 3.3.2  (was: 3.3.2, 4.0.0)

> CTEs are not canonicalized and resulting in cached result not being used and 
> recomputed
> ---
>
> Key: SPARK-51068
> URL: https://issues.apache.org/jira/browse/SPARK-51068
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.1.2, 3.1.3, 3.3.2
>Reporter: Nimesh Khandelwal
>Priority: Major
>  Labels: pull-request-available
>
> To check whether a plan exists in the cache, CacheManager matches the 
> canonicalized version of the plan. Currently, CTE IDs are not handled in 
> canonicalized plans, which results in unnecessary cache misses when queries 
> using CTEs are cached. This issue started after the commit to [Avoid inlining 
> non-deterministic 
> With-CTEs|https://github.com/apache/spark/pull/33671/files], which introduced 
> CTERelationDef and CTERelationRef without handling their canonicalization.
> {code:java}
> >>>spark.sql("CACHE TABLE cached_cte AS WITH cte1 AS ( SELECT 1 AS id, 
> >>>'Alice' AS name UNION ALL SELECT 2 AS id, 'Bob' AS name ), cte2 AS ( 
> >>>SELECT 1 AS id, 10 AS score UNION ALL SELECT 2 AS id, 20 AS score ) SELECT 
> >>>cte1.id, cte1.name, cte2.score FROM cte1 JOIN cte2 ON cte1.id = cte2.id");
> DataFrame[]
> >>> spark.sql("select count(*) from cached_cte").explain()
> == Physical Plan ==
> AdaptiveSparkPlan isFinalPlan=false
> +- HashAggregate(keys=[], functions=[count(1)])
>    +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=165]
>       +- HashAggregate(keys=[], functions=[partial_count(1)])
>          +- Project
>             +- BroadcastHashJoin [id#120], [id#124], Inner, BuildRight, false
>                :- Union
>                :  :- Project [1 AS id#120]
>                :  :  +- Scan OneRowRelation[]
>                :  +- Project [2 AS id#122]
>                :     +- Scan OneRowRelation[]
>                +- BroadcastExchange 
> HashedRelationBroadcastMode(List(cast(input[0, int, false] as 
> bigint)),false), [plan_id=160]
>                   +- Union
>                      :- Project [1 AS id#124]
>                      :  +- Scan OneRowRelation[]
>                      +- Project [2 AS id#126]
>                         +- Scan OneRowRelation[]{code}
>  
>  






[jira] [Resolved] (SPARK-52349) Enable boolean division tests with ANSI enabled

2025-06-23 Thread Takuya Ueshin (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takuya Ueshin resolved SPARK-52349.
---
Fix Version/s: 4.1.0
   Resolution: Fixed

Issue resolved by pull request 51249
[https://github.com/apache/spark/pull/51249]

> Enable boolean division tests with ANSI enabled
> ---
>
> Key: SPARK-52349
> URL: https://issues.apache.org/jira/browse/SPARK-52349
> Project: Spark
>  Issue Type: Sub-task
>  Components: PS
>Affects Versions: 4.1.0
>Reporter: Xinrong Meng
>Assignee: Xinrong Meng
>Priority: Major
> Fix For: 4.1.0
>
>







[jira] [Updated] (SPARK-52349) Enable divide-by-zero for boolean division with ANSI enabled

2025-06-23 Thread Xinrong Meng (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-52349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinrong Meng updated SPARK-52349:
-
Summary: Enable divide-by-zero for boolean division with ANSI enabled  
(was: Enable divide-by-zero test cases)

> Enable divide-by-zero for boolean division with ANSI enabled
> 
>
> Key: SPARK-52349
> URL: https://issues.apache.org/jira/browse/SPARK-52349
> Project: Spark
>  Issue Type: Sub-task
>  Components: PS
>Affects Versions: 4.1.0
>Reporter: Xinrong Meng
>Priority: Major
>







[jira] [Created] (SPARK-52553) Fix NumberFormatException when reading v1 changelog

2025-06-23 Thread B. Micheal Okutubo (Jira)
B. Micheal Okutubo created SPARK-52553:
--

 Summary: Fix NumberFormatException when reading v1 changelog
 Key: SPARK-52553
 URL: https://issues.apache.org/jira/browse/SPARK-52553
 Project: Spark
  Issue Type: Bug
  Components: Structured Streaming
Affects Versions: 4.0.0
Reporter: B. Micheal Okutubo


When trying to read the changelog version, the reader factory throws 
NumberFormatException for a v1 changelog if it decodes the first few bytes of 
the file as a UTF string, e.g. "v)".


