[jira] [Resolved] (SPARK-47548) Remove unused `commons-beanutils` dependency

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47548.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45705
[https://github.com/apache/spark/pull/45705]

> Remove unused `commons-beanutils` dependency
> 
>
> Key: SPARK-47548
> URL: https://issues.apache.org/jira/browse/SPARK-47548
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47549) Remove Spark 3.0~3.2 pyspark/version.py workaround from release scripts

2024-03-25 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-47549:
-

 Summary: Remove Spark 3.0~3.2 pyspark/version.py workaround from 
release scripts
 Key: SPARK-47549
 URL: https://issues.apache.org/jira/browse/SPARK-47549
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46743) Count bug introduced for scalar subquery when using TEMPORARY VIEW, as compared to using table

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-46743:
--
Labels: correctness pull-request-available  (was: pull-request-available)

> Count bug introduced for scalar subquery when using TEMPORARY VIEW, as 
> compared to using table
> --
>
> Key: SPARK-46743
> URL: https://issues.apache.org/jira/browse/SPARK-46743
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Andy Lam
>Assignee: Andy Lam
>Priority: Major
>  Labels: correctness, pull-request-available
> Fix For: 4.0.0
>
>
> Using the temp view reproduces the COUNT bug: it returns nulls instead of 0.
> With a table:
> {code:java}
> scala> spark.sql("""CREATE TABLE outer_table USING parquet AS SELECT * FROM 
> VALUES
>      |     (1, 1),
>      |     (2, 1),
>      |     (3, 3),
>      |     (6, 6),
>      |     (7, 7),
>      |     (9, 9) AS inner_table(a, b)""")
> val res6: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE TABLE null_table USING parquet AS SELECT CAST(null 
> AS int) AS a, CAST(null as int) AS b ;")
> val res7: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("""SELECT ( SELECT COUNT(null_table.a) AS aggAlias FROM 
> null_table WHERE null_table.a = outer_table.a) FROM outer_table""").collect()
> val res8: Array[org.apache.spark.sql.Row] = Array([0], [0], [0], [0], [0], 
> [0]) {code}
> With a view:
>  
> {code:java}
> spark.sql("CREATE TEMPORARY VIEW outer_view(a, b) AS VALUES (1, 1), (2, 
> 1),(3, 3), (6, 6), (7, 7), (9, 9);")
> spark.sql("CREATE TEMPORARY VIEW null_view(a, b) AS SELECT CAST(null AS int), 
> CAST(null as int);")
> spark.sql("""SELECT ( SELECT COUNT(null_view.a) AS aggAlias FROM null_view 
> WHERE null_view.a = outer_view.a) FROM outer_view""").collect()
> val res2: Array[org.apache.spark.sql.Row] = Array([null], [null], [null], 
> [null], [null], [null]){code}
>  
>  
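For anyone hitting this on an affected version, a user-side workaround sketch (not part of the fix, and assuming the temp views from the reproduction above): wrapping the scalar subquery in COALESCE maps the incorrectly produced NULL back to the expected 0.

{code:java}
// Workaround sketch only, using the temp views from the reproduction above.
// COALESCE turns the wrong NULL into 0, the value COUNT should have returned.
spark.sql("""SELECT COALESCE(
  (SELECT COUNT(null_view.a) FROM null_view
    WHERE null_view.a = outer_view.a), 0) AS aggAlias
FROM outer_view""").collect()
// expected: Array([0], [0], [0], [0], [0], [0])
{code}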



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46743) Count bug introduced for scalar subquery when using TEMPORARY VIEW, as compared to using table

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-46743:
--
Component/s: SQL
 (was: Optimizer)

> Count bug introduced for scalar subquery when using TEMPORARY VIEW, as 
> compared to using table
> --
>
> Key: SPARK-46743
> URL: https://issues.apache.org/jira/browse/SPARK-46743
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.5.0
>Reporter: Andy Lam
>Assignee: Andy Lam
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Using the temp view reproduces the COUNT bug: it returns nulls instead of 0.
> With a table:
> {code:java}
> scala> spark.sql("""CREATE TABLE outer_table USING parquet AS SELECT * FROM 
> VALUES
>      |     (1, 1),
>      |     (2, 1),
>      |     (3, 3),
>      |     (6, 6),
>      |     (7, 7),
>      |     (9, 9) AS inner_table(a, b)""")
> val res6: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("CREATE TABLE null_table USING parquet AS SELECT CAST(null 
> AS int) AS a, CAST(null as int) AS b ;")
> val res7: org.apache.spark.sql.DataFrame = []
> scala> spark.sql("""SELECT ( SELECT COUNT(null_table.a) AS aggAlias FROM 
> null_table WHERE null_table.a = outer_table.a) FROM outer_table""").collect()
> val res8: Array[org.apache.spark.sql.Row] = Array([0], [0], [0], [0], [0], 
> [0]) {code}
> With a view:
>  
> {code:java}
> spark.sql("CREATE TEMPORARY VIEW outer_view(a, b) AS VALUES (1, 1), (2, 
> 1),(3, 3), (6, 6), (7, 7), (9, 9);")
> spark.sql("CREATE TEMPORARY VIEW null_view(a, b) AS SELECT CAST(null AS int), 
> CAST(null as int);")
> spark.sql("""SELECT ( SELECT COUNT(null_view.a) AS aggAlias FROM null_view 
> WHERE null_view.a = outer_view.a) FROM outer_view""").collect()
> val res2: Array[org.apache.spark.sql.Row] = Array([null], [null], [null], 
> [null], [null], [null]){code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47548) Remove unused `commons-beanutils` dependency

2024-03-25 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-47548:
-

 Summary: Remove unused `commons-beanutils` dependency
 Key: SPARK-47548
 URL: https://issues.apache.org/jira/browse/SPARK-47548
 Project: Spark
  Issue Type: Sub-task
  Components: Build
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-42452) Remove hadoop-2 profile from Apache Spark

2024-03-25 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17830566#comment-17830566
 ] 

Dongjoon Hyun commented on SPARK-42452:
---

This was resolved via https://github.com/apache/spark/pull/40788

> Remove hadoop-2 profile from Apache Spark
> -
>
> Key: SPARK-42452
> URL: https://issues.apache.org/jira/browse/SPARK-42452
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
> Fix For: 3.5.0
>
>
> SPARK-40651 Drop Hadoop2 binary distribution from release process and 
> SPARK-42447 Remove Hadoop 2 GitHub Action job
>   



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47503) Spark history server fails to display query for cached JDBC relation named in quotes

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47503:
--
Fix Version/s: 3.4.3

> Spark history server fails to display query for cached JDBC relation named in 
> quotes
> ---
>
> Key: SPARK-47503
> URL: https://issues.apache.org/jira/browse/SPARK-47503
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.1, 4.0.0
>Reporter: alexey
>Assignee: alexey
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.3
>
> Attachments: Screenshot_11.png, eventlog_v2_local-1711020585149.rar
>
>
> The Spark history server fails to display the query for a cached JDBC relation 
> (or a calculation derived from it) whose name contains quotes.
> (Screenshot and generated history are in the attachments.)
> How to reproduce:
> {code:java}
> val ticketsDf = spark.read.jdbc("jdbc:postgresql://localhost:5432/demo", """ 
> "test-schema".tickets """.trim, properties)
> val bookingDf = spark.read.parquet("path/bookings")
> ticketsDf.cache().count()
> val resultDf = bookingDf.join(ticketsDf, Seq("book_ref"))
> resultDf.write.mode(SaveMode.Overwrite).parquet("path/result") {code}
>  
> The problem is in the SparkPlanGraphNode class, which creates a dot node. When 
> there are no metrics to display, it simply returns the tagged name, and in this 
> case the name contains quotes, which corrupts the dot file.
> The suggested solution is to escape the name string.
>  
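A minimal sketch of the suggested escaping, using a hypothetical helper name rather than the actual SparkPlanGraphNode code:

{code:java}
// Escape double quotes before embedding the node name in a DOT label, so a
// relation name like "test-schema".tickets no longer corrupts the dot file.
def escapeDotLabel(name: String): String =
  name.replace("\"", "\\\"")

// Illustrative use when rendering the node (identifiers are hypothetical):
//   s"""  $nodeId [label="${escapeDotLabel(nodeName)}"];"""
{code}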



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47537) Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47537:
--
Fix Version/s: 3.4.3

> Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J 
> --
>
> Key: SPARK-47537
> URL: https://issues.apache.org/jira/browse/SPARK-47537
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.1, 4.0.0, 3.5.2
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.3
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47537) Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47537:
--
Affects Version/s: 3.5.1
   (was: 3.5.2)

> Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J 
> --
>
> Key: SPARK-47537
> URL: https://issues.apache.org/jira/browse/SPARK-47537
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.1, 4.0.0, 3.5.1
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.3
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47537) Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47537:
--
Fix Version/s: 3.5.2

> Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J 
> --
>
> Key: SPARK-47537
> URL: https://issues.apache.org/jira/browse/SPARK-47537
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.1, 4.0.0, 3.5.2
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: [DISCUSS] MySQL version support policy

2024-03-25 Thread Dongjoon Hyun
Hi, Cheng.

Thank you for the suggestion. Your suggestion seems to have at least two
themes.

A. Adding a new Apache Spark community policy (contract) to guarantee support
for MySQL LTS versions.
B. Dropping support for non-LTS MySQL versions (8.3/8.2/8.1).

And it raises three questions for me.

1. For (A), do you mean that MySQL LTS versions are not properly supported by
Apache Spark releases because the test suite is inadequate?
2. For (B), why does Apache Spark need to drop non-LTS MySQL support?
3. What about MariaDB? Do we need to stick to specific versions?

To be clear, if needed, we can easily add daily GitHub Actions CI jobs, similar
to the Python CI (Python 3.8/3.10/3.11/3.12).

-
https://github.com/apache/spark/blob/master/.github/workflows/build_python.yml

Thanks,
Dongjoon.


On Sun, Mar 24, 2024 at 10:29 PM Cheng Pan  wrote:

> Hi, Spark community,
>
> I noticed that the Spark JDBC connector MySQL dialect is now testing against
> 8.3.0[1], a non-LTS version.
>
> MySQL recently changed its version policy[2], which is now very similar to
> the Java version policy. In short, 5.5, 5.6, 5.7, and 8.0 are LTS versions;
> 8.1, 8.2, and 8.3 are non-LTS; and the next LTS version is 8.4.
>
> I would say that MySQL is one of the most important pieces of infrastructure
> today. I checked the AWS RDS MySQL[4] and Azure Database for MySQL[5] version
> support policies, and both only support 5.7 and 8.0.
>
> Also, Spark officially only supports LTS Java versions, like JDK 17 and
> 21, but not 22. I would recommend using MySQL 8.0 for testing until the
> next MySQL LTS version (8.4) is available.
>
> Additional discussion can be found at [3]
>
> [1] https://issues.apache.org/jira/browse/SPARK-47453
> [2]
> https://dev.mysql.com/blog-archive/introducing-mysql-innovation-and-long-term-support-lts-versions/
> [3] https://github.com/apache/spark/pull/45581
> [4] https://aws.amazon.com/rds/mysql/
> [5] https://learn.microsoft.com/en-us/azure/mysql/concepts-version-policy
>
> Thanks,
> Cheng Pan
>
>
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


[jira] [Resolved] (SPARK-47538) Remove `commons-logging` dependency

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47538.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45687
[https://github.com/apache/spark/pull/45687]

> Remove `commons-logging` dependency
> ---
>
> Key: SPARK-47538
> URL: https://issues.apache.org/jira/browse/SPARK-47538
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47537) Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47537.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45689
[https://github.com/apache/spark/pull/45689]

> Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J 
> --
>
> Key: SPARK-47537
> URL: https://issues.apache.org/jira/browse/SPARK-47537
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.1, 4.0.0, 3.5.2
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47537) Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J

2024-03-25 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47537:
-

Assignee: Kent Yao

> Use MySQL Connector/J for MySQL DB instead of MariaDB Connector/J 
> --
>
> Key: SPARK-47537
> URL: https://issues.apache.org/jira/browse/SPARK-47537
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.1, 4.0.0, 3.5.2
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47538) Remove `commons-logging` dependency

2024-03-24 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-47538:
-

 Summary: Remove `commons-logging` dependency
 Key: SPARK-47538
 URL: https://issues.apache.org/jira/browse/SPARK-47538
 Project: Spark
  Issue Type: Sub-task
  Components: Build
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47536) Upgrade jmock-junit5 to 2.13.1

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47536:
-

Assignee: Yang Jie

> Upgrade jmock-junit5 to 2.13.1
> --
>
> Key: SPARK-47536
> URL: https://issues.apache.org/jira/browse/SPARK-47536
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> https://github.com/jmock-developers/jmock-library/releases/tag/2.13.1



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47536) Upgrade jmock-junit5 to 2.13.1

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47536:
--
Parent: SPARK-47046
Issue Type: Sub-task  (was: Improvement)

> Upgrade jmock-junit5 to 2.13.1
> --
>
> Key: SPARK-47536
> URL: https://issues.apache.org/jira/browse/SPARK-47536
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> https://github.com/jmock-developers/jmock-library/releases/tag/2.13.1



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47536) Upgrade jmock-junit5 to 2.13.1

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47536.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45669
[https://github.com/apache/spark/pull/45669]

> Upgrade jmock-junit5 to 2.13.1
> --
>
> Key: SPARK-47536
> URL: https://issues.apache.org/jira/browse/SPARK-47536
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> https://github.com/jmock-developers/jmock-library/releases/tag/2.13.1



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47535) Update `publish_snapshot.yml` to publish twice per day

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47535.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45686
[https://github.com/apache/spark/pull/45686]

> Update `publish_snapshot.yml` to publish twice per day
> --
>
> Key: SPARK-47535
> URL: https://issues.apache.org/jira/browse/SPARK-47535
> Project: Spark
>  Issue Type: Task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47535) Update `publish_snapshot.yml` to publish twice per day

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47535:
-

Assignee: Dongjoon Hyun

> Update `publish_snapshot.yml` to publish twice per day
> --
>
> Key: SPARK-47535
> URL: https://issues.apache.org/jira/browse/SPARK-47535
> Project: Spark
>  Issue Type: Task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47535) Update `publish_snapshot.yml` to publish twice per day

2024-03-24 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-47535:
-

 Summary: Update `publish_snapshot.yml` to publish twice per day
 Key: SPARK-47535
 URL: https://issues.apache.org/jira/browse/SPARK-47535
 Project: Spark
  Issue Type: Task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47534) Move `o.a.s.variant` to `o.a.s.types.variant`

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47534.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45685
[https://github.com/apache/spark/pull/45685]

> Move `o.a.s.variant` to `o.a.s.types.variant`
> -
>
> Key: SPARK-47534
> URL: https://issues.apache.org/jira/browse/SPARK-47534
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code}
> -package org.apache.spark.variant;
> +package org.apache.spark.types.variant;
> {code}
>  
> {code:java}
> -package org.apache.spark.sql.catalyst.expressions
> +package org.apache.spark.sql.catalyst.expressions.variant
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47534) Move `o.a.s.variant` to `o.a.s.types.variant`

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47534:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Task)

> Move `o.a.s.variant` to `o.a.s.types.variant`
> -
>
> Key: SPARK-47534
> URL: https://issues.apache.org/jira/browse/SPARK-47534
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>
> {code}
> -package org.apache.spark.variant;
> +package org.apache.spark.types.variant;
> {code}
>  
> {code:java}
> -package org.apache.spark.sql.catalyst.expressions
> +package org.apache.spark.sql.catalyst.expressions.variant
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47534) Move `o.a.s.variant` to `o.a.s.types.variant`

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47534:
-

Assignee: Dongjoon Hyun

> Move `o.a.s.variant` to `o.a.s.types.variant`
> -
>
> Key: SPARK-47534
> URL: https://issues.apache.org/jira/browse/SPARK-47534
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>
> {code}
> -package org.apache.spark.variant;
> +package org.apache.spark.types.variant;
> {code}
>  
> {code:java}
> -package org.apache.spark.sql.catalyst.expressions
> +package org.apache.spark.sql.catalyst.expressions.variant
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47534) Move `o.a.s.variant` to `o.a.s.types.variant`

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47534:
--
Description: 
{code}
-package org.apache.spark.variant;
+package org.apache.spark.types.variant;
{code}
 
{code:java}
-package org.apache.spark.sql.catalyst.expressions
+package org.apache.spark.sql.catalyst.expressions.variant
{code}
 

  was:
{code:java}
-package org.apache.spark.sql.catalyst.expressions
+package org.apache.spark.sql.catalyst.expressions.variant {code}
 


> Move `o.a.s.variant` to `o.a.s.types.variant`
> -
>
> Key: SPARK-47534
> URL: https://issues.apache.org/jira/browse/SPARK-47534
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Major
>
> {code}
> -package org.apache.spark.variant;
> +package org.apache.spark.types.variant;
> {code}
>  
> {code:java}
> -package org.apache.spark.sql.catalyst.expressions
> +package org.apache.spark.sql.catalyst.expressions.variant
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47533) Migrate scalafmt dialect to scala213

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47533.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45683
[https://github.com/apache/spark/pull/45683]

> Migrate scalafmt dialect to scala213 
> -
>
> Key: SPARK-47533
> URL: https://issues.apache.org/jira/browse/SPARK-47533
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47503) Spark history server fails to display query for cached JDBC relation named in quotes

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47503:
--
Fix Version/s: 3.5.2

> Spark history server fails to display query for cached JDBC relation named in 
> quotes
> ---
>
> Key: SPARK-47503
> URL: https://issues.apache.org/jira/browse/SPARK-47503
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.1, 4.0.0
>Reporter: alexey
>Assignee: alexey
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2
>
> Attachments: Screenshot_11.png, eventlog_v2_local-1711020585149.rar
>
>
> The Spark history server fails to display the query for a cached JDBC relation 
> (or a calculation derived from it) whose name contains quotes.
> (Screenshot and generated history are in the attachments.)
> How to reproduce:
> {code:java}
> val ticketsDf = spark.read.jdbc("jdbc:postgresql://localhost:5432/demo", """ 
> "test-schema".tickets """.trim, properties)
> val bookingDf = spark.read.parquet("path/bookings")
> ticketsDf.cache().count()
> val resultDf = bookingDf.join(ticketsDf, Seq("book_ref"))
> resultDf.write.mode(SaveMode.Overwrite).parquet("path/result") {code}
>  
> The problem is in the SparkPlanGraphNode class, which creates a dot node. When 
> there are no metrics to display, it simply returns the tagged name, and in this 
> case the name contains quotes, which corrupts the dot file.
> The suggested solution is to escape the name string.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47528) Add UserDefinedType support to DataTypeUtils.canWrite

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47528.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45678
[https://github.com/apache/spark/pull/45678]

> Add UserDefinedType support to DataTypeUtils.canWrite
> -
>
> Key: SPARK-47528
> URL: https://issues.apache.org/jira/browse/SPARK-47528
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: L. C. Hsieh
>Assignee: L. C. Hsieh
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Our customer hit an issue recently when they tried to save a DataFrame 
> containing some UDTs as a table (`saveAsTable`). The error looks like:
> ```
> - Cannot write 'xxx': struct<...> is incompatible with struct<...>
> ```
> The catalog strings on the two sides are actually the same, which makes the 
> error confusing.
> This is because `DataTypeUtils.canWrite` doesn't handle `UserDefinedType`. If 
> the `UserDefinedType`'s underlying SQL type is the same as the read side's, 
> `canWrite` should return true for the two sides.
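A minimal sketch of the idea, handling only top-level UDTs and not the actual DataTypeUtils code:

{code:java}
import org.apache.spark.sql.types.{DataType, UserDefinedType}

// Treat a UserDefinedType as its underlying SQL type for the write check.
def unwrapUDT(dt: DataType): DataType = dt match {
  case udt: UserDefinedType[_] => unwrapUDT(udt.sqlType)
  case other => other
}

// A canWrite-style check would then compare unwrapUDT(writeType) with
// unwrapUDT(readType) instead of the raw types; nested struct/array/map
// fields would need the same unwrapping, which this sketch omits.
{code}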



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47526) Upgrade `netty` to 4.1.108.Final and `netty-tcnative` to 2.0.65.Final

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47526:
-

Assignee: BingKun Pan

> Upgrade `netty` to 4.1.108.Final and `netty-tcnative` to 2.0.65.Final
> -
>
> Key: SPARK-47526
> URL: https://issues.apache.org/jira/browse/SPARK-47526
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47526) Upgrade `netty` to 4.1.108.Final and `netty-tcnative` to 2.0.65.Final

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47526.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45676
[https://github.com/apache/spark/pull/45676]

> Upgrade `netty` to 4.1.108.Final and `netty-tcnative` to 2.0.65.Final
> -
>
> Key: SPARK-47526
> URL: https://issues.apache.org/jira/browse/SPARK-47526
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47503) Spark history server fails to display query for cached JDBC relation named in quotes

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47503.
---
Fix Version/s: 4.0.0
 Assignee: alexey
   Resolution: Fixed

This is resolved via [https://github.com/apache/spark/pull/45640]

> Spark history server fails to display query for cached JDBC relation named in 
> quotes
> ---
>
> Key: SPARK-47503
> URL: https://issues.apache.org/jira/browse/SPARK-47503
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.2.1, 4.0.0
>Reporter: alexey
>Assignee: alexey
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: Screenshot_11.png, eventlog_v2_local-1711020585149.rar
>
>
> The Spark history server fails to display the query for a cached JDBC relation 
> (or a calculation derived from it) whose name contains quotes.
> (Screenshot and generated history are in the attachments.)
> How to reproduce:
> {code:java}
> val ticketsDf = spark.read.jdbc("jdbc:postgresql://localhost:5432/demo", """ 
> "test-schema".tickets """.trim, properties)
> val bookingDf = spark.read.parquet("path/bookings")
> ticketsDf.cache().count()
> val resultDf = bookingDf.join(ticketsDf, Seq("book_ref"))
> resultDf.write.mode(SaveMode.Overwrite).parquet("path/result") {code}
>  
> The problem is in the SparkPlanGraphNode class, which creates a dot node. When 
> there are no metrics to display, it simply returns the tagged name, and in this 
> case the name contains quotes, which corrupts the dot file.
> The suggested solution is to escape the name string.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47497) Make `to_csv` support the output of `array/struct/map` as pretty strings

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47497.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45657
[https://github.com/apache/spark/pull/45657]

> Make `to_csv` support the output of `array/struct/map` as pretty strings
> 
>
> Key: SPARK-47497
> URL: https://issues.apache.org/jira/browse/SPARK-47497
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47497) Make `to_csv` support the output of `array/struct/map` as pretty strings

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47497:
-

Assignee: BingKun Pan

> Make `to_csv` support the output of `array/struct/map` as pretty strings
> 
>
> Key: SPARK-47497
> URL: https://issues.apache.org/jira/browse/SPARK-47497
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47531) Upgrade Arrow to 15.0.2

2024-03-24 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47531.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45682
[https://github.com/apache/spark/pull/45682]

> Upgrade Arrow to 15.0.2
> ---
>
> Key: SPARK-47531
> URL: https://issues.apache.org/jira/browse/SPARK-47531
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47530) Add `bcpkix-jdk18on` test dependencies to `hive` module for Hadoop 3.4.0

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47530.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45681
[https://github.com/apache/spark/pull/45681]

> Add `bcpkix-jdk18on` test dependencies to `hive` module for Hadoop 3.4.0
> 
>
> Key: SPARK-47530
> URL: https://issues.apache.org/jira/browse/SPARK-47530
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47531) Upgrade Arrow to 15.0.2

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47531:
--
Summary: Upgrade Arrow to 15.0.2  (was: Upgrade `Arrow` to 15.0.2)

> Upgrade Arrow to 15.0.2
> ---
>
> Key: SPARK-47531
> URL: https://issues.apache.org/jira/browse/SPARK-47531
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47531) Upgrade `Arrow` to 15.0.2

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47531:
--
Parent Issue: SPARK-44111  (was: SPARK-47046)

> Upgrade `Arrow` to 15.0.2
> -
>
> Key: SPARK-47531
> URL: https://issues.apache.org/jira/browse/SPARK-47531
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47531) Upgrade `Arrow` to 15.0.2

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47531:
--
Summary: Upgrade `Arrow` to 15.0.2  (was: Upgrade `arrow-memory-netty` to 
15.0.2)

> Upgrade `Arrow` to 15.0.2
> -
>
> Key: SPARK-47531
> URL: https://issues.apache.org/jira/browse/SPARK-47531
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46411) Change to use bcprov/bcpkix-jdk18on for test

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-46411:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Improvement)

> Change to use bcprov/bcpkix-jdk18on for test
> 
>
> Key: SPARK-46411
> URL: https://issues.apache.org/jira/browse/SPARK-46411
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47530) Add `bcpkix-jdk18on` test dependencies to `hive` module for Hadoop 3.4.0

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47530:
-

Assignee: Dongjoon Hyun

> Add `bcpkix-jdk18on` test dependencies to `hive` module for Hadoop 3.4.0
> 
>
> Key: SPARK-47530
> URL: https://issues.apache.org/jira/browse/SPARK-47530
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47530) Add `bcpkix-jdk18on` test dependencies to `hive` module for Hadoop 3.4.0

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47530:
--
Summary: Add `bcpkix-jdk18on` test dependencies to `hive` module for Hadoop 
3.4.0  (was: Add `org.bouncycastle` test dependencies to `hive` module for 
Hadoop 3.4.0)

> Add `bcpkix-jdk18on` test dependencies to `hive` module for Hadoop 3.4.0
> 
>
> Key: SPARK-47530
> URL: https://issues.apache.org/jira/browse/SPARK-47530
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47530) Add `org.bouncycastle` test dependencies to `hive` module for Hadoop 3.4.0

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47530:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Task)

> Add `org.bouncycastle` test dependencies to `hive` module for Hadoop 3.4.0
> --
>
> Key: SPARK-47530
> URL: https://issues.apache.org/jira/browse/SPARK-47530
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47530) Add `org.bouncycastle` test dependencies to `hive` module for Hadoop 3.4.0

2024-03-23 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-47530:
-

 Summary: Add `org.bouncycastle` test dependencies to `hive` module 
for Hadoop 3.4.0
 Key: SPARK-47530
 URL: https://issues.apache.org/jira/browse/SPARK-47530
 Project: Spark
  Issue Type: Task
  Components: Build, Tests
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45781) Upgrade Arrow to 14.0.0

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-45781:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Improvement)

> Upgrade Arrow to 14.0.0
> ---
>
> Key: SPARK-45781
> URL: https://issues.apache.org/jira/browse/SPARK-45781
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> https://arrow.apache.org/release/14.0.0.html



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-46718) Upgrade Arrow to 15.0.0

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-46718:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Improvement)

> Upgrade Arrow to 15.0.0
> ---
>
> Key: SPARK-46718
> URL: https://issues.apache.org/jira/browse/SPARK-46718
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: image-2024-01-15-14-02-57-814.png
>
>
> https://github.com/apache/arrow/releases/tag/apache-arrow-15.0.0



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47529) Use hadoop 3.4.0 in some docs

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47529:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Improvement)

> Use hadoop 3.4.0 in some docs
> -
>
> Key: SPARK-47529
> URL: https://issues.apache.org/jira/browse/SPARK-47529
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47529) Use hadoop 3.4.0 in some docs

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47529:
-

Assignee: BingKun Pan

> Use hadoop 3.4.0 in some docs
> -
>
> Key: SPARK-47529
> URL: https://issues.apache.org/jira/browse/SPARK-47529
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47529) Use hadoop 3.4.0 in some docs

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47529.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45679
[https://github.com/apache/spark/pull/45679]

> Use hadoop 3.4.0 in some docs
> -
>
> Key: SPARK-47529
> URL: https://issues.apache.org/jira/browse/SPARK-47529
> Project: Spark
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47495) Primary resource jar added to spark.jars twice under k8s cluster mode

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47495:
-

Assignee: Jiale Tan

> Primary resource jar added to spark.jars twice under k8s cluster mode
> -
>
> Key: SPARK-47495
> URL: https://issues.apache.org/jira/browse/SPARK-47495
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Kubernetes, Spark Core
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Jiale Tan
>Assignee: Jiale Tan
>Priority: Minor
>  Labels: pull-request-available
>
> {*}Context{*}:
> To submit Spark jobs to Kubernetes under cluster mode, {{spark-submit}} is 
> triggered twice.
> The first time, {{SparkSubmit}} runs under k8s cluster mode: it appends the 
> primary resource to {{spark.jars}} and calls 
> {{KubernetesClientApplication::start}} to create a driver pod.
> The driver pod then runs {{spark-submit}} again with the same primary resource 
> jar. This time, however, {{SparkSubmit}} runs under client mode with 
> {{spark.kubernetes.submitInDriver}} set to {{true}}, plus the updated 
> {{spark.jars}}. Under this mode, {{SparkSubmit}} downloads all the jars in 
> {{spark.jars}} to the driver, and those {{spark.jars}} URLs are replaced by 
> driver-local paths.
> {{SparkSubmit}} then appends the same primary resource to {{spark.jars}} again. 
> As a result, {{spark.jars}} holds two duplicate entries for the primary 
> resource: one with the original URL the user submitted, the other with the 
> driver-local file path.
> Later, when the driver starts the SparkContext, it copies all of 
> {{spark.jars}} to {{spark.app.initial.jar.urls}} and replaces the driver-local 
> jar paths in {{spark.app.initial.jar.urls}} with driver file service paths.
> Now every jar from {{--jars}} or {{spark.jars}} in the original user 
> submission is replaced with a driver file service URL and added to 
> {{spark.app.initial.jar.urls}}, and the primary resource jar from the original 
> submission shows up in {{spark.app.initial.jar.urls}} twice: once with the 
> original path from the user submission, and once with a driver file service 
> URL.
> When executors start, they download all the jars in 
> {{spark.app.initial.jar.urls}}.
> *Issue*:
> The executor downloads two duplicate copies of the primary resource, one with 
> the original URL the user submitted and the other with the driver-local file 
> path, which wastes resources. This was also reported previously 
> [here|https://github.com/apache/spark/pull/37417#issuecomment-1517797912].
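A minimal sketch of one possible guard, assuming the in-driver pass can simply skip re-appending the primary resource (the actual change in the linked pull request may differ):

{code:java}
// Sketch: decide the final jar list for a submission pass.
def jarsForSubmission(
    jars: Seq[String],
    primaryResource: String,
    submitInDriver: Boolean): Seq[String] =
  if (submitInDriver) jars                 // already appended by the first pass
  else (jars :+ primaryResource).distinct  // cluster-mode pass appends it once
{code}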



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47495) Primary resource jar added to spark.jars twice under k8s cluster mode

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47495.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45607
[https://github.com/apache/spark/pull/45607]

> Primary resource jar added to spark.jars twice under k8s cluster mode
> -
>
> Key: SPARK-47495
> URL: https://issues.apache.org/jira/browse/SPARK-47495
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Kubernetes, Spark Core
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Jiale Tan
>Assignee: Jiale Tan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {*}Context{*}:
> To submit Spark jobs to Kubernetes under cluster mode, {{spark-submit}} 
> will be triggered twice. 
> The first time, {{SparkSubmit}} will run under k8s cluster mode: it will 
> append the primary resource to {{spark.jars}} and call 
> {{KubernetesClientApplication::start}} to create a driver pod. 
> The driver pod will run {{spark-submit}} again with the same primary resource 
> jar. However, this time {{SparkSubmit}} will run under client mode with 
> {{spark.kubernetes.submitInDriver}} set to {{true}}, plus the updated 
> {{spark.jars}}. Under this mode, {{SparkSubmit}} will download all the jars 
> in {{spark.jars}} to the driver, and those {{spark.jars}} URLs will be 
> replaced by the driver-local paths. 
> Then {{SparkSubmit}} will append the same primary resource to {{spark.jars}} 
> again. So in this case {{spark.jars}} will hold two entries for the primary 
> resource: one with the original URL the user submitted with, the other with 
> the driver-local file path. 
> Later, when the driver starts the SparkContext, it will copy all of 
> {{spark.jars}} to {{spark.app.initial.jar.urls}} and replace the driver-local 
> jar paths in {{spark.app.initial.jar.urls}} with driver file service paths. 
> Now all the jars passed via {{--jars}} or {{spark.jars}} in the original user 
> submission will be replaced with a driver file service URL and added to 
> {{spark.app.initial.jar.urls}}. And the primary resource jar from the original 
> submission will show up in {{spark.app.initial.jar.urls}} twice: once with the 
> original path from the user submission, and once with a driver file service 
> URL.
> When executors start, they will download all the jars in 
> {{spark.app.initial.jar.urls}}. 
> *Issue*:
> The executor will download two duplicate copies of the primary resource, one 
> from the original URL the user submitted with and the other from the 
> driver-local file path, which leads to wasted resources. This was also 
> reported previously 
> [here|https://github.com/apache/spark/pull/37417#issuecomment-1517797912].
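To make the duplication concrete, below is a minimal Scala sketch of the kind of de-duplication the description implies. The helper name and the file-name comparison are illustrative assumptions, not the actual {{SparkSubmit}} logic.

{code:scala}
// Hypothetical helper: drop spark.jars entries that point at the primary
// resource, comparing by file name because the same jar can appear under
// different schemes (e.g. s3a://bucket/app.jar vs file:/opt/spark/work-dir/app.jar).
def withoutPrimaryResource(jars: Seq[String], primaryResource: String): Seq[String] = {
  val primaryName = primaryResource.split('/').last
  jars.filterNot(_.split('/').last == primaryName)
}

val jars = Seq("s3a://bucket/app.jar", "file:/opt/spark/work-dir/app.jar", "s3a://bucket/dep.jar")
withoutPrimaryResource(jars, "s3a://bucket/app.jar")
// => Seq("s3a://bucket/dep.jar"); only one copy of the primary resource would then be re-added
{code}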



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47510) Fix `DSTREAM` label pattern in `labeler.yml`

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47510:
-

Assignee: Dongjoon Hyun

> Fix `DSTREAM` label pattern in `labeler.yml`
> 
>
> Key: SPARK-47510
> URL: https://issues.apache.org/jira/browse/SPARK-47510
> Project: Spark
>  Issue Type: Bug
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47510) Fix `DSTREAM` label pattern in `labeler.yml`

2024-03-23 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47510.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45648
[https://github.com/apache/spark/pull/45648]

> Fix `DSTREAM` label pattern in `labeler.yml`
> 
>
> Key: SPARK-47510
> URL: https://issues.apache.org/jira/browse/SPARK-47510
> Project: Spark
>  Issue Type: Bug
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47522) Read MySQL FLOAT as FloatType to keep consistent with the write side

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47522.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45666
[https://github.com/apache/spark/pull/45666]

> Read MySQL FLOAT as FloatType to keep consistent with the write side
> 
>
> Key: SPARK-47522
> URL: https://issues.apache.org/jira/browse/SPARK-47522
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47522) Read MySQL FLOAT as FloatType to keep consistent with the write side

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47522:
-

Assignee: Kent Yao

> Read MySQL FLOAT as FloatType to keep consistent with the write side
> 
>
> Key: SPARK-47522
> URL: https://issues.apache.org/jira/browse/SPARK-47522
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-47521) Use `Utils.tryWithResource` during reading shuffle data from external storage

2024-03-22 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-47521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829949#comment-17829949
 ] 

Dongjoon Hyun commented on SPARK-47521:
---

FYI, [~maheshk114], the `Target Version` field is reserved for the Apache Spark 
committers, so please don't set it yourself.
- https://spark.apache.org/contributing.html

{quote}
Do not set the following fields:
Fix Version. This is assigned by committers only when resolved.
Target Version. This is assigned by committers to indicate a PR has been 
accepted for possible fix by the target version.
{quote}

> Use `Utils.tryWithResource` during reading shuffle data from external storage
> -
>
> Key: SPARK-47521
> URL: https://issues.apache.org/jira/browse/SPARK-47521
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.1, 3.5.0, 3.3.4
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.3
>
>
> In the {{FallbackStorage.read}} method, the file handle is not closed if 
> there is a failure during the read operation.
>  
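For context, the pattern named in the summary is the usual loan pattern: a minimal sketch, assuming the helper mirrors {{Utils.tryWithResource}}; the byte-reading function below is illustrative and not the {{FallbackStorage.read}} code.

{code:scala}
import java.io.{Closeable, InputStream}

// Stand-in for org.apache.spark.util.Utils.tryWithResource: the resource is
// closed in a finally block, so a failure inside the body cannot leak the handle.
def tryWithResource[R <: Closeable, T](createResource: => R)(f: R => T): T = {
  val resource = createResource
  try f(resource) finally resource.close()
}

// Illustrative usage: the stream is closed even if the read throws.
def readAllBytes(open: () => InputStream): Array[Byte] =
  tryWithResource(open()) { in =>
    Iterator.continually(in.read()).takeWhile(_ != -1).map(_.toByte).toArray
  }
{code}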



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47521) Use `Utils.tryWithResource` during reading shuffle data from external storage

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47521:
--
Target Version/s:   (was: 4.0.0)

> Use `Utils.tryWithResource` during reading shuffle data from external storage
> -
>
> Key: SPARK-47521
> URL: https://issues.apache.org/jira/browse/SPARK-47521
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.1, 3.5.0, 3.3.4
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.3
>
>
> In the {{FallbackStorage.read}} method, the file handle is not closed if 
> there is a failure during the read operation.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47521) Use `Utils.tryWithResource` during reading shuffle data from external storage

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47521:
--
Summary: Use `Utils.tryWithResource` during reading shuffle data from 
external storage  (was: Fix file handle leakage during shuffle data read from 
external storage.)

> Use `Utils.tryWithResource` during reading shuffle data from external storage
> -
>
> Key: SPARK-47521
> URL: https://issues.apache.org/jira/browse/SPARK-47521
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.1, 3.5.0, 3.3.4
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.3
>
>
> In the {{FallbackStorage.read}} method, the file handle is not closed if 
> there is a failure during the read operation.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47521) Fix file handle leakage during shuffle data read from external storage.

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47521:
--
Issue Type: Bug  (was: Improvement)

> Fix file handle leakage during shuffle data read from external storage.
> ---
>
> Key: SPARK-47521
> URL: https://issues.apache.org/jira/browse/SPARK-47521
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 3.4.1, 3.5.0, 3.3.4
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.3
>
>
> In the {{FallbackStorage.read}} method, the file handle is not closed if 
> there is a failure during the read operation.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47521) Fix file handle leakage during shuffle data read from external storage.

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47521.
---
Fix Version/s: 3.4.3
   3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 45663
[https://github.com/apache/spark/pull/45663]

> Fix file handle leakage during shuffle data read from external storage.
> ---
>
> Key: SPARK-47521
> URL: https://issues.apache.org/jira/browse/SPARK-47521
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.1, 3.5.0, 3.3.4
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.3, 3.5.2, 4.0.0
>
>
> In the {{FallbackStorage.read}} method, the file handle is not closed if 
> there is a failure during the read operation.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47521) Fix file handle leakage during shuffle data read from external storage.

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47521:
-

Assignee: mahesh kumar behera

> Fix file handle leakage during shuffle data read from external storage.
> ---
>
> Key: SPARK-47521
> URL: https://issues.apache.org/jira/browse/SPARK-47521
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.4.1, 3.5.0, 3.3.4
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>
> In the {{FallbackStorage.read}} method, the file handle is not closed if 
> there is a failure during the read operation.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47523) Replace Deprecated `JsonParser#getCurrentName` with `JsonParser#currentName`

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47523.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45668
[https://github.com/apache/spark/pull/45668]

> Replace Deprecated `JsonParser#getCurrentName` with `JsonParser#currentName`
> 
>
> Key: SPARK-47523
> URL: https://issues.apache.org/jira/browse/SPARK-47523
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> [https://github.com/FasterXML/jackson-core/blob/8fba680579885bf9cdae72e93f16de557056d6e3/src/main/java/com/fasterxml/jackson/core/JsonParser.java#L1521-L1551]
>  
> {code:java}
>     /**
>      * Deprecated alias of {@link #currentName()}.
>      *
>      * @return Name of the current field in the parsing context
>      *
>      * @throws IOException for low-level read issues, or
>      *   {@link JsonParseException} for decoding problems
>      *
>      * @deprecated Since 2.17 use {@link #currentName} instead.
>      */
>     @Deprecated
>     public abstract String getCurrentName() throws IOException;
>
>     /**
>      * Method that can be called to get the name associated with
>      * the current token: for {@link JsonToken#FIELD_NAME}s it will
>      * be the same as what {@link #getText} returns;
>      * for field values it will be preceding field name;
>      * and for others (array values, root-level values) null.
>      *
>      * @return Name of the current field in the parsing context
>      *
>      * @throws IOException for low-level read issues, or
>      *   {@link JsonParseException} for decoding problems
>      *
>      * @since 2.10
>      */
>     public String currentName() throws IOException {
>         // !!! TODO: switch direction in 2.18 or later
>         return getCurrentName();
>     } {code}
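The caller-side change is mechanical; here is a minimal sketch, assuming Jackson 2.10 or later on the classpath so that {{currentName()}} is available, with an illustrative sample document:

{code:scala}
import com.fasterxml.jackson.core.{JsonFactory, JsonToken}

val parser = new JsonFactory().createParser("""{"a": 1, "b": 2}""")
try {
  var token = parser.nextToken()
  while (token != null) {
    if (token == JsonToken.FIELD_NAME) {
      // Before: parser.getCurrentName()  -- deprecated since Jackson 2.17
      // After:  parser.currentName()
      println(parser.currentName())
    }
    token = parser.nextToken()
  }
} finally {
  parser.close()
}
{code}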



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47523) Replace Deprecated `JsonParser#getCurrentName` with `JsonParser#currentName`

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47523:
-

Assignee: Yang Jie

> Replace Deprecated `JsonParser#getCurrentName` with `JsonParser#currentName`
> 
>
> Key: SPARK-47523
> URL: https://issues.apache.org/jira/browse/SPARK-47523
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> [https://github.com/FasterXML/jackson-core/blob/8fba680579885bf9cdae72e93f16de557056d6e3/src/main/java/com/fasterxml/jackson/core/JsonParser.java#L1521-L1551]
>  
> {code:java}
>     /**
>      * Deprecated alias of {@link #currentName()}.
>      *
>      * @return Name of the current field in the parsing context
>      *
>      * @throws IOException for low-level read issues, or
>      *   {@link JsonParseException} for decoding problems
>      *
>      * @deprecated Since 2.17 use {@link #currentName} instead.
>      */
>     @Deprecated
>     public abstract String getCurrentName() throws IOException;
>
>     /**
>      * Method that can be called to get the name associated with
>      * the current token: for {@link JsonToken#FIELD_NAME}s it will
>      * be the same as what {@link #getText} returns;
>      * for field values it will be preceding field name;
>      * and for others (array values, root-level values) null.
>      *
>      * @return Name of the current field in the parsing context
>      *
>      * @throws IOException for low-level read issues, or
>      *   {@link JsonParseException} for decoding problems
>      *
>      * @since 2.10
>      */
>     public String currentName() throws IOException {
>         // !!! TODO: switch direction in 2.18 or later
>         return getCurrentName();
>     } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (ORC-1663) [C++] Enable TestTimezone.testMissingTZDB on Windows

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/ORC-1663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved ORC-1663.

Fix Version/s: 2.1.0
   Resolution: Fixed

Issue resolved by pull request 1856
[https://github.com/apache/orc/pull/1856]

> [C++] Enable TestTimezone.testMissingTZDB on Windows
> 
>
> Key: ORC-1663
> URL: https://issues.apache.org/jira/browse/ORC-1663
> Project: ORC
>  Issue Type: Sub-task
>Reporter: Gang Wu
>Assignee: Gang Wu
>Priority: Major
> Fix For: 2.1.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (SPARK-47499) Reuse `test_help_command` in Connect

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47499:
-

Assignee: Ruifeng Zheng

> Reuse `test_help_command` in Connect
> 
>
> Key: SPARK-47499
> URL: https://issues.apache.org/jira/browse/SPARK-47499
> Project: Spark
>  Issue Type: Test
>  Components: Connect, PySpark, Tests
>Affects Versions: 4.0.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47499) Reuse `test_help_command` in Connect

2024-03-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47499.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45634
[https://github.com/apache/spark/pull/45634]

> Reuse `test_help_command` in Connect
> 
>
> Key: SPARK-47499
> URL: https://issues.apache.org/jira/browse/SPARK-47499
> Project: Spark
>  Issue Type: Test
>  Components: Connect, PySpark, Tests
>Affects Versions: 4.0.0
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47513) Regenerate benchmark results

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47513:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Task)

> Regenerate benchmark results
> 
>
> Key: SPARK-47513
> URL: https://issues.apache.org/jira/browse/SPARK-47513
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47514) Add a test coverage for createTable method (partitioned-table) in CatalogSuite

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47514.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45637
[https://github.com/apache/spark/pull/45637]

> Add a test coverage for createTable method (partitioned-table) in CatalogSuite
> --
>
> Key: SPARK-47514
> URL: https://issues.apache.org/jira/browse/SPARK-47514
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47514) Add a test coverage for createTable method (partitioned-table) in CatalogSuite

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47514:
-

Assignee: BingKun Pan

> Add a test coverage for createTable method (partitioned-table) in CatalogSuite
> --
>
> Key: SPARK-47514
> URL: https://issues.apache.org/jira/browse/SPARK-47514
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47513) Regenerate benchmark results

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47513.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45654
[https://github.com/apache/spark/pull/45654]

> Regenerate benchmark results
> 
>
> Key: SPARK-47513
> URL: https://issues.apache.org/jira/browse/SPARK-47513
> Project: Spark
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47513) Regenerate benchmark results

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47513:
-

Assignee: Dongjoon Hyun

> Regenerate benchmark results
> 
>
> Key: SPARK-47513
> URL: https://issues.apache.org/jira/browse/SPARK-47513
> Project: Spark
>  Issue Type: Task
>  Components: Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47513) Regenerate benchmark results

2024-03-21 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-47513:
-

 Summary: Regenerate benchmark results
 Key: SPARK-47513
 URL: https://issues.apache.org/jira/browse/SPARK-47513
 Project: Spark
  Issue Type: Task
  Components: Tests
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47502) Make the `size` of the installed packages output in `descending` order, and add a `header`

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47502.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45630
[https://github.com/apache/spark/pull/45630]

> Make the `size` of the installed packages output in `descending` order, and 
> add a `header`
> --
>
> Key: SPARK-47502
> URL: https://issues.apache.org/jira/browse/SPARK-47502
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47502) Make the `size` of the installed packages output in `descending` order, and add a `header`

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47502:
-

Assignee: BingKun Pan

> Make the `size` of the installed packages output in `descending` order, and 
> add a `header`
> --
>
> Key: SPARK-47502
> URL: https://issues.apache.org/jira/browse/SPARK-47502
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47507) Upgrade ORC to 1.9.3

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47507.
---
Fix Version/s: 3.5.2
   Resolution: Fixed

Issue resolved by pull request 45646
[https://github.com/apache/spark/pull/45646]

> Upgrade ORC to 1.9.3
> 
>
> Key: SPARK-47507
> URL: https://issues.apache.org/jira/browse/SPARK-47507
> Project: Spark
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 3.5.2
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.2
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-47510) Fix `DSTREAM` label pattern in `labeler.yml`

2024-03-21 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-47510:
-

 Summary: Fix `DSTREAM` label pattern in `labeler.yml`
 Key: SPARK-47510
 URL: https://issues.apache.org/jira/browse/SPARK-47510
 Project: Spark
  Issue Type: Bug
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47505) Fix `pyspark-errors` test jobs for branch-3.4

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47505:
-

Assignee: BingKun Pan

> Fix `pyspark-errors` test jobs for branch-3.4
> -
>
> Key: SPARK-47505
> URL: https://issues.apache.org/jira/browse/SPARK-47505
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 3.4.3
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47505) Fix `pyspark-errors` test jobs for branch-3.4

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47505.
---
Fix Version/s: 3.4.3
   Resolution: Fixed

Issue resolved by pull request 45624
[https://github.com/apache/spark/pull/45624]

> Fix `pyspark-errors` test jobs for branch-3.4
> -
>
> Key: SPARK-47505
> URL: https://issues.apache.org/jira/browse/SPARK-47505
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 3.4.3
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
> Fix For: 3.4.3
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-47046) Apache Spark 4.0.0 Dependency Audit and Cleanup

2024-03-21 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-47046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17829639#comment-17829639
 ] 

Dongjoon Hyun commented on SPARK-47046:
---

All known issues except SPARK-47018 are resolved as of now. SPARK-47018 will 
handle the Hive 2.3.10 (and Thrift) upgrade under SPARK-44111.

> Apache Spark 4.0.0 Dependency Audit and Cleanup
> ---
>
> Key: SPARK-47046
> URL: https://issues.apache.org/jira/browse/SPARK-47046
> Project: Spark
>  Issue Type: Umbrella
>  Components: Build
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: releasenotes
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47046) Apache Spark 4.0.0 Dependency Audit and Cleanup

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47046.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

> Apache Spark 4.0.0 Dependency Audit and Cleanup
> ---
>
> Key: SPARK-47046
> URL: https://issues.apache.org/jira/browse/SPARK-47046
> Project: Spark
>  Issue Type: Umbrella
>  Components: Build
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: releasenotes
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47046) Apache Spark 4.0.0 Dependency Audit and Cleanup

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47046:
-

Assignee: Dongjoon Hyun

> Apache Spark 4.0.0 Dependency Audit and Cleanup
> ---
>
> Key: SPARK-47046
> URL: https://issues.apache.org/jira/browse/SPARK-47046
> Project: Spark
>  Issue Type: Umbrella
>  Components: Build
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: releasenotes
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: [ANNOUNCE] Announcing Apache ORC 1.9.3

2024-03-21 Thread Dongjoon Hyun
Thank you!

Dongjoon.

On Wed, Mar 20, 2024 at 10:15 PM Gang Wu  wrote:

> Hi All.
>
> We are happy to announce the availability of Apache ORC 1.9.3!
>
> https://orc.apache.org/news/2024/03/20/ORC-1.9.3/
>
> 1.9.3 is a maintenance release containing important fixes.
> It's available in Apache Downloads and Maven Central.
>
> https://downloads.apache.org/orc/orc-1.9.3/
> https://repo1.maven.org/maven2/org/apache/orc/orc-core/1.9.3/
>
> Cheers,
> Gang
>


[jira] [Assigned] (SPARK-47487) simplify code in AnsiTypeCoercion

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47487:
-

Assignee: Wenchen Fan

> simplify code in AnsiTypeCoercion
> -
>
> Key: SPARK-47487
> URL: https://issues.apache.org/jira/browse/SPARK-47487
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47487) simplify code in AnsiTypeCoercion

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47487.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45612
[https://github.com/apache/spark/pull/45612]

> simplify code in AnsiTypeCoercion
> -
>
> Key: SPARK-47487
> URL: https://issues.apache.org/jira/browse/SPARK-47487
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47501) Add convertDateToDate like the existing convertTimestampToTimestamp for JdbcDialects

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47501.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45638
[https://github.com/apache/spark/pull/45638]

> Add convertDateToDate like the existing convertTimestampToTimestamp for 
> JdbcDialects
> 
>
> Key: SPARK-47501
> URL: https://issues.apache.org/jira/browse/SPARK-47501
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47501) Add convertDateToDate like the existing convertTimestampToTimestamp for JdbcDialects

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47501:
-

Assignee: Kent Yao

> Add convertDateToDate like the existing convertTimestampToTimestamp for 
> JdbcDialects
> 
>
> Key: SPARK-47501
> URL: https://issues.apache.org/jira/browse/SPARK-47501
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Kent Yao
>Assignee: Kent Yao
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-41888) Support StreamingQueryListener for DataFrame.observe

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-41888.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45627
[https://github.com/apache/spark/pull/45627]

> Support StreamingQueryListener for DataFrame.observe
> 
>
> Key: SPARK-41888
> URL: https://issues.apache.org/jira/browse/SPARK-41888
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Jiaan Geng
>Assignee: Jiaan Geng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code:java}
> **
> File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 619, in 
> pyspark.sql.connect.dataframe.DataFrame.observe
> Failed example:
> observation.get
> Exception raised:
> Traceback (most recent call last):
>   File "/usr/lib/python3.9/doctest.py", line 1336, in __run
> exec(compile(example.source, filename, "single",
>   File "", 
> line 1, in 
> observation.get
>   File "/__w/spark/spark/python/pyspark/sql/utils.py", line 378, in 
> wrapped
> raise NotImplementedError()
> NotImplementedError
> **
> File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 642, in 
> pyspark.sql.connect.dataframe.DataFrame.observe
> Failed example:
> spark.streams.addListener(MyErrorListener())
> Exception raised:
> Traceback (most recent call last):
>   File "/usr/lib/python3.9/doctest.py", line 1336, in __run
> exec(compile(example.source, filename, "single",
>   File "", 
> line 1, in 
> spark.streams.addListener(MyErrorListener())
> AttributeError: 'SparkSession' object has no attribute 'streams'
> **
> {code}
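For reference, the combination the summary refers to looks roughly like the following, shown with the classic Scala API rather than the Spark Connect Python API from the traceback above. The observation name, the count aggregate, and the rate/console source and sink are illustrative assumptions.

{code:scala}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, lit}
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

val spark = SparkSession.builder().master("local[2]").appName("observe-listener").getOrCreate()

// Listener that reads the observed metrics attached to each progress event.
spark.streams.addListener(new StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()
  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    Option(event.progress.observedMetrics.get("my_event")).foreach { row =>
      println(s"rows in this batch: ${row.getAs[Long]("rc")}")
    }
  }
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
})

val observed = spark.readStream.format("rate").load()
  .observe("my_event", count(lit(1)).as("rc"))
val query = observed.writeStream.format("console").start()
{code}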



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-41888) Support StreamingQueryListener for DataFrame.observe

2024-03-21 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-41888:
-

Assignee: Jiaan Geng

> Support StreamingQueryListener for DataFrame.observe
> 
>
> Key: SPARK-41888
> URL: https://issues.apache.org/jira/browse/SPARK-41888
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Jiaan Geng
>Assignee: Jiaan Geng
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> **
> File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 619, in 
> pyspark.sql.connect.dataframe.DataFrame.observe
> Failed example:
> observation.get
> Exception raised:
> Traceback (most recent call last):
>   File "/usr/lib/python3.9/doctest.py", line 1336, in __run
> exec(compile(example.source, filename, "single",
>   File "", 
> line 1, in 
> observation.get
>   File "/__w/spark/spark/python/pyspark/sql/utils.py", line 378, in 
> wrapped
> raise NotImplementedError()
> NotImplementedError
> **
> File "/__w/spark/spark/python/pyspark/sql/connect/dataframe.py", line 642, in 
> pyspark.sql.connect.dataframe.DataFrame.observe
> Failed example:
> spark.streams.addListener(MyErrorListener())
> Exception raised:
> Traceback (most recent call last):
>   File "/usr/lib/python3.9/doctest.py", line 1336, in __run
> exec(compile(example.source, filename, "single",
>   File "", 
> line 1, in 
> spark.streams.addListener(MyErrorListener())
> AttributeError: 'SparkSession' object has no attribute 'streams'
> **
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47008) Spark to support S3 Express One Zone Storage

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47008:
--
Affects Version/s: 4.0.0
   (was: 3.5.1)

> Spark to support S3 Express One Zone Storage
> 
>
> Key: SPARK-47008
> URL: https://issues.apache.org/jira/browse/SPARK-47008
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Steve Loughran
>Priority: Major
>
> Hadoop 3.4.0 adds support for AWS S3 Express One Zone Storage.
> Most of this is transparent. However, one aspect which can surface as an 
> issue is that these stores report prefixes in a listing when there are 
> pending uploads, *even when there are no files underneath*.
> This leads to a situation where a listStatus of a path returns a list of file 
> status entries which appears to contain one or more directories, but a 
> listStatus on such a path raises a FileNotFoundException: there is nothing 
> there.
> HADOOP-18996 handles this in all of the Hadoop code, including FileInputFormat. 
> A filesystem can now be probed for inconsistent directory listings through 
> {{fs.hasPathCapability(path, "fs.capability.directory.listing.inconsistent")}}.
> If true, then treewalking code SHOULD NOT report a failure if, when walking 
> into a subdirectory, a list/getFileStatus on that directory raises a 
> FileNotFoundException.
> Although most of this is handled in the Hadoop code, there are some places 
> where treewalking is done inside Spark. These need to be identified and made 
> resilient to failure on the recursion down the tree:
> * SparkHadoopUtil list methods, 
> * especially listLeafStatuses used by OrcFileOperator
> * org.apache.spark.util.Utils#fetchHcfsFile
> {{org.apache.hadoop.fs.FileUtil.maybeIgnoreMissingDirectory()}} can assist 
> here, or the logic can be replicated. Using the Hadoop implementation would 
> be better from a maintenance perspective.
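As a reference point, below is a minimal Scala sketch of such a resilient tree walk, assuming Hadoop 3.4.0's {{hasPathCapability}} probe and the capability string quoted in the description; the walker itself and its name are illustrative, not Spark's actual listing code.

{code:scala}
import java.io.FileNotFoundException
import org.apache.hadoop.fs.{FileStatus, FileSystem, Path}
import scala.collection.mutable.ArrayBuffer

def listLeafFiles(fs: FileSystem, root: Path): Seq[Path] = {
  // Stores such as S3 Express One Zone may list a prefix (as a "directory")
  // for a pending upload even though nothing exists underneath it yet.
  val mayBeInconsistent =
    fs.hasPathCapability(root, "fs.capability.directory.listing.inconsistent")
  val leaves = ArrayBuffer.empty[Path]
  def walk(dir: Path): Unit = {
    val children =
      try fs.listStatus(dir)
      catch {
        // Tolerate the phantom directory: it was listed by the parent but is gone now.
        case _: FileNotFoundException if mayBeInconsistent => Array.empty[FileStatus]
      }
    children.foreach { st =>
      if (st.isDirectory) walk(st.getPath) else leaves += st.getPath
    }
  }
  walk(root)
  leaves.toSeq
}
{code}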



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47008) Spark to support S3 Express One Zone Storage

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47008:
--
Parent Issue: SPARK-44111  (was: SPARK-44124)

> Spark to support S3 Express One Zone Storage
> 
>
> Key: SPARK-47008
> URL: https://issues.apache.org/jira/browse/SPARK-47008
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Steve Loughran
>Priority: Major
>
> Hadoop 3.4.0 adds support for AWS S3 Express One Zone Storage.
> Most of this is transparent. However, one aspect which can surface as an 
> issue is that these stores report prefixes in a listing when there are 
> pending uploads, *even when there are no files underneath*.
> This leads to a situation where a listStatus of a path returns a list of file 
> status entries which appears to contain one or more directories, but a 
> listStatus on such a path raises a FileNotFoundException: there is nothing 
> there.
> HADOOP-18996 handles this in all of the Hadoop code, including FileInputFormat. 
> A filesystem can now be probed for inconsistent directory listings through 
> {{fs.hasPathCapability(path, "fs.capability.directory.listing.inconsistent")}}.
> If true, then treewalking code SHOULD NOT report a failure if, when walking 
> into a subdirectory, a list/getFileStatus on that directory raises a 
> FileNotFoundException.
> Although most of this is handled in the Hadoop code, there are some places 
> where treewalking is done inside Spark. These need to be identified and made 
> resilient to failure on the recursion down the tree:
> * SparkHadoopUtil list methods, 
> * especially listLeafStatuses used by OrcFileOperator
> * org.apache.spark.util.Utils#fetchHcfsFile
> {{org.apache.hadoop.fs.FileUtil.maybeIgnoreMissingDirectory()}} can assist 
> here, or the logic can be replicated. Using the Hadoop implementation would 
> be better from a maintenance perspective.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-46793) Revert S3A endpoint fixup logic of SPARK-35878

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun closed SPARK-46793.
-

> Revert S3A endpoint fixup logic of SPARK-35878
> --
>
> Key: SPARK-46793
> URL: https://issues.apache.org/jira/browse/SPARK-46793
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.5.0, 3.4.3
>Reporter: Steve Loughran
>Priority: Major
>
> The v2 SDK does its region resolution "differently", and the changes of 
> SPARK-35878 actually create problems.
> That PR went in to fix a regression in Hadoop 3.3.1 which has been fixed 
> since 3.3.2; removing it is not going to cause problems for anyone not using 
> the 3.3.1 release, which is 3 years old and has been replaced by multiple 
> follow-on 3.3.x releases.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-46793) Revert S3A endpoint fixup logic of SPARK-35878

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-46793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-46793.
---
Resolution: Duplicate

> Revert S3A endpoint fixup logic of SPARK-35878
> --
>
> Key: SPARK-46793
> URL: https://issues.apache.org/jira/browse/SPARK-46793
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 3.5.0, 3.4.3
>Reporter: Steve Loughran
>Priority: Major
>
> The v2 SDK does its region resolution "differently", and the changes of 
> SPARK-35878 actually create problems.
> That PR went in to fix a regression in Hadoop 3.3.1 which has been fixed 
> since 3.3.2; removing it is not going to cause problems for anyone not using 
> the 3.3.1 release, which is 3 years old and has been replaced by multiple 
> follow-on 3.3.x releases.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-45721) Upgrade AWS SDK to v2 for Hadoop dependency

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-45721.
---
Resolution: Duplicate

> Upgrade AWS SDK to v2 for Hadoop dependency
> ---
>
> Key: SPARK-45721
> URL: https://issues.apache.org/jira/browse/SPARK-45721
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Lantao Jin
>Priority: Major
>
> Hadoop is planning to ship the SDKv2 upgrade in 3.4.0, as shown in 
> [HADOOP-18073|https://issues.apache.org/jira/browse/HADOOP-18073]. One of the 
> Hadoop modules that Spark relies on is *hadoop-aws*, which comes with the S3A 
> connector that allows Spark to access data in S3 buckets. Hadoop-aws contains 
> a dependency on AWS SDKv1, so we should also update the Hadoop version to 
> 3.4.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (SPARK-45721) Upgrade AWS SDK to v2 for Hadoop dependency

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun closed SPARK-45721.
-

> Upgrade AWS SDK to v2 for Hadoop dependency
> ---
>
> Key: SPARK-45721
> URL: https://issues.apache.org/jira/browse/SPARK-45721
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Lantao Jin
>Priority: Major
>
> Hadoop is planning to ship the SDKv2 upgrade in 3.4.0, as shown in 
> [HADOOP-18073|https://issues.apache.org/jira/browse/HADOOP-18073]. One of the 
> Hadoop modules that Spark relies on is *hadoop-aws*, which comes with the S3A 
> connector that allows Spark to access data in S3 buckets. Hadoop-aws contains 
> a dependency on AWS SDKv1, so we should also update the Hadoop version to 
> 3.4.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47491) Re-enable `driver log links` test in YarnClusterSuite

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47491:
-

Assignee: Dongjoon Hyun

> Re-enable `driver log links` test in YarnClusterSuite
> -
>
> Key: SPARK-47491
> URL: https://issues.apache.org/jira/browse/SPARK-47491
> Project: Spark
>  Issue Type: Sub-task
>  Components: Tests, YARN
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-47054) Remove pinned version of torch for Python 3.12 support

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-47054:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Bug)

> Remove pinned version of torch for Python 3.12 support
> --
>
> Key: SPARK-47054
> URL: https://issues.apache.org/jira/browse/SPARK-47054
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark, Tests
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Basically a revert of SPARK-45409



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47494) Add migration doc for the behavior change of Parquet timestamp inference since Spark 3.3

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47494.
---
Fix Version/s: 3.4.3
   3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 45623
[https://github.com/apache/spark/pull/45623]

> Add migration doc for the behavior change of Parquet timestamp inference 
> since Spark 3.3
> 
>
> Key: SPARK-47494
> URL: https://issues.apache.org/jira/browse/SPARK-47494
> Project: Spark
>  Issue Type: Sub-task
>  Components: Documentation
>Affects Versions: 4.0.0, 3.5.2, 3.4.3
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.3, 3.5.2, 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-47490) Fix RocksDB Logger deprecation warning

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-47490.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 45616
[https://github.com/apache/spark/pull/45616]

> Fix RocksDB Logger deprecation warning
> --
>
> Key: SPARK-47490
> URL: https://issues.apache.org/jira/browse/SPARK-47490
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: Anish Shrigondekar
>Assignee: Anish Shrigondekar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Fix RocksDB Logger deprecation warning



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-47490) Fix RocksDB Logger deprecation warning

2024-03-20 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-47490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-47490:
-

Assignee: Anish Shrigondekar

> Fix RocksDB Logger deprecation warning
> --
>
> Key: SPARK-47490
> URL: https://issues.apache.org/jira/browse/SPARK-47490
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Affects Versions: 4.0.0
>Reporter: Anish Shrigondekar
>Assignee: Anish Shrigondekar
>Priority: Major
>  Labels: pull-request-available
>
> Fix RocksDB Logger deprecation warning



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


