[jira] [Commented] (SPARK-48149) Serialize `build_python.yml` to run a single Python version per cron schedule

2024-05-08 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-48149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844713#comment-17844713
 ] 

Dongjoon Hyun commented on SPARK-48149:
---

This is technically reverted via SPARK-48200
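For context on the mechanism named in the title: the idea was to keep a single `build_python.yml` workflow with several cron entries and have each trigger run exactly one Python version. A minimal, hypothetical sketch (the cron slots and version list are illustrative, not the actual contents of the Spark workflow):

{code:yaml}
# Illustrative sketch only -- not the real apache/spark build_python.yml.
# github.event.schedule holds the cron expression that fired the run, so each
# schedule slot maps to a single Python version instead of a full matrix.
name: Build / Python-only (illustrative)
on:
  schedule:
    - cron: '0 15 * * *'   # hypothetical slot for Python 3.10
    - cron: '0 17 * * *'   # hypothetical slot for Python 3.11
    - cron: '0 19 * * *'   # hypothetical slot for Python 3.12
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Pick the Python version for this schedule
        id: pick
        run: |
          case "${{ github.event.schedule }}" in
            "0 15 * * *") echo "version=3.10" >> "$GITHUB_OUTPUT" ;;
            "0 17 * * *") echo "version=3.11" >> "$GITHUB_OUTPUT" ;;
            *)            echo "version=3.12" >> "$GITHUB_OUTPUT" ;;
          esac
      - name: Run the PySpark build for that version only
        run: echo "Would test with Python ${{ steps.pick.outputs.version }}"
{code}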

> Serialize `build_python.yml` to run a single Python version per cron schedule
> -
>
> Key: SPARK-48149
> URL: https://issues.apache.org/jira/browse/SPARK-48149
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48200) Split `build_python.yml` into per-version cron jobs

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48200.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46477
[https://github.com/apache/spark/pull/46477]

> Split `build_python.yml` into per-version cron jobs
> ---
>
> Key: SPARK-48200
> URL: https://issues.apache.org/jira/browse/SPARK-48200
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48149) Serialize `build_python.yml` to run a single Python version per cron schedule

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48149:
-

Assignee: (was: Dongjoon Hyun)

> Serialize `build_python.yml` to run a single Python version per cron schedule
> -
>
> Key: SPARK-48149
> URL: https://issues.apache.org/jira/browse/SPARK-48149
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48200) Split `build_python.yml` into per-version cron jobs

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48200:
-

Assignee: Dongjoon Hyun

> Split `build_python.yml` into per-version cron jobs
> ---
>
> Key: SPARK-48200
> URL: https://issues.apache.org/jira/browse/SPARK-48200
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Created] (SPARK-48200) Split `build_python.yml` into per-version cron jobs

2024-05-08 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48200:
-

 Summary: Split `build_python.yml` into per-version cron jobs
 Key: SPARK-48200
 URL: https://issues.apache.org/jira/browse/SPARK-48200
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun









[jira] [Updated] (SPARK-48198) Upgrade jackson to 2.17.1

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48198:
--
Parent: SPARK-47046
Issue Type: Sub-task  (was: Improvement)

> Upgrade jackson to 2.17.1
> -
>
> Key: SPARK-48198
> URL: https://issues.apache.org/jira/browse/SPARK-48198
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48198) Upgrade jackson to 2.17.1

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48198.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46476
[https://github.com/apache/spark/pull/46476]

> Upgrade jackson to 2.17.1
> -
>
> Key: SPARK-48198
> URL: https://issues.apache.org/jira/browse/SPARK-48198
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Commented] (SPARK-48094) Reduce GitHub Action usage according to ASF project allowance

2024-05-08 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-48094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844690#comment-17844690
 ] 

Dongjoon Hyun commented on SPARK-48094:
---

Actually, I'm afraid not, because we still have some violations.

The ASF INFRA policy follows IETF terminology:
{quote}1. MUST This word, or the terms "REQUIRED" or "SHALL", mean that the 
definition is an absolute requirement of the specification.
{quote}
And three of the four policies are at the `MUST` level, as follows:
 - All workflows MUST have a job concurrency level less than or equal to 20. 
This means a workflow cannot have more than 20 jobs running at the same time 
across all matrices.
 - The average number of minutes a project uses per calendar week MUST NOT 
exceed the equivalent of 25 full-time runners (250,000 minutes, or 4,200 hours).
 - The average number of minutes a project uses in any consecutive five-day 
period MUST NOT exceed the equivalent of 30 full-time runners (216,000 minutes, 
or 3,600 hours).

Let me reopen this. We need to audit the workflows and add a comment to all YAML 
files in order to prevent a future regression, [~gurwls223].
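As an illustration of the kind of comment and cap such an audit could add to each workflow file, a hypothetical sketch (a workflow fragment; the module names and limits below are examples, not the actual Spark configuration):

{code:yaml}
# ASF Infra policy (https://infra.apache.org/github-actions-policy.html):
# a workflow MUST NOT run more than 20 jobs at once across all matrices.
concurrency:
  group: example-${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true    # avoid stacking duplicate runs for the same ref
jobs:
  tests:
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 15        # stays under the MUST (20) and meets the SHOULD (15)
      matrix:
        module: [core, sql, python, connect]   # illustrative module list
    steps:
      - run: echo "Running ${{ matrix.module }} tests"
{code}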

> Reduce GitHub Action usage according to ASF project allowance
> -
>
> Key: SPARK-48094
> URL: https://issues.apache.org/jira/browse/SPARK-48094
> Project: Spark
>  Issue Type: Umbrella
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Blocker
> Fix For: 4.0.0
>
> Attachments: Screenshot 2024-05-02 at 23.56.05.png
>
>
> h2. ASF INFRA POLICY
> - https://infra.apache.org/github-actions-policy.html
> h2. MONITORING
> - https://infra-reports.apache.org/#ghactions=spark=168
>  !Screenshot 2024-05-02 at 23.56.05.png|width=100%! 
> h2. TARGET
> * All workflows MUST have a job concurrency level less than or equal to 20. 
> This means a workflow cannot have more than 20 jobs running at the same time 
> across all matrices.
> * All workflows SHOULD have a job concurrency level less than or equal to 15. 
> Just because 20 is the max, doesn't mean you should strive for 20.
> * The average number of minutes a project uses per calendar week MUST NOT 
> exceed the equivalent of 25 full-time runners (250,000 minutes, or 4,200 
> hours).
> * The average number of minutes a project uses in any consecutive five-day 
> period MUST NOT exceed the equivalent of 30 full-time runners (216,000 
> minutes, or 3,600 hours).
> h2. DEADLINE
> bq. 17th of May, 2024
> Since the deadline is 17th of May, 2024, I set this as the highest priority, 
> `Blocker`.






[jira] [Reopened] (SPARK-48094) Reduce GitHub Action usage according to ASF project allowance

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reopened SPARK-48094:
---
  Assignee: (was: Dongjoon Hyun)

> Reduce GitHub Action usage according to ASF project allowance
> -
>
> Key: SPARK-48094
> URL: https://issues.apache.org/jira/browse/SPARK-48094
> Project: Spark
>  Issue Type: Umbrella
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Blocker
> Fix For: 4.0.0
>
> Attachments: Screenshot 2024-05-02 at 23.56.05.png
>
>
> h2. ASF INFRA POLICY
> - https://infra.apache.org/github-actions-policy.html
> h2. MONITORING
> - https://infra-reports.apache.org/#ghactions=spark=168
>  !Screenshot 2024-05-02 at 23.56.05.png|width=100%! 
> h2. TARGET
> * All workflows MUST have a job concurrency level less than or equal to 20. 
> This means a workflow cannot have more than 20 jobs running at the same time 
> across all matrices.
> * All workflows SHOULD have a job concurrency level less than or equal to 15. 
> Just because 20 is the max, doesn't mean you should strive for 20.
> * The average number of minutes a project uses per calendar week MUST NOT 
> exceed the equivalent of 25 full-time runners (250,000 minutes, or 4,200 
> hours).
> * The average number of minutes a project uses in any consecutive five-day 
> period MUST NOT exceed the equivalent of 30 full-time runners (216,000 
> minutes, or 3,600 hours).
> h2. DEADLINE
> bq. 17th of May, 2024
> Since the deadline is 17th of May, 2024, I set this as the highest priority, 
> `Blocker`.






[jira] [Resolved] (SPARK-48184) Always set the seed of dataframe.sample in Client side

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48184.
---
Fix Version/s: 3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 46456
[https://github.com/apache/spark/pull/46456]

> Always set the seed of dataframe.sample in Client side
> --
>
> Key: SPARK-48184
> URL: https://issues.apache.org/jira/browse/SPARK-48184
> Project: Spark
>  Issue Type: Bug
>  Components: Connect, PySpark
>Affects Versions: 4.0.0, 3.5.1, 3.4.3
>Reporter: Ruifeng Zheng
>Assignee: Ruifeng Zheng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.2, 4.0.0
>
>
> the output dataframe of `sample` is not immutable in Spark Connect
>  
> In Spark Classic:
> {code:java}
> In [1]: df = spark.range(1).sample(0.1)
> In [2]: [df.count() for i in range(10)]
> Out[2]: [1006, 1006, 1006, 1006, 1006, 1006, 1006, 1006, 1006, 1006]{code}
>  
> In Spark Connect:
> {code:java}
> In [1]: df = spark.range(1).sample(0.1)
> In [2]: [df.count() for i in range(10)]
> Out[2]: [969, 1005, 958, 996, 987, 1026, 991, 1020, 1012, 979]
>  {code}
>  
>  
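A minimal PySpark sketch of the intended behavior once the seed is fixed on the client side (the range size and seed value here are chosen for illustration only):

{code:python}
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# With an explicit seed, repeated evaluations sample the same rows, so the
# counts agree in Spark Classic and Spark Connect alike.
df = spark.range(10000).sample(fraction=0.1, seed=42)
counts = [df.count() for _ in range(10)]
assert len(set(counts)) == 1
{code}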






[jira] [Updated] (SPARK-48193) Make `maven-deploy-plugin` retry 3 times

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48193:
--
Parent: SPARK-48094
Issue Type: Sub-task  (was: Improvement)

> Make `maven-deploy-plugin` retry 3 times
> 
>
> Key: SPARK-48193
> URL: https://issues.apache.org/jira/browse/SPARK-48193
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
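For reference, the retry count maps to the plugin's standard `retryFailedDeploymentCount` parameter; a hypothetical sketch of the POM configuration (the plugin version shown is illustrative):

{code:xml}
<!-- Illustrative sketch: retry failed artifact uploads up to 3 times. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-deploy-plugin</artifactId>
  <version>3.1.1</version>  <!-- illustrative version -->
  <configuration>
    <retryFailedDeploymentCount>3</retryFailedDeploymentCount>
  </configuration>
</plugin>
{code}

The same knob can also be passed on the command line as `-DretryFailedDeploymentCount=3`.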







[jira] [Updated] (SPARK-48037) SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48037:
--
Fix Version/s: 3.4.4

> SortShuffleWriter lacks shuffle write related metrics resulting in 
> potentially inaccurate data
> --
>
> Key: SPARK-48037
> URL: https://issues.apache.org/jira/browse/SPARK-48037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 3.3.0, 4.0.0, 3.5.1, 3.4.3
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Blocker
>  Labels: correctness, pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.4
>
>







[jira] [Updated] (SPARK-48187) Run `docs` only in PR builders and `build_non_ansi` Daily CI

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48187:
--
Summary: Run `docs` only in PR builders and `build_non_ansi` Daily CI  
(was: Run `docs` only in PR builders and Java 21 Daily CI)
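For context, restricting a job like `docs` to PR builders plus one daily CI usually comes down to an `if:` condition on the triggering event; a hypothetical sketch (the `build_non_ansi` input name is illustrative, not the actual Spark workflow parameter):

{code:yaml}
# Illustrative sketch only -- not the actual apache/spark workflow wiring.
on:
  pull_request:
  workflow_call:
    inputs:
      build_non_ansi:        # hypothetical flag set only by the non-ANSI daily CI
        type: boolean
        default: false
jobs:
  docs:
    # Run for every PR builder; for scheduled/called builds, run only when the
    # caller identifies itself as the non-ANSI daily job.
    if: github.event_name == 'pull_request' || inputs.build_non_ansi
    runs-on: ubuntu-latest
    steps:
      - run: echo "Building documentation"
{code}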

> Run `docs` only in PR builders and `build_non_ansi` Daily CI
> 
>
> Key: SPARK-48187
> URL: https://issues.apache.org/jira/browse/SPARK-48187
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48187) Run `docs` only in PR builders and Java 21 Daily CI

2024-05-08 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48187.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46463
[https://github.com/apache/spark/pull/46463]

> Run `docs` only in PR builders and Java 21 Daily CI
> ---
>
> Key: SPARK-48187
> URL: https://issues.apache.org/jira/browse/SPARK-48187
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Created] (SPARK-48187) Run `docs` only in PR builders and Java 21 Daily CI

2024-05-08 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48187:
-

 Summary: Run `docs` only in PR builders and Java 21 Daily CI
 Key: SPARK-48187
 URL: https://issues.apache.org/jira/browse/SPARK-48187
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun









[jira] [Updated] (SPARK-48138) Disable a flaky `SparkSessionE2ESuite.interrupt tag` test

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48138:
--
Fix Version/s: 3.5.2

> Disable a flaky `SparkSessionE2ESuite.interrupt tag` test
> -
>
> Key: SPARK-48138
> URL: https://issues.apache.org/jira/browse/SPARK-48138
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2
>
>
> - https://github.com/apache/spark/actions/runs/8962353911/job/24611130573 
> (Master, 5/5)
> - https://github.com/apache/spark/actions/runs/8948176536/job/24581022674 
> (Master, 5/4)






[jira] [Updated] (SPARK-48139) Re-enable `SparkSessionE2ESuite.interrupt tag`

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48139:
--
Affects Version/s: 3.5.2

> Re-enable `SparkSessionE2ESuite.interrupt tag`
> --
>
> Key: SPARK-48139
> URL: https://issues.apache.org/jira/browse/SPARK-48139
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Tests
>Affects Versions: 4.0.0, 3.5.2
>    Reporter: Dongjoon Hyun
>Priority: Blocker
>







[jira] [Updated] (SPARK-48037) SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48037:
--
Fix Version/s: 3.5.2

> SortShuffleWriter lacks shuffle write related metrics resulting in 
> potentially inaccurate data
> --
>
> Key: SPARK-48037
> URL: https://issues.apache.org/jira/browse/SPARK-48037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 3.3.0, 4.0.0, 3.5.1, 3.4.3
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Blocker
>  Labels: correctness, pull-request-available
> Fix For: 4.0.0, 3.5.2
>
>







[jira] [Resolved] (SPARK-48183) Update error contribution guide to respect new error class file

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48183.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46455
[https://github.com/apache/spark/pull/46455]

> Update error contribution guide to respect new error class file
> ---
>
> Key: SPARK-48183
> URL: https://issues.apache.org/jira/browse/SPARK-48183
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We moved error class definition from .py to .json but documentation still 
> shows old behavior. We should update it.






[jira] [Assigned] (SPARK-48183) Update error contribution guide to respect new error class file

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48183:
-

Assignee: Haejoon Lee

> Update error contribution guide to respect new error class file
> ---
>
> Key: SPARK-48183
> URL: https://issues.apache.org/jira/browse/SPARK-48183
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Haejoon Lee
>Assignee: Haejoon Lee
>Priority: Major
>  Labels: pull-request-available
>
> We moved error class definition from .py to .json but documentation still 
> shows old behavior. We should update it.






[jira] [Resolved] (SPARK-48152) Make spark-profiler as a part of release and publish to maven central repo

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48152.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46402
[https://github.com/apache/spark/pull/46402]

> Make spark-profiler as a part of release and publish to maven central repo
> --
>
> Key: SPARK-48152
> URL: https://issues.apache.org/jira/browse/SPARK-48152
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Documentation
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48152) Make spark-profiler as a part of release and publish to maven central repo

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48152:
-

Assignee: BingKun Pan

> Make spark-profiler as a part of release and publish to maven central repo
> --
>
> Key: SPARK-48152
> URL: https://issues.apache.org/jira/browse/SPARK-48152
> Project: Spark
>  Issue Type: Improvement
>  Components: Build, Documentation
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48178) Run `build/scala-213/java-11-17` jobs of branch-3.5 only if needed

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48178.
---
Fix Version/s: 3.5.2
   Resolution: Fixed

Issue resolved by pull request 46449
[https://github.com/apache/spark/pull/46449]

> Run `build/scala-213/java-11-17` jobs of branch-3.5 only if needed
> --
>
> Key: SPARK-48178
> URL: https://issues.apache.org/jira/browse/SPARK-48178
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 3.5.2
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.2
>
>







[jira] [Assigned] (SPARK-48178) Run `build/scala-213/java-11-17` jobs of branch-3.5 only if needed

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48178:
-

Assignee: Dongjoon Hyun

> Run `build/scala-213/java-11-17` jobs of branch-3.5 only if needed
> --
>
> Key: SPARK-48178
> URL: https://issues.apache.org/jira/browse/SPARK-48178
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 3.5.2
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-48178) Run `build/scala-213/java-11-17` jobs of branch-3.5 only if needed

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48178:
--
Summary: Run `build/scala-213/java-11-17` jobs of branch-3.5 only if needed 
 (was: Run `build/scala-211/java-11-17` jobs of branch-3.5 only if needed)

> Run `build/scala-213/java-11-17` jobs of branch-3.5 only if needed
> --
>
> Key: SPARK-48178
> URL: https://issues.apache.org/jira/browse/SPARK-48178
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 3.5.2
>    Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Resolved] (SPARK-48179) Pin `nbsphinx` to `0.9.3`

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48179.
---
Fix Version/s: 3.5.2
   Resolution: Fixed

Issue resolved by pull request 46448
[https://github.com/apache/spark/pull/46448]

>  Pin `nbsphinx` to `0.9.3`
> --
>
> Key: SPARK-48179
> URL: https://issues.apache.org/jira/browse/SPARK-48179
> Project: Spark
>  Issue Type: Bug
>  Components: Project Infra
>Affects Versions: 3.5.2
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.2
>
>







[jira] [Assigned] (SPARK-48179) Pin `nbsphinx` to `0.9.3`

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48179:
-

Assignee: Dongjoon Hyun

>  Pin `nbsphinx` to `0.9.3`
> --
>
> Key: SPARK-48179
> URL: https://issues.apache.org/jira/browse/SPARK-48179
> Project: Spark
>  Issue Type: Bug
>  Components: Project Infra
>Affects Versions: 3.5.2
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Created] (SPARK-48179) Pin `nbsphinx` to `0.9.3`

2024-05-07 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48179:
-

 Summary:  Pin `nbsphinx` to `0.9.3`
 Key: SPARK-48179
 URL: https://issues.apache.org/jira/browse/SPARK-48179
 Project: Spark
  Issue Type: Bug
  Components: Project Infra
Affects Versions: 3.5.2
Reporter: Dongjoon Hyun









Re: [DISCUSS] Spark 4.0.0 release

2024-05-07 Thread Dongjoon Hyun
Thank you so much for the update, Wenchen!

Dongjoon.

On Tue, May 7, 2024 at 10:49 AM Wenchen Fan  wrote:

> UPDATE:
>
> Unfortunately, it took me quite some time to set up my laptop and get it
> ready for the release process (docker desktop doesn't work anymore, my pgp
> key is lost, etc.). I'll start the RC process tomorrow, my time. Thanks for
> your patience!
>
> Wenchen
>
> On Fri, May 3, 2024 at 7:47 AM yangjie01  wrote:
>
>> +1
>>
>>
>>
>> *From:* Jungtaek Lim 
>> *Date:* Thursday, May 2, 2024, 10:21
>> *To:* Holden Karau 
>> *Cc:* Chao Sun , Xiao Li ,
>> Tathagata Das , Wenchen Fan <
>> cloud0...@gmail.com>, Cheng Pan , Nicholas Chammas <
>> nicholas.cham...@gmail.com>, Dongjoon Hyun ,
>> Cheng Pan , Spark dev list ,
>> Anish Shrigondekar 
>> *Subject:* Re: [DISCUSS] Spark 4.0.0 release
>>
>>
>>
>> +1 love to see it!
>>
>>
>>
>> On Thu, May 2, 2024 at 10:08 AM Holden Karau 
>> wrote:
>>
>> +1 :) yay previews
>>
>>
>>
>> On Wed, May 1, 2024 at 5:36 PM Chao Sun  wrote:
>>
>> +1
>>
>>
>>
>> On Wed, May 1, 2024 at 5:23 PM Xiao Li  wrote:
>>
>> +1 for next Monday.
>>
>>
>>
>> We can do more previews when the other features are ready for preview.
>>
>>
>>
>> Tathagata Das wrote on Wednesday, May 1, 2024 at 08:46:
>>
>> Next week sounds great! Thank you Wenchen!
>>
>>
>>
>> On Wed, May 1, 2024 at 11:16 AM Wenchen Fan  wrote:
>>
>> Yea I think a preview release won't hurt (without a branch cut). We don't
>> need to wait for all the ongoing projects to be ready. How about we do a
>> 4.0 preview release based on the current master branch next Monday?
>>
>>
>>
>> On Wed, May 1, 2024 at 11:06 PM Tathagata Das <
>> tathagata.das1...@gmail.com> wrote:
>>
>> Hey all,
>>
>>
>>
>> Reviving this thread, but Spark master has already accumulated a huge
>> amount of changes.  As a downstream project maintainer, I want to really
>> start testing the new features and other breaking changes, and it's hard to
>> do that without a Preview release. So the sooner we make a Preview release,
>> the faster we can start getting feedback for fixing things for a great
>> Spark 4.0 final release.
>>
>>
>>
>> So I urge the community to produce a Spark 4.0 Preview soon even if
>> certain features targeting the Delta 4.0 release are still incomplete.
>>
>>
>>
>> Thanks!
>>
>>
>>
>>
>>
>> On Wed, Apr 17, 2024 at 8:35 AM Wenchen Fan  wrote:
>>
>> Thank you all for the replies!
>>
>>
>>
>> To @Nicholas Chammas  : Thanks for cleaning
>> up the error terminology and documentation! I've merged the first PR and
>> let's finish others before the 4.0 release.
>>
>> To @Dongjoon Hyun  : Thanks for driving the
>> ANSI on by default effort! Now the vote has passed, let's flip the config
>> and finish the DataFrame error context feature before 4.0.
>>
>> To @Jungtaek Lim  : Ack. We can treat the
>> Streaming state store data source as completed for 4.0 then.
>>
>> To @Cheng Pan  : Yea we definitely should have a
>> preview release. Let's collect more feedback on the ongoing projects and
>> then we can propose a date for the preview release.
>>
>>
>>
>> On Wed, Apr 17, 2024 at 1:22 PM Cheng Pan  wrote:
>>
>> Will we have a preview release for 4.0.0 like we did for 2.0.0 and 3.0.0?
>>
>> Thanks,
>> Cheng Pan
>>
>>
>> > On Apr 15, 2024, at 09:58, Jungtaek Lim 
>> wrote:
>> >
>> > W.r.t. state data source - reader (SPARK-45511), there are several
>> follow-up tickets, but we don't plan to address them soon. The current
>> implementation is the final shape for Spark 4.0.0, unless there are demands
>> on the follow-up tickets.
>> >
>> > We may want to check the plan for transformWithState - my understanding
>> is that we want to release the feature to 4.0.0, but there are several
>> remaining works to be done. While the tentative timeline for releasing is
>> June 2024, what would be the tentative timeline for the RC cut?
>> > (cc. Anish to add more context on the plan for transformWithState)
>> >
>> > On Sat, Apr 13, 2024 at 3:15 AM Wenchen Fan 
>> wrote:
>> > Hi all,
>> >
>> > It's close to the previously proposed 4.0.0 release date (June 2024),
>> and I think it's time to prepare for it and 

[jira] [Created] (SPARK-48178) Run `build/scala-211/java-11-17` jobs of branch-3.5 only if needed

2024-05-07 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48178:
-

 Summary: Run `build/scala-211/java-11-17` jobs of branch-3.5 only 
if needed
 Key: SPARK-48178
 URL: https://issues.apache.org/jira/browse/SPARK-48178
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: 3.5.2
Reporter: Dongjoon Hyun









[jira] [Updated] (SPARK-48177) Upgrade `Parquet` to 1.14.0

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48177:
--
Summary: Upgrade `Parquet` to 1.14.0  (was: Bump Parquet to 1.14.0)

> Upgrade `Parquet` to 1.14.0
> ---
>
> Key: SPARK-48177
> URL: https://issues.apache.org/jira/browse/SPARK-48177
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-48177) Bump Parquet to 1.14.0

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48177:
--
Affects Version/s: 4.0.0
   (was: 3.5.2)

> Bump Parquet to 1.14.0
> --
>
> Key: SPARK-48177
> URL: https://issues.apache.org/jira/browse/SPARK-48177
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48177) Bump Parquet to 1.14.0

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48177:
-

Assignee: Fokko Driesprong

> Bump Parquet to 1.14.0
> --
>
> Key: SPARK-48177
> URL: https://issues.apache.org/jira/browse/SPARK-48177
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Fokko Driesprong
>Assignee: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-48177) Bump Parquet to 1.14.0

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48177:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Improvement)

> Bump Parquet to 1.14.0
> --
>
> Key: SPARK-48177
> URL: https://issues.apache.org/jira/browse/SPARK-48177
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Updated] (SPARK-48177) Bump Parquet to 1.14.0

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48177:
--
Fix Version/s: (was: 4.0.0)

> Bump Parquet to 1.14.0
> --
>
> Key: SPARK-48177
> URL: https://issues.apache.org/jira/browse/SPARK-48177
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: Fokko Driesprong
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Created] (ORC-1709) Upgrade GitHub Action `setup-java` to v4 and use built-in cache feature

2024-05-07 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created ORC-1709:
--

 Summary: Upgrade GitHub Action `setup-java` to v4 and use built-in 
cache feature
 Key: ORC-1709
 URL: https://issues.apache.org/jira/browse/ORC-1709
 Project: ORC
  Issue Type: Task
  Components: Infra
Affects Versions: 2.1.0
Reporter: Dongjoon Hyun
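A minimal sketch of what the upgrade typically looks like (the step list, JDK version, and build command are illustrative):

{code:yaml}
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-java@v4
    with:
      distribution: temurin
      java-version: 17      # illustrative JDK version
      cache: maven          # built-in cache replaces a separate actions/cache step
  - run: ./mvnw -B verify   # illustrative build command
{code}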








[jira] [Resolved] (SPARK-48037) SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48037.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46273
[https://github.com/apache/spark/pull/46273]

> SortShuffleWriter lacks shuffle write related metrics resulting in 
> potentially inaccurate data
> --
>
> Key: SPARK-48037
> URL: https://issues.apache.org/jira/browse/SPARK-48037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 3.3.0, 4.0.0, 3.5.1, 3.4.3
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Blocker
>  Labels: correctness, pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Commented] (SPARK-48037) SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data

2024-05-07 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-48037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844388#comment-17844388
 ] 

Dongjoon Hyun commented on SPARK-48037:
---

Thank you, [~dzcxzl]. I raised the priority to `Blocker` for all future 
releases and added a label, `correctness`.

> SortShuffleWriter lacks shuffle write related metrics resulting in 
> potentially inaccurate data
> --
>
> Key: SPARK-48037
> URL: https://issues.apache.org/jira/browse/SPARK-48037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 3.3.0, 4.0.0, 3.5.1, 3.4.3
>Reporter: dzcxzl
>Priority: Blocker
>  Labels: correctness
>







[jira] [Assigned] (SPARK-48037) SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48037:
-

Assignee: dzcxzl

> SortShuffleWriter lacks shuffle write related metrics resulting in 
> potentially inaccurate data
> --
>
> Key: SPARK-48037
> URL: https://issues.apache.org/jira/browse/SPARK-48037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 3.3.0, 4.0.0, 3.5.1, 3.4.3
>Reporter: dzcxzl
>Assignee: dzcxzl
>Priority: Blocker
>  Labels: correctness
>







[jira] [Updated] (SPARK-48037) SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48037:
--
Affects Version/s: 3.4.3
   3.5.1
   4.0.0

> SortShuffleWriter lacks shuffle write related metrics resulting in 
> potentially inaccurate data
> --
>
> Key: SPARK-48037
> URL: https://issues.apache.org/jira/browse/SPARK-48037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 3.3.0, 4.0.0, 3.5.1, 3.4.3
>Reporter: dzcxzl
>Priority: Blocker
>  Labels: correctness
>







[jira] [Updated] (SPARK-48037) SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48037:
--
Target Version/s: 4.0.0, 3.5.2, 3.4.4

> SortShuffleWriter lacks shuffle write related metrics resulting in 
> potentially inaccurate data
> --
>
> Key: SPARK-48037
> URL: https://issues.apache.org/jira/browse/SPARK-48037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 3.3.0, 4.0.0, 3.5.1, 3.4.3
>Reporter: dzcxzl
>Priority: Blocker
>  Labels: correctness
>







[jira] [Updated] (SPARK-48037) SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48037:
--
Labels: correctness  (was: pull-request-available)

> SortShuffleWriter lacks shuffle write related metrics resulting in 
> potentially inaccurate data
> --
>
> Key: SPARK-48037
> URL: https://issues.apache.org/jira/browse/SPARK-48037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 3.3.0
>Reporter: dzcxzl
>Priority: Major
>  Labels: correctness
>







[jira] [Updated] (SPARK-48037) SortShuffleWriter lacks shuffle write related metrics resulting in potentially inaccurate data

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48037:
--
Priority: Blocker  (was: Major)

> SortShuffleWriter lacks shuffle write related metrics resulting in 
> potentially inaccurate data
> --
>
> Key: SPARK-48037
> URL: https://issues.apache.org/jira/browse/SPARK-48037
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 3.3.0
>Reporter: dzcxzl
>Priority: Blocker
>  Labels: correctness
>







[jira] [Resolved] (SPARK-41547) Reenable ANSI mode in pyspark.sql.tests.connect.test_connect_functions

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-41547.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46432
[https://github.com/apache/spark/pull/46432]

> Reenable ANSI mode in pyspark.sql.tests.connect.test_connect_functions
> --
>
> Key: SPARK-41547
> URL: https://issues.apache.org/jira/browse/SPARK-41547
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Tests
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Assignee: Xinrong Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> See https://issues.apache.org/jira/browse/SPARK-41548
> We should fix the tests.






[jira] [Resolved] (SPARK-48169) Use lazy BadRecordException cause for StaxXmlParser and JacksonParser

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48169.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46438
[https://github.com/apache/spark/pull/46438]

> Use lazy BadRecordException cause for StaxXmlParser and JacksonParser
> -
>
> Key: SPARK-48169
> URL: https://issues.apache.org/jira/browse/SPARK-48169
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Vladimir Golubev
>Assignee: Vladimir Golubev
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> For now since the https://issues.apache.org/jira/browse/SPARK-48143, the old 
> constructor is used






[jira] [Resolved] (SPARK-48165) Update `ap-loader` to 3.0-9

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48165.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46427
[https://github.com/apache/spark/pull/46427]

> Update `ap-loader` to 3.0-9
> ---
>
> Key: SPARK-48165
> URL: https://issues.apache.org/jira/browse/SPARK-48165
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 4.0.0
>Reporter: BingKun Pan
>Assignee: BingKun Pan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Resolved] (SPARK-48173) CheckAnalysis should see the entire query plan

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48173.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46439
[https://github.com/apache/spark/pull/46439]

> CheckAnalysis should see the entire query plan
> -
>
> Key: SPARK-48173
> URL: https://issues.apache.org/jira/browse/SPARK-48173
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>







[jira] [Assigned] (SPARK-48173) CheckAnalysis should see the entire query plan

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48173:
-

Assignee: Wenchen Fan

> CheckAnalysis should see the entire query plan
> -
>
> Key: SPARK-48173
> URL: https://issues.apache.org/jira/browse/SPARK-48173
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Assigned] (SPARK-48171) Clean up the use of deprecated APIs related to `o.rocksdb.Logger`

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48171:
-

Assignee: Yang Jie

> Clean up the use of deprecated APIs related to `o.rocksdb.Logger`
> -
>
> Key: SPARK-48171
> URL: https://issues.apache.org/jira/browse/SPARK-48171
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> /**
>  * AbstractLogger constructor.
>  *
>  * Important: the log level set within
>  * the {@link org.rocksdb.Options} instance will be used as
>  * maximum log level of RocksDB.
>  *
>  * @param options {@link org.rocksdb.Options} instance.
>  *
>  * @deprecated Use {@link Logger#Logger(InfoLogLevel)} instead, e.g. {@code 
> new
>  * Logger(options.infoLogLevel())}.
>  */
> @Deprecated
> public Logger(final Options options) {
>   this(options.infoLogLevel());
> } {code}






[jira] [Resolved] (SPARK-48171) Clean up the use of deprecated APIs related to `o.rocksdb.Logger`

2024-05-07 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48171.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46436
[https://github.com/apache/spark/pull/46436]

> Clean up the use of deprecated APIs related to `o.rocksdb.Logger`
> -
>
> Key: SPARK-48171
> URL: https://issues.apache.org/jira/browse/SPARK-48171
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code:java}
> /**
>  * AbstractLogger constructor.
>  *
>  * Important: the log level set within
>  * the {@link org.rocksdb.Options} instance will be used as
>  * maximum log level of RocksDB.
>  *
>  * @param options {@link org.rocksdb.Options} instance.
>  *
>  * @deprecated Use {@link Logger#Logger(InfoLogLevel)} instead, e.g. {@code 
> new
>  * Logger(options.infoLogLevel())}.
>  */
> @Deprecated
> public Logger(final Options options) {
>   this(options.infoLogLevel());
> } {code}






[jira] [Resolved] (SPARK-48163) Disable `SparkConnectServiceSuite.SPARK-43923: commands send events - get_resources_command`

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48163.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46425
[https://github.com/apache/spark/pull/46425]

> Disable `SparkConnectServiceSuite.SPARK-43923: commands send events - 
> get_resources_command`
> 
>
> Key: SPARK-48163
> URL: https://issues.apache.org/jira/browse/SPARK-48163
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> {code}
> - SPARK-43923: commands send events ((get_resources_command {
> [info] }
> [info] ,None)) *** FAILED *** (35 milliseconds)
> [info]   VerifyEvents.this.listener.executeHolder.isDefined was false 
> (SparkConnectServiceSuite.scala:873)
> {code}






[jira] [Updated] (SPARK-48139) Re-enable `SparkSessionE2ESuite.interrupt tag`

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48139:
--
Parent: SPARK-44111
Issue Type: Sub-task  (was: Bug)

> Re-enable `SparkSessionE2ESuite.interrupt tag`
> --
>
> Key: SPARK-48139
> URL: https://issues.apache.org/jira/browse/SPARK-48139
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Blocker
>







[jira] [Updated] (SPARK-48164) Re-enable `SparkConnectServiceSuite.SPARK-43923: commands send events - get_resources_command`

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48164:
--
Component/s: Tests

> Re-enable `SparkConnectServiceSuite.SPARK-43923: commands send events - 
> get_resources_command`
> --
>
> Key: SPARK-48164
> URL: https://issues.apache.org/jira/browse/SPARK-48164
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Blocker
>







[jira] [Updated] (SPARK-48164) Re-enable `SparkConnectServiceSuite.SPARK-43923: commands send events - get_resources_command`

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48164:
--
Target Version/s: 4.0.0

> Re-enable `SparkConnectServiceSuite.SPARK-43923: commands send events - 
> get_resources_command`
> --
>
> Key: SPARK-48164
> URL: https://issues.apache.org/jira/browse/SPARK-48164
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Blocker
>







[jira] [Assigned] (SPARK-48163) Disable `SparkConnectServiceSuite.SPARK-43923: commands send events - get_resources_command`

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48163:
-

Assignee: Dongjoon Hyun

> Disable `SparkConnectServiceSuite.SPARK-43923: commands send events - 
> get_resources_command`
> 
>
> Key: SPARK-48163
> URL: https://issues.apache.org/jira/browse/SPARK-48163
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>
> {code}
> - SPARK-43923: commands send events ((get_resources_command {
> [info] }
> [info] ,None)) *** FAILED *** (35 milliseconds)
> [info]   VerifyEvents.this.listener.executeHolder.isDefined was false 
> (SparkConnectServiceSuite.scala:873)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-48164) Re-enable `SparkConnectServiceSuite.SPARK-43923: commands send events - get_resources_command`

2024-05-06 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48164:
-

 Summary: Re-enable `SparkConnectServiceSuite.SPARK-43923: commands 
send events - get_resources_command`
 Key: SPARK-48164
 URL: https://issues.apache.org/jira/browse/SPARK-48164
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-48163) Disable `SparkConnectServiceSuite.SPARK-43923: commands send events - get_resources_command`

2024-05-06 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48163:
-

 Summary: Disable `SparkConnectServiceSuite.SPARK-43923: commands 
send events - get_resources_command`
 Key: SPARK-48163
 URL: https://issues.apache.org/jira/browse/SPARK-48163
 Project: Spark
  Issue Type: Sub-task
  Components: SQL, Tests
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun


{code}
- SPARK-43923: commands send events ((get_resources_command {
[info] }
[info] ,None)) *** FAILED *** (35 milliseconds)
[info]   VerifyEvents.this.listener.executeHolder.isDefined was false 
(SparkConnectServiceSuite.scala:873)
{code}
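
For reference, disabling a flaky case like this in a ScalaTest suite usually amounts to switching `test(...)` to `ignore(...)` so the body keeps compiling but is skipped. A minimal sketch of that pattern (the suite and body below are illustrative placeholders, not the actual SparkConnectServiceSuite code):

{code:scala}
import org.scalatest.funsuite.AnyFunSuite

// Sketch only: the usual ScalaTest way to disable a flaky case.
class FlakyCaseSketch extends AnyFunSuite {
  // `ignore` keeps the test body compiling but skips it at runtime;
  // re-enabling it later is just a matter of switching back to `test`.
  ignore("SPARK-43923: commands send events - get_resources_command") {
    // original assertions would stay here unchanged
  }
}
{code}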



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48141) Update the Oracle docker image version used for test and integration to use Oracle Database 23ai Free

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48141.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46399
[https://github.com/apache/spark/pull/46399]

> Update the Oracle docker image version used for test and integration to use 
> Oracle Database 23ai Free
> -
>
> Key: SPARK-48141
> URL: https://issues.apache.org/jira/browse/SPARK-48141
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Luca Canali
>Assignee: Luca Canali
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Oracle recently released Oracle Database 23ai Free, specifically version 
> 23.4, as their latest free database version.
> We should update our testing infrastructure to utilize this free version, 
> using the Docker image available at 
> [https://github.com/gvenzl/oci-oracle-free].
> This repository is known for being a reliable and well-maintained source for 
> Oracle Database images.
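
A minimal sketch of how a docker-based JDBC integration suite can pick up the new image while still allowing CI to override it; the environment-variable name and default tag below are assumptions for illustration, not taken from the actual Spark test code:

{code:scala}
// Sketch only: resolve the Oracle docker image for an integration test run,
// with an environment-variable override for CI.
object OracleImageSketch {
  val image: String =
    sys.env.getOrElse("ORACLE_DOCKER_IMAGE_NAME", "gvenzl/oracle-free:23.4-slim")

  def main(args: Array[String]): Unit =
    println(s"Using Oracle docker image: $image")
}
{code}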



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-48141) Update the Oracle docker image version used for test and integration to use Oracle Database 23ai Free

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48141:
-

Assignee: Luca Canali

> Update the Oracle docker image version used for test and integration to use 
> Oracle Database 23ai Free
> -
>
> Key: SPARK-48141
> URL: https://issues.apache.org/jira/browse/SPARK-48141
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests
>Affects Versions: 4.0.0
>Reporter: Luca Canali
>Assignee: Luca Canali
>Priority: Minor
>  Labels: pull-request-available
>
> Oracle recently released Oracle Database 23ai Free, specifically version 
> 23.4, as their latest free database version.
> We should update our testing infrastructure to utilize this free version, 
> using the Docker image available at 
> [https://github.com/gvenzl/oci-oracle-free].
> This repository is known for being a reliable and well-maintained source for 
> Oracle Database images.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48150) Fix nullability of try_parse_json

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48150.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46409
[https://github.com/apache/spark/pull/46409]

> Fix nullability of try_parse_json
> -
>
> Key: SPARK-48150
> URL: https://issues.apache.org/jira/browse/SPARK-48150
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 4.0.0
>Reporter: Josh Rosen
>Assignee: Josh Rosen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Followup for SPARK-47922: `try_parse_json` must declare a nullable output.
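
The nullability requirement follows directly from the function's behaviour: `try_parse_json` returns NULL instead of raising on malformed input, so its output schema has to be declared nullable. A small sketch against a 4.0 snapshot session (column aliases are arbitrary):

{code:scala}
import org.apache.spark.sql.SparkSession

// Sketch only: malformed input yields NULL, hence the nullable output schema.
object TryParseJsonSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("sketch").getOrCreate()
    val df = spark.sql(
      "SELECT try_parse_json('{\"a\": 1}') AS ok, try_parse_json('not json') AS bad")
    df.printSchema()            // with the fix, both columns report nullable = true
    df.show(truncate = false)   // the `bad` column is NULL rather than an error
    spark.stop()
  }
}
{code}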



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48153) Run `build` job of `build_and_test.yml` only if needed

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48153.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46412
[https://github.com/apache/spark/pull/46412]

> Run `build` job of `build_and_test.yml` only if needed
> --
>
> Key: SPARK-48153
> URL: https://issues.apache.org/jira/browse/SPARK-48153
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48153) Run `build` job of `build_and_test.yml` only if needed

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48153:
--
Summary: Run `build` job of `build_and_test.yml` only if needed  (was: Run 
`build` job only if needed)

> Run `build` job of `build_and_test.yml` only if needed
> --
>
> Key: SPARK-48153
> URL: https://issues.apache.org/jira/browse/SPARK-48153
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-48151) `build_and_test.yml` should use `Volcano` 1.7.0 for `branch-3.4/3.5`

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48151:
-

Assignee: Dongjoon Hyun

> `build_and_test.yml` should use `Volcano` 1.7.0 for `branch-3.4/3.5`
> 
>
> Key: SPARK-48151
> URL: https://issues.apache.org/jira/browse/SPARK-48151
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48151) `build_and_test.yml` should use `Volcano` 1.7.0 for `branch-3.4/3.5`

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48151.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46410
[https://github.com/apache/spark/pull/46410]

> `build_and_test.yml` should use `Volcano` 1.7.0 for `branch-3.4/3.5`
> 
>
> Key: SPARK-48151
> URL: https://issues.apache.org/jira/browse/SPARK-48151
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-48151) `build_and_test.yml` should use `Volcano` 1.7.0 for `branch-3.4/3.5`

2024-05-06 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48151:
-

 Summary: `build_and_test.yml` should use `Volcano` 1.7.0 for 
`branch-3.4/3.5`
 Key: SPARK-48151
 URL: https://issues.apache.org/jira/browse/SPARK-48151
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48149) Serialize `build_python.yml` to run a single Python version per cron schedule

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48149.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46407
[https://github.com/apache/spark/pull/46407]

> Serialize `build_python.yml` to run a single Python version per cron schedule
> -
>
> Key: SPARK-48149
> URL: https://issues.apache.org/jira/browse/SPARK-48149
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-48149) Serialize `build_python.yml` to run a single Python version per cron schedule

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-48149:
-

Assignee: Dongjoon Hyun

> Serialize `build_python.yml` to run a single Python version per cron schedule
> -
>
> Key: SPARK-48149
> URL: https://issues.apache.org/jira/browse/SPARK-48149
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-48149) Serialize `build_python.yml` to run a single Python version per cron schedule

2024-05-06 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48149:
-

 Summary: Serialize `build_python.yml` to run a single Python 
version per cron schedule
 Key: SPARK-48149
 URL: https://issues.apache.org/jira/browse/SPARK-48149
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48145) Remove logDebug and logTrace with MDC in java structured logging framework

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48145.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46405
[https://github.com/apache/spark/pull/46405]

> Remove logDebug and logTrace with MDC in java structured logging framework
> --
>
> Key: SPARK-48145
> URL: https://issues.apache.org/jira/browse/SPARK-48145
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (HIVE-25468) Create/Drop functions should be authorized in HMS

2024-05-06 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated HIVE-25468:
-
Fix Version/s: 4.0.0
   3.1.3

> Create/Drop functions should be authorized in HMS
> -
>
> Key: HIVE-25468
> URL: https://issues.apache.org/jira/browse/HIVE-25468
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available, release-3.1.3
> Fix For: 3.1.3, 4.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> CREATE FUNCTION func_name AS 'org.someclass' USING JAR '/path_to_jar';
> DROP FUNCTION func_name;
> These commands are currently authorized in HS2 but not in the Hive Metastore. 
> They should also be authorized in HMS for clients (e.g. spark-shell) acting on 
> behalf of the end user.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] Apache Hive 2.3.10 Release Candidate 1

2024-05-06 Thread Dongjoon Hyun
+1 (non-binding)

Thank you so much all!

Dongjoon.

On 2024/05/06 13:57:09 Cheng Pan wrote:
> +1 (non-binding)
> 
> Pass integration test with Apache Spark[1] and Apache Kyuubi[2].
> 
> [1] https://github.com/apache/spark/pull/45372
> [2] https://github.com/apache/kyuubi/pull/6328
> 
> Thanks,
> Cheng Pan
> 
> 
> 


Re: Orc 1.9.3 - changes around precision of decimal numbers?

2024-05-06 Thread Dongjoon Hyun
Your PR seems to have irrelevant change.

For example, upgrading `net.sf.opencsv:opencsv:2.3` to 
`com.opencsv:opencsv:5.9`.

Could you remove `opencsv` change first to narrow down?

Dongjoon.

On 2024/05/06 14:55:20 Dmitriy Fingerman wrote:
> Hello ORC Devs,
> 
> I am working on upgrading the Orc version in Hive from 1.8.5 to 1.9.3.
> (To 1.9.3 and not to 2.0.0 because Hive still supports Java 8 and Orc 2.0.0
> doesn't).
> Link to the Hive pull request: https://github.com/apache/hive/pull/5218
> One of the tests in this Hive PR is failing because some queries' results
> have changed.
> The differences are in the fractional parts of the results of Hive's
> functions STDDEV_POP and STDDEV_SAMP.
> (These functions are implemented with operations that include sqrt, sum,
> division, multiplication, count, etc).
> Here is the link to the differences with and without the upgraded Orc
> version:
> https://github.com/apache/hive/pull/5218/files#diff-7c779589551c5224644bfe786d1f03a5e3aa18b219b28ae18f89fffea01ef483
> 
> Can you please advise if Orc had some changes around precision that could
> explain these differences in query results?
> 
> Thanks,
> Dmitriy Fingerman
> 


Re: ORC 2.0.1 Release

2024-05-05 Thread Dongjoon Hyun
Thank you so much, William.

It would be helpful for preparing Apache Spark 4.0.0 release too.

Dongjoon.


On Sun, May 5, 2024 at 9:19 PM Gang Wu  wrote:

> Thanks William!
>
> There are some fixes on the C++ side waiting for the release.
>
> Best,
> Gang
>
> On Mon, May 6, 2024 at 12:12 PM William H.  wrote:
>
> > Hey All!
> >
> > The scheduled date for ORC 2.0.1 is just around the corner on the 17th.
> > I will be volunteering as the release manager for this release.
> >
> > https://github.com/apache/orc/milestone/29
> >
> > Bests,
> > William
> >
>


[jira] [Updated] (SPARK-48138) Disable a flaky `SparkSessionE2ESuite.interrupt tag` test

2024-05-05 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48138:
--
Description: 
- https://github.com/apache/spark/actions/runs/8962353911/job/24611130573 
(Master, 5/5)
- https://github.com/apache/spark/actions/runs/8948176536/job/24581022674 
(Master, 5/4)

> Disable a flaky `SparkSessionE2ESuite.interrupt tag` test
> -
>
> Key: SPARK-48138
> URL: https://issues.apache.org/jira/browse/SPARK-48138
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>
> - https://github.com/apache/spark/actions/runs/8962353911/job/24611130573 
> (Master, 5/5)
> - https://github.com/apache/spark/actions/runs/8948176536/job/24581022674 
> (Master, 5/4)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48139) Re-enable `SparkSessionE2ESuite.interrupt tag`

2024-05-05 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48139:
--
Description: (was: - 
https://github.com/apache/spark/actions/runs/8962353911/job/24611130573 
(Master, 5/5)
- https://github.com/apache/spark/actions/runs/8948176536/job/24581022674 
(Master, 5/4))

> Re-enable `SparkSessionE2ESuite.interrupt tag`
> --
>
> Key: SPARK-48139
> URL: https://issues.apache.org/jira/browse/SPARK-48139
> Project: Spark
>  Issue Type: Bug
>  Components: Connect, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Blocker
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48139) Re-enable `SparkSessionE2ESuite.interrupt tag`

2024-05-05 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48139:
--
Description: 
- https://github.com/apache/spark/actions/runs/8962353911/job/24611130573 
(Master, 5/5)
- https://github.com/apache/spark/actions/runs/8948176536/job/24581022674 
(Master, 5/4)

> Re-enable `SparkSessionE2ESuite.interrupt tag`
> --
>
> Key: SPARK-48139
> URL: https://issues.apache.org/jira/browse/SPARK-48139
> Project: Spark
>  Issue Type: Bug
>  Components: Connect, Tests
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Blocker
>
> - https://github.com/apache/spark/actions/runs/8962353911/job/24611130573 
> (Master, 5/5)
> - https://github.com/apache/spark/actions/runs/8948176536/job/24581022674 
> (Master, 5/4)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-48138) Disable a flaky `SparkSessionE2ESuite.interrupt tag` test

2024-05-05 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48138:
-

 Summary: Disable a flaky `SparkSessionE2ESuite.interrupt tag` test
 Key: SPARK-48138
 URL: https://issues.apache.org/jira/browse/SPARK-48138
 Project: Spark
  Issue Type: Sub-task
  Components: Connect, Tests
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48136) Always upload Spark Connect log files

2024-05-05 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48136.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46393
[https://github.com/apache/spark/pull/46393]

> Always upload Spark Connect log files
> -
>
> Key: SPARK-48136
> URL: https://issues.apache.org/jira/browse/SPARK-48136
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect, Project Infra, PySpark
>Affects Versions: 4.0.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> We should always upload the log files when the run is not successful



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: ASF board report draft for May

2024-05-05 Thread Dongjoon Hyun
+1 for Holden's comment. Yes, it would be great to mention `it` as "soon".
(If Wenchen release it on Monday, we can simply mention the release)

In addition, Apache Spark PMC received an official notice from ASF Infra
team.

https://lists.apache.org/thread/rgy1cg17tkd3yox7qfq87ht12sqclkbg
> [NOTICE] Apache Spark's GitHub Actions usage exceeds allowances for ASF
projects

To track and comply with the new ASF Infra Policy as much as possible, we
opened a blocker-level JIRA issue and have been working on it.
- https://infra.apache.org/github-actions-policy.html

Please include a sentence that Apache Spark PMC is working on under the
following umbrella JIRA issue.

https://issues.apache.org/jira/browse/SPARK-48094
> Reduce GitHub Action usage according to ASF project allowance

Thanks,
Dongjoon.


On Sun, May 5, 2024 at 3:45 PM Holden Karau  wrote:

> Do we want to include that we’re planning on having a preview release of
> Spark 4 so folks can see the APIs “soon”?
>
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
>
> On Sun, May 5, 2024 at 3:24 PM Matei Zaharia 
> wrote:
>
>> It’s time for our quarterly ASF board report on Apache Spark this
>> Wednesday. Here’s a draft, feel free to suggest changes.
>>
>> 
>>
>> Description:
>>
>> Apache Spark is a fast and general purpose engine for large-scale data
>> processing. It offers high-level APIs in Java, Scala, Python, R and SQL as
>> well as a rich set of libraries including stream processing, machine
>> learning, and graph analytics.
>>
>> Issues for the board:
>>
>> - None
>>
>> Project status:
>>
>> - We made two patch releases: Spark 3.5.1 on February 28, 2024, and Spark
>> 3.4.2 on April 18, 2024.
>> - The votes on "SPIP: Structured Logging Framework for Apache Spark" and
>> "Pure Python Package in PyPI (Spark Connect)" have passed.
>> - The votes for two behavior changes have passed: "SPARK-4: Use ANSI
>> SQL mode by default" and "SPARK-46122: Set
>> spark.sql.legacy.createHiveTableByDefault to false".
>> - The community decided that upcoming Spark 4.0 release will drop support
>> for Python 3.8.
>> - We started a discussion about the definition of behavior changes that
>> is critical for version upgrades and user experience.
>> - We've opened a dedicated repository for the Spark Kubernetes Operator
>> at https://github.com/apache/spark-kubernetes-operator. We added a new
>> version in Apache Spark JIRA for versioning of the Spark operator based on
>> a vote result.
>>
>> Trademarks:
>>
>> - No changes since the last report.
>>
>> Latest releases:
>> - Spark 3.4.3 was released on April 18, 2024
>> - Spark 3.5.1 was released on February 28, 2024
>> - Spark 3.3.4 was released on December 16, 2023
>>
>> Committers and PMC:
>>
>> - The latest committer was added on Oct 2nd, 2023 (Jiaan Geng).
>> - The latest PMC members were added on Oct 2nd, 2023 (Yuanjian Li and
>> Yikun Jiang).
>>
>> 
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>


[jira] [Updated] (SPARK-48135) Run `buf` and `ui` only in PR builders and Java 21 Daily CI

2024-05-05 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48135:
--
Summary: Run `buf` and `ui` only in PR builders and Java 21 Daily CI  (was: 
Run `but` and `ui` only in PR builders and Java 21 Daily CI)

> Run `buf` and `ui` only in PR builders and Java 21 Daily CI
> ---
>
> Key: SPARK-48135
> URL: https://issues.apache.org/jira/browse/SPARK-48135
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48132) Run `k8s-integration-tests` only in PR builder and Daily CIs

2024-05-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48132:
--
Summary: Run `k8s-integration-tests` only in PR builder and Daily CIs  
(was: Run `k8s-integration-tests` in PR builder and Daily CIs)

> Run `k8s-integration-tests` only in PR builder and Daily CIs
> 
>
> Key: SPARK-48132
> URL: https://issues.apache.org/jira/browse/SPARK-48132
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48116) Run `pyspark-pandas*` only in PR builder and Daily Python CIs

2024-05-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48116:
--
Summary: Run `pyspark-pandas*` only in PR builder and Daily Python CIs  
(was: Run `pyspark-pandas*` in PR builder and Daily Python CIs)

> Run `pyspark-pandas*` only in PR builder and Daily Python CIs
> -
>
> Key: SPARK-48116
> URL: https://issues.apache.org/jira/browse/SPARK-48116
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48094) Reduce GitHub Action usage according to ASF project allowance

2024-05-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48094:
--
Description: 
h2. ASF INFRA POLICY
- https://infra.apache.org/github-actions-policy.html

h2. MONITORING
- https://infra-reports.apache.org/#ghactions=spark=168

 !Screenshot 2024-05-02 at 23.56.05.png|width=100%! 

h2. TARGET
* All workflows MUST have a job concurrency level less than or equal to 20. 
This means a workflow cannot have more than 20 jobs running at the same time 
across all matrices.
* All workflows SHOULD have a job concurrency level less than or equal to 15. 
Just because 20 is the max, doesn't mean you should strive for 20.
* The average number of minutes a project uses per calendar week MUST NOT 
exceed the equivalent of 25 full-time runners (250,000 minutes, or 4,200 hours).
* The average number of minutes a project uses in any consecutive five-day 
period MUST NOT exceed the equivalent of 30 full-time runners (216,000 minutes, 
or 3,600 hours).

h2. DEADLINE
bq. 17th of May, 2024

Since the deadline is 17th of May, 2024, I set this as the highest priority, 
`Blocker`.



  was:
h2. ASF INFRA POLICY
- https://infra.apache.org/github-actions-policy.html

h2. MONITORING
[https://infra-reports.apache.org/#ghactions=spark=168]

 !Screenshot 2024-05-02 at 23.56.05.png|width=100%! 

h2. TARGET
* All workflows MUST have a job concurrency level less than or equal to 20. 
This means a workflow cannot have more than 20 jobs running at the same time 
across all matrices.
* All workflows SHOULD have a job concurrency level less than or equal to 15. 
Just because 20 is the max, doesn't mean you should strive for 20.
* The average number of minutes a project uses per calendar week MUST NOT 
exceed the equivalent of 25 full-time runners (250,000 minutes, or 4,200 hours).
* The average number of minutes a project uses in any consecutive five-day 
period MUST NOT exceed the equivalent of 30 full-time runners (216,000 minutes, 
or 3,600 hours).

h2. DEADLINE
bq. 17th of May, 2024

Since the deadline is 17th of May, 2024, I set this as the highest priority, 
`Blocker`.




> Reduce GitHub Action usage according to ASF project allowance
> -
>
> Key: SPARK-48094
> URL: https://issues.apache.org/jira/browse/SPARK-48094
> Project: Spark
>  Issue Type: Umbrella
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>Priority: Blocker
> Attachments: Screenshot 2024-05-02 at 23.56.05.png
>
>
> h2. ASF INFRA POLICY
> - https://infra.apache.org/github-actions-policy.html
> h2. MONITORING
> - https://infra-reports.apache.org/#ghactions=spark=168
>  !Screenshot 2024-05-02 at 23.56.05.png|width=100%! 
> h2. TARGET
> * All workflows MUST have a job concurrency level less than or equal to 20. 
> This means a workflow cannot have more than 20 jobs running at the same time 
> across all matrices.
> * All workflows SHOULD have a job concurrency level less than or equal to 15. 
> Just because 20 is the max, doesn't mean you should strive for 20.
> * The average number of minutes a project uses per calendar week MUST NOT 
> exceed the equivalent of 25 full-time runners (250,000 minutes, or 4,200 
> hours).
> * The average number of minutes a project uses in any consecutive five-day 
> period MUST NOT exceed the equivalent of 30 full-time runners (216,000 
> minutes, or 3,600 hours).
> h2. DEADLINE
> bq. 17th of May, 2024
> Since the deadline is 17th of May, 2024, I set this as the highest priority, 
> `Blocker`.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48131) Unify MDC key `mdc.taskName` and `task_name`

2024-05-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48131.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46386
[https://github.com/apache/spark/pull/46386]

> Unify MDC key `mdc.taskName` and `task_name`
> 
>
> Key: SPARK-48131
> URL: https://issues.apache.org/jira/browse/SPARK-48131
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Rename the MDC key `mdc.taskName` as `task_name`, so that it is consistent 
> with all the MDC keys used in the structured logging framework.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48128) BitwiseCount / bit_count generated code for boolean inputs fails to compile

2024-05-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48128:
--
Affects Version/s: 3.4.3
   3.3.4
   3.5.1
   3.2.4
   3.1.3

> BitwiseCount / bit_count generated code for boolean inputs fails to compile
> ---
>
> Key: SPARK-48128
> URL: https://issues.apache.org/jira/browse/SPARK-48128
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0, 3.1.3, 3.2.4, 3.5.1, 3.3.4, 3.4.3
>Reporter: Josh Rosen
>Assignee: Josh Rosen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.4
>
>
> If the `BitwiseCount` / `bit_count` expression is applied to a boolean type 
> column then it will trigger codegen fallback to interpreted because the 
> generated code contains invalid Java syntax, triggering errors like
> {code}
>  java.util.concurrent.ExecutionException: 
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 
> 41, Column 11: Failed to compile: 
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 
> 41, Column 11: Unexpected token "if" in primary
> {code}
> This problem was masked because the QueryTest framework may not be fully 
> exercising codegen paths (e.g. if constant folding occurs).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48128) BitwiseCount / bit_count generated code for boolean inputs fails to compile

2024-05-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48128:
--
Issue Type: Bug  (was: Improvement)

> BitwiseCount / bit_count generated code for boolean inputs fails to compile
> ---
>
> Key: SPARK-48128
> URL: https://issues.apache.org/jira/browse/SPARK-48128
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Josh Rosen
>Assignee: Josh Rosen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0, 3.5.2, 3.4.4
>
>
> If the `BitwiseCount` / `bit_count` expression is applied to a boolean type 
> column then it will trigger codegen fallback to interpreted because the 
> generated code contains invalid Java syntax, triggering errors like
> {code}
>  java.util.concurrent.ExecutionException: 
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 
> 41, Column 11: Failed to compile: 
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 
> 41, Column 11: Unexpected token "if" in primary
> {code}
> This problem was masked because the QueryTest framework may not be fully 
> exercising codegen paths (e.g. if constant folding occurs).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48129) Provide a constant table schema in PySpark for querying structured logs

2024-05-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48129.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46384
[https://github.com/apache/spark/pull/46384]

> Provide a constant table schema in PySpark for querying structured logs
> ---
>
> Key: SPARK-48129
> URL: https://issues.apache.org/jira/browse/SPARK-48129
> Project: Spark
>  Issue Type: Sub-task
>  Components: PySpark
>Affects Versions: 4.0.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48128) BitwiseCount / bit_count generated code for boolean inputs fails to compile

2024-05-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48128.
---
Fix Version/s: 3.4.4
   3.5.2
   4.0.0
   Resolution: Fixed

Issue resolved by pull request 46382
[https://github.com/apache/spark/pull/46382]

> BitwiseCount / bit_count generated code for boolean inputs fails to compile
> ---
>
> Key: SPARK-48128
> URL: https://issues.apache.org/jira/browse/SPARK-48128
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.0.0
>Reporter: Josh Rosen
>Assignee: Josh Rosen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.4, 3.5.2, 4.0.0
>
>
> If the `BitwiseCount` / `bit_count` expression is applied to a boolean type 
> column then it will trigger codegen fallback to interpreted because the 
> generated code contains invalid Java syntax, triggering errors like
> {code}
>  java.util.concurrent.ExecutionException: 
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 
> 41, Column 11: Failed to compile: 
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 
> 41, Column 11: Unexpected token "if" in primary
> {code}
> This problem was masked because the QueryTest framework may not be fully 
> exercising codegen paths (e.g. if constant folding occurs).
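
A small reproduction sketch (illustrative only): applying `bit_count` to a non-literal boolean column avoids constant folding, so whole-stage codegen actually has to generate code for `BitwiseCount` over a boolean input, which is where the invalid Java appeared before the fallback kicked in:

{code:scala}
import org.apache.spark.sql.SparkSession

object BitCountBooleanSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("sketch").getOrCreate()
    import spark.implicits._

    // Non-literal boolean input keeps the expression out of constant folding.
    val df = Seq(true, false, true).toDF("b").selectExpr("bit_count(b) AS bits")
    df.explain("codegen")  // inspect the generated Java for the projection
    df.show()              // results stay correct thanks to the interpreted fallback
    spark.stop()
  }
}
{code}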



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48124) Disable structured logging for Interpreter by default

2024-05-04 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48124.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46383
[https://github.com/apache/spark/pull/46383]

> Disable structured logging for Interpreter by default
> -
>
> Key: SPARK-48124
> URL: https://issues.apache.org/jira/browse/SPARK-48124
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> Since there are plain text output from 
> Interpreters(spark-shell/spark-sql/pyspark), it makes more sense to disable 
> structured logging for Interpreters by default.
>  
> spark-shell output when with structured logging enabled:
> ```
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
> setLogLevel(newLevel).
> Welcome to
>                     __
>      / __/__  ___ _/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 4.0.0-SNAPSHOT
>       /_/
>          
> Using Scala version 2.13.13 (OpenJDK 64-Bit Server VM, Java 17.0.9)
> Type in expressions to have them evaluated.
> Type :help for more information.
> {"ts":"2024-05-04T01:11:03.797Z","level":"WARN","msg":"Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable","logger":"NativeCodeLoader"}
> {"ts":"2024-05-04T01:11:04.104Z","level":"WARN","msg":"Service 'SparkUI' 
> could not bind on port 4040. Attempting port 4041.","logger":"Utils"}
> Spark context Web UI available at http://10.10.114.155:4041
> Spark context available as 'sc' (master = local[*], app id = 
> local-1714785064155).
> Spark session available as 'spark'.
> ```
>  
> spark-shell output when without structured logging enabled:
> ```
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use 
> setLogLevel(newLevel).
> Welcome to
>                     __
>      / __/__  ___ _/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 4.0.0-SNAPSHOT
>       /_/
>          
> Using Scala version 2.13.13 (OpenJDK 64-Bit Server VM, Java 17.0.9)
> Type in expressions to have them evaluated.
> Type :help for more information.
> 24/05/03 18:11:35 WARN NativeCodeLoader: Unable to load native-hadoop library 
> for your platform... using builtin-java classes where applicable
> 24/05/03 18:11:35 WARN Utils: Service 'SparkUI' could not bind on port 4040. 
> Attempting port 4041.
> Spark context Web UI available at http://10.10.114.155:4041
> Spark context available as 'sc' (master = local[*], app id = 
> local-1714785095892).
> Spark session available as 'spark'.
> ```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48127) Fix `dev/scalastyle` to check `hadoop-cloud` and `jvm-profiler` modules

2024-05-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48127.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46376
[https://github.com/apache/spark/pull/46376]

> Fix `dev/scalastyle` to check `hadoop-cloud` and `jvm-profiler` modules
> ---
>
> Key: SPARK-48127
> URL: https://issues.apache.org/jira/browse/SPARK-48127
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48116) Run `pyspark-pandas*` in PR builder and Daily Python CIs

2024-05-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48116.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46367
[https://github.com/apache/spark/pull/46367]

> Run `pyspark-pandas*` in PR builder and Daily Python CIs
> 
>
> Key: SPARK-48116
> URL: https://issues.apache.org/jira/browse/SPARK-48116
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48127) Fix `dev/scalastyle` to check `hadoop-cloud` and `jvm-profiler` modules

2024-05-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-48127:
--
Summary: Fix `dev/scalastyle` to check `hadoop-cloud` and `jvm-profiler` 
modules  (was: Fix `dev/scalastyle` to check `hadoop-cloud` and `jvm-profile` 
modules)

> Fix `dev/scalastyle` to check `hadoop-cloud` and `jvm-profiler` modules
> ---
>
> Key: SPARK-48127
> URL: https://issues.apache.org/jira/browse/SPARK-48127
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-48127) Fix `dev/scalastyle` to check `hadoop-cloud` and `jvm-profile` modules

2024-05-03 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48127:
-

 Summary: Fix `dev/scalastyle` to check `hadoop-cloud` and 
`jvm-profile` modules
 Key: SPARK-48127
 URL: https://issues.apache.org/jira/browse/SPARK-48127
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: 4.0.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48121) Promote ` KubernetesDriverConf` to `DeveloperApi`

2024-05-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48121.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46373
[https://github.com/apache/spark/pull/46373]

> Promote ` KubernetesDriverConf` to `DeveloperApi`
> -
>
> Key: SPARK-48121
> URL: https://issues.apache.org/jira/browse/SPARK-48121
> Project: Spark
>  Issue Type: Sub-task
>  Components: k8s
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: Zhou JIANG
>Assignee: Zhou JIANG
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48120) Enable autolink to SPARK jira issue

2024-05-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48120.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 11
[https://github.com/apache/spark-kubernetes-operator/pull/11]

> Enable autolink to SPARK jira issue
> ---
>
> Key: SPARK-48120
> URL: https://issues.apache.org/jira/browse/SPARK-48120
> Project: Spark
>  Issue Type: Sub-task
>  Components: Project Infra
>Affects Versions: kubernetes-operator-0.1.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48114) ErrorClassesJsonReader compiles template regex on every template resolution

2024-05-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48114.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46365
[https://github.com/apache/spark/pull/46365]

> ErrorClassesJsonReader compiles template regex on every template resolution
> ---
>
> Key: SPARK-48114
> URL: https://issues.apache.org/jira/browse/SPARK-48114
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 4.0.0
>Reporter: Vladimir Golubev
>Assignee: Vladimir Golubev
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> `SparkRuntimeException` uses `SparkThrowableHelper`, which uses 
> `ErrorClassesJsonReader` to build the error message string from templates in 
> `error-conditions.json`, but the template regex is compiled on every 
> `SparkRuntimeException` constructor invocation.
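
The usual fix pattern here is simply to hoist the compiled regex into a shared `val` so it is built once rather than on every message resolution. A generic sketch of that technique (not the actual Spark patch; the pattern and helper names are made up for illustration):

{code:scala}
import scala.util.matching.Regex

object TemplateResolverSketch {
  // Compiled once and reused, instead of recompiling inside the method
  // on every exception construction.
  private val ParamRegex: Regex = "<([a-zA-Z0-9_-]+)>".r

  def resolve(template: String, params: Map[String, String]): String =
    ParamRegex.replaceAllIn(template, m =>
      Regex.quoteReplacement(params.getOrElse(m.group(1), m.matched)))
}
{code}

For example, `TemplateResolverSketch.resolve("Cannot find column <name>.", Map("name" -> "id"))` returns "Cannot find column id.".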



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-45923) Spark Kubernetes Operator

2024-05-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-45923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-45923:
--
Affects Version/s: kubernetes-operator-0.1.0
   (was: 4.0.0)

> Spark Kubernetes Operator
> -
>
> Key: SPARK-45923
> URL: https://issues.apache.org/jira/browse/SPARK-45923
> Project: Spark
>  Issue Type: New Feature
>  Components: Kubernetes
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: Zhou Jiang
>Assignee: Zhou Jiang
>Priority: Major
>  Labels: SPIP
>
> We would like to develop a Java-based Kubernetes operator for Apache Spark. 
> Following the operator pattern 
> (https://kubernetes.io/docs/concepts/extend-kubernetes/operator/), Spark 
> users may manage applications and related components seamlessly using native 
> tools like kubectl. The primary goal is to simplify the Spark user experience 
> on Kubernetes, minimizing the learning curve and operational complexities and 
> therefore enable users to focus on the Spark application development.
> Ideally, it would reside in a separate repository (like Spark docker or Spark 
> connect golang) and be loosely connected to the Spark release cycle while 
> supporting multiple Spark versions.
> SPIP doc: 
> [https://docs.google.com/document/d/1f5mm9VpSKeWC72Y9IiKN2jbBn32rHxjWKUfLRaGEcLE|https://docs.google.com/document/d/1f5mm9VpSKeWC72Y9IiKN2jbBn32rHxjWKUfLRaGEcLE/edit#heading=h.hhham7siu2vi]
> Dev email discussion : 
> [https://lists.apache.org/thread/wdy7jfhf7m8jy74p6s0npjfd15ym5rxz]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-48120) Enable autolink to SPARK jira issue

2024-05-03 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-48120:
-

 Summary: Enable autolink to SPARK jira issue
 Key: SPARK-48120
 URL: https://issues.apache.org/jira/browse/SPARK-48120
 Project: Spark
  Issue Type: Sub-task
  Components: Project Infra
Affects Versions: kubernetes-operator-0.1.0
Reporter: Dongjoon Hyun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48119) Promote ` KubernetesDriverSpec` to `DeveloperApi`

2024-05-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48119.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46371
[https://github.com/apache/spark/pull/46371]

> Promote ` KubernetesDriverSpec` to `DeveloperApi`
> -
>
> Key: SPARK-48119
> URL: https://issues.apache.org/jira/browse/SPARK-48119
> Project: Spark
>  Issue Type: Sub-task
>  Components: k8s
>Affects Versions: kubernetes-operator-0.1.0
>Reporter: Zhou JIANG
>Assignee: Zhou JIANG
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-48118) Support SPARK_SQL_LEGACY_CREATE_HIVE_TABLE env variable

2024-05-03 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-48118.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Issue resolved by pull request 46369
[https://github.com/apache/spark/pull/46369]

> Support SPARK_SQL_LEGACY_CREATE_HIVE_TABLE env variable
> ---
>
> Key: SPARK-48118
> URL: https://issues.apache.org/jira/browse/SPARK-48118
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 4.0.0
>    Reporter: Dongjoon Hyun
>    Assignee: Dongjoon Hyun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>
> This issue aims to support the `SPARK_SQL_LEGACY_CREATE_HIVE_TABLE` env variable 
> to provide an easier migration path.
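
A rough sketch of what an environment-variable switch like this can look like; the assumption here is that the variable simply provides an environment-level default for the existing `spark.sql.legacy.createHiveTableByDefault` conf (illustration only, not the actual implementation):

{code:scala}
object LegacyCreateHiveTableEnvSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a truthy env value flips the legacy default on.
    val legacyDefault = sys.env
      .get("SPARK_SQL_LEGACY_CREATE_HIVE_TABLE")
      .exists(_.equalsIgnoreCase("true"))
    println(s"spark.sql.legacy.createHiveTableByDefault would default to $legacyDefault")
  }
}
{code}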



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


