[jira] [Updated] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-40943:
--
Issue Type: Improvement  (was: Task)

> Make MSCK optional in MSCK REPAIR TABLE commands
> 
>
> Key: SPARK-40943
> URL: https://issues.apache.org/jira/browse/SPARK-40943
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Ben Zhang
>Assignee: Ben Zhang
>Priority: Major
> Fix For: 3.4.0
>
>
> The current syntax for `MSCK REPAIR TABLE` is complex and difficult to 
> understand. The proposal is to make the `MSCK` keyword optional so that 
> `REPAIR TABLE` may be used in its stead.
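> As a hedged illustration only (Spark's real parser is an ANTLR grammar, not a
> regex; names below are invented for the sketch), the proposed relaxation
> amounts to making the leading keyword optional so both spellings reach the
> same command:

```python
import re

# Illustrative sketch only -- not Spark's actual grammar. With MSCK optional,
# both "MSCK REPAIR TABLE t" and "REPAIR TABLE t" parse to the same command.
REPAIR_TABLE = re.compile(r"^(?:MSCK\s+)?REPAIR\s+TABLE\s+(\w+)\s*$", re.IGNORECASE)

def parse_repair(sql: str):
    """Return the table name if `sql` is a (MSCK) REPAIR TABLE statement, else None."""
    m = REPAIR_TABLE.match(sql.strip())
    return m.group(1) if m else None

print(parse_repair("MSCK REPAIR TABLE logs"))  # logs
print(parse_repair("REPAIR TABLE logs"))       # logs -- same command, no MSCK
```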



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-40943:
--
Affects Version/s: 3.4.0
   (was: 3.3.1)

> Make MSCK optional in MSCK REPAIR TABLE commands
> 
>
> Key: SPARK-40943
> URL: https://issues.apache.org/jira/browse/SPARK-40943
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Ben Zhang
>Assignee: Ben Zhang
>Priority: Major
> Fix For: 3.4.0
>
>
> The current syntax for `MSCK REPAIR TABLE` is complex and difficult to 
> understand. The proposal is to make the `MSCK` keyword optional so that 
> `REPAIR TABLE` may be used in its stead.






[jira] [Resolved] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-40943.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 38433
[https://github.com/apache/spark/pull/38433]

> Make MSCK optional in MSCK REPAIR TABLE commands
> 
>
> Key: SPARK-40943
> URL: https://issues.apache.org/jira/browse/SPARK-40943
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.3.1
>Reporter: Ben Zhang
>Assignee: Ben Zhang
>Priority: Major
> Fix For: 3.4.0
>
>
> The current syntax for `MSCK REPAIR TABLE` is complex and difficult to 
> understand. The proposal is to make the `MSCK` keyword optional so that 
> `REPAIR TABLE` may be used in its stead.






[jira] [Assigned] (SPARK-40943) Make MSCK optional in MSCK REPAIR TABLE commands

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-40943:
-

Assignee: Ben Zhang

> Make MSCK optional in MSCK REPAIR TABLE commands
> 
>
> Key: SPARK-40943
> URL: https://issues.apache.org/jira/browse/SPARK-40943
> Project: Spark
>  Issue Type: Task
>  Components: SQL
>Affects Versions: 3.3.1
>Reporter: Ben Zhang
>Assignee: Ben Zhang
>Priority: Major
>
> The current syntax for `MSCK REPAIR TABLE` is complex and difficult to 
> understand. The proposal is to make the `MSCK` keyword optional so that 
> `REPAIR TABLE` may be used in its stead.






[jira] [Commented] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689571#comment-17689571
 ] 

Apache Spark commented on SPARK-42463:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40052

> Clean up the third-party Java source code introduced by SPARK-27180
> ---
>
> Key: SPARK-42463
> URL: https://issues.apache.org/jira/browse/SPARK-42463
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests, YARN
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Minor
>
> * 
> resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
>  * 
> resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
>  * 
> resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java






[jira] [Assigned] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42463:


Assignee: Apache Spark

> Clean up the third-party Java source code introduced by SPARK-27180
> ---
>
> Key: SPARK-42463
> URL: https://issues.apache.org/jira/browse/SPARK-42463
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests, YARN
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>
> * 
> resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
>  * 
> resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
>  * 
> resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java






[jira] [Assigned] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42463:


Assignee: (was: Apache Spark)

> Clean up the third-party Java source code introduced by SPARK-27180
> ---
>
> Key: SPARK-42463
> URL: https://issues.apache.org/jira/browse/SPARK-42463
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests, YARN
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Minor
>
> * 
> resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
>  * 
> resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
>  * 
> resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java






[jira] [Commented] (SPARK-27180) Fix testing issues with yarn module in Hadoop-3

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-27180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689569#comment-17689569
 ] 

Apache Spark commented on SPARK-27180:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40052

> Fix testing issues with yarn module in Hadoop-3
> ---
>
> Key: SPARK-27180
> URL: https://issues.apache.org/jira/browse/SPARK-27180
> Project: Spark
>  Issue Type: Sub-task
>  Components: Build, Spark Core, YARN
>Affects Versions: 3.0.0
>Reporter: Yuming Wang
>Assignee: Yuming Wang
>Priority: Major
> Fix For: 3.0.0
>
>







[jira] [Commented] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689568#comment-17689568
 ] 

Apache Spark commented on SPARK-42463:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40052

> Clean up the third-party Java source code introduced by SPARK-27180
> ---
>
> Key: SPARK-42463
> URL: https://issues.apache.org/jira/browse/SPARK-42463
> Project: Spark
>  Issue Type: Improvement
>  Components: Tests, YARN
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Minor
>
> * 
> resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
>  * 
> resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
>  * 
> resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java






[jira] [Assigned] (SPARK-42460) E2E test should clean-up results

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-42460:
-

Assignee: Herman van Hövell

> E2E test should clean-up results
> 
>
> Key: SPARK-42460
> URL: https://issues.apache.org/jira/browse/SPARK-42460
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
>







[jira] [Resolved] (SPARK-42460) E2E test should clean-up results

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-42460.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 40048
[https://github.com/apache/spark/pull/40048]

> E2E test should clean-up results
> 
>
> Key: SPARK-42460
> URL: https://issues.apache.org/jira/browse/SPARK-42460
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Created] (SPARK-42463) Clean up the third-party Java source code introduced by SPARK-27180

2023-02-15 Thread Yang Jie (Jira)
Yang Jie created SPARK-42463:


 Summary: Clean up the third-party Java source code introduced by 
SPARK-27180
 Key: SPARK-42463
 URL: https://issues.apache.org/jira/browse/SPARK-42463
 Project: Spark
  Issue Type: Improvement
  Components: Tests, YARN
Affects Versions: 3.5.0
Reporter: Yang Jie


* 
resource-managers/yarn/src/test/java/org/apache/hadoop/net/ServerSocketUtil.java
 * 
resource-managers/yarn/src/test/java/org/eclipse/jetty/server/SessionManager.java
 * 
resource-managers/yarn/src/test/java/org/eclipse/jetty/server/session/SessionHandler.java






[jira] [Updated] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-42462:
--
Fix Version/s: 3.3.3
   (was: 3.3.2)

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.4, 3.4.0, 3.3.3
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Blocker
> Fix For: 3.2.4, 3.4.0, 3.3.3
>
>
> https://github.com/docker/buildx/issues/1509






[jira] [Assigned] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-42462:
-

Assignee: Dongjoon Hyun

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.4, 3.4.0, 3.3.3
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509






[jira] [Resolved] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-42462.
---
Fix Version/s: 3.2.4
   3.3.2
   3.4.0
   Resolution: Fixed

Issue resolved by pull request 40051
[https://github.com/apache/spark/pull/40051]

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.4, 3.4.0, 3.3.3
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Blocker
> Fix For: 3.2.4, 3.3.2, 3.4.0
>
>
> https://github.com/docker/buildx/issues/1509






[jira] [Updated] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-42462:
--
Affects Version/s: 3.2.3

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.3, 3.3.2, 3.4.0
>Reporter: Dongjoon Hyun
>Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509






[jira] [Updated] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-42462:
--
Affects Version/s: 3.2.4
   3.3.3
   (was: 3.2.3)
   (was: 3.3.2)

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.2.4, 3.4.0, 3.3.3
>Reporter: Dongjoon Hyun
>Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509






[jira] [Commented] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689511#comment-17689511
 ] 

Apache Spark commented on SPARK-42462:
--

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/40051

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.2, 3.4.0
>Reporter: Dongjoon Hyun
>Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509






[jira] [Assigned] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42462:


Assignee: Apache Spark

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.2, 3.4.0
>Reporter: Dongjoon Hyun
>Assignee: Apache Spark
>Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509






[jira] [Commented] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689509#comment-17689509
 ] 

Apache Spark commented on SPARK-42462:
--

User 'dongjoon-hyun' has created a pull request for this issue:
https://github.com/apache/spark/pull/40051

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.2, 3.4.0
>Reporter: Dongjoon Hyun
>Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509






[jira] [Assigned] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42462:


Assignee: (was: Apache Spark)

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.2, 3.4.0
>Reporter: Dongjoon Hyun
>Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509






[jira] [Updated] (SPARK-42462) Prevent `docker-image-tool.sh` from publishing OCI manifests

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-42462:
--
Summary: Prevent `docker-image-tool.sh` from publishing OCI manifests  
(was: Prevent `docker buildx` from publishing OCI manifests)

> Prevent `docker-image-tool.sh` from publishing OCI manifests
> 
>
> Key: SPARK-42462
> URL: https://issues.apache.org/jira/browse/SPARK-42462
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes
>Affects Versions: 3.3.2, 3.4.0
>Reporter: Dongjoon Hyun
>Priority: Blocker
>
> https://github.com/docker/buildx/issues/1509






[jira] [Created] (SPARK-42462) Prevent `docker buildx` from publishing OCI manifests

2023-02-15 Thread Dongjoon Hyun (Jira)
Dongjoon Hyun created SPARK-42462:
-

 Summary: Prevent `docker buildx` from publishing OCI manifests
 Key: SPARK-42462
 URL: https://issues.apache.org/jira/browse/SPARK-42462
 Project: Spark
  Issue Type: Bug
  Components: Kubernetes
Affects Versions: 3.3.2, 3.4.0
Reporter: Dongjoon Hyun


https://github.com/docker/buildx/issues/1509






[jira] [Commented] (SPARK-42452) Remove hadoop-2 profile from Apache Spark

2023-02-15 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689484#comment-17689484
 ] 

Yang Jie commented on SPARK-42452:
--

Thanks for your explanation, [~dongjoon]. Let's wait until the right time :D

> Remove hadoop-2 profile from Apache Spark
> -
>
> Key: SPARK-42452
> URL: https://issues.apache.org/jira/browse/SPARK-42452
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Major
>
> SPARK-40651 (Drop Hadoop 2 binary distribution from release process) and 
> SPARK-42447 (Remove Hadoop 2 GitHub Actions job)






[jira] [Commented] (SPARK-42461) Scala Client - Initial Set of Functions

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689476#comment-17689476
 ] 

Apache Spark commented on SPARK-42461:
--

User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/40050

> Scala Client - Initial Set of Functions
> ---
>
> Key: SPARK-42461
> URL: https://issues.apache.org/jira/browse/SPARK-42461
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Priority: Major
>







[jira] [Assigned] (SPARK-42461) Scala Client - Initial Set of Functions

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42461:


Assignee: Apache Spark

> Scala Client - Initial Set of Functions
> ---
>
> Key: SPARK-42461
> URL: https://issues.apache.org/jira/browse/SPARK-42461
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Assigned] (SPARK-42461) Scala Client - Initial Set of Functions

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42461:


Assignee: (was: Apache Spark)

> Scala Client - Initial Set of Functions
> ---
>
> Key: SPARK-42461
> URL: https://issues.apache.org/jira/browse/SPARK-42461
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Priority: Major
>







[jira] [Commented] (SPARK-42398) refine default column value framework

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689474#comment-17689474
 ] 

Apache Spark commented on SPARK-42398:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/40049

> refine default column value framework
> -
>
> Key: SPARK-42398
> URL: https://issues.apache.org/jira/browse/SPARK-42398
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Wenchen Fan
>Priority: Major
>







[jira] [Commented] (SPARK-42398) refine default column value framework

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689473#comment-17689473
 ] 

Apache Spark commented on SPARK-42398:
--

User 'cloud-fan' has created a pull request for this issue:
https://github.com/apache/spark/pull/40049

> refine default column value framework
> -
>
> Key: SPARK-42398
> URL: https://issues.apache.org/jira/browse/SPARK-42398
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Wenchen Fan
>Priority: Major
>







[jira] [Created] (SPARK-42461) Scala Client - Initial Set of Functions

2023-02-15 Thread Jira
Herman van Hövell created SPARK-42461:
-

 Summary: Scala Client - Initial Set of Functions
 Key: SPARK-42461
 URL: https://issues.apache.org/jira/browse/SPARK-42461
 Project: Spark
  Issue Type: Task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Herman van Hövell









[jira] [Assigned] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes

2023-02-15 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-42459:


Assignee: Hyukjin Kwon

> Create pyspark.sql.connect.utils to keep common codes
> -
>
> Key: SPARK-42459
> URL: https://issues.apache.org/jira/browse/SPARK-42459
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
>
> SPARK-41457 added `require_minimum_grpc_version` in pandas.utils, which is 
> actually unrelated to the Connect module. We should move it all to a separate 
> utils directory.
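> A minimal sketch of what a `require_minimum_grpc_version`-style guard does
> (function name, signature, and message below are illustrative, not PySpark's
> exact implementation):

```python
def require_minimum_version(installed: str, minimum: str, package: str = "grpcio") -> None:
    """Raise ImportError if `installed` is older than `minimum` (illustrative only)."""
    def as_tuple(version: str):
        # Compare numerically, so e.g. "1.10.0" > "1.9.0".
        return tuple(int(part) for part in version.split("."))
    if as_tuple(installed) < as_tuple(minimum):
        raise ImportError(f"{package} >= {minimum} must be installed; found {installed}")

require_minimum_version("1.56.0", "1.48.1")  # new enough: returns None silently
```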






[jira] [Resolved] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes

2023-02-15 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-42459.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 40047
[https://github.com/apache/spark/pull/40047]

> Create pyspark.sql.connect.utils to keep common codes
> -
>
> Key: SPARK-42459
> URL: https://issues.apache.org/jira/browse/SPARK-42459
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 3.4.0
>
>
> SPARK-41457 added `require_minimum_grpc_version` in pandas.utils, which is 
> actually unrelated to the Connect module. We should move it all to a separate 
> utils directory.






[jira] [Assigned] (SPARK-42460) E2E test should clean-up results

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42460:


Assignee: (was: Apache Spark)

> E2E test should clean-up results
> 
>
> Key: SPARK-42460
> URL: https://issues.apache.org/jira/browse/SPARK-42460
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Priority: Major
>







[jira] [Assigned] (SPARK-42460) E2E test should clean-up results

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42460:


Assignee: Apache Spark

> E2E test should clean-up results
> 
>
> Key: SPARK-42460
> URL: https://issues.apache.org/jira/browse/SPARK-42460
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Commented] (SPARK-42460) E2E test should clean-up results

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689459#comment-17689459
 ] 

Apache Spark commented on SPARK-42460:
--

User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/40048

> E2E test should clean-up results
> 
>
> Key: SPARK-42460
> URL: https://issues.apache.org/jira/browse/SPARK-42460
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Priority: Major
>







[jira] [Created] (SPARK-42460) E2E test should clean-up results

2023-02-15 Thread Jira
Herman van Hövell created SPARK-42460:
-

 Summary: E2E test should clean-up results
 Key: SPARK-42460
 URL: https://issues.apache.org/jira/browse/SPARK-42460
 Project: Spark
  Issue Type: Task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Herman van Hövell









[jira] [Assigned] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`

2023-02-15 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang reassigned SPARK-42451:
---

Assignee: Yang Jie

> Remove 3.1 and Java 17 check from  filter condition of `testingVersions` in 
> `HiveExternalCatalogVersionsSuite`
> --
>
> Key: SPARK-42451
> URL: https://issues.apache.org/jira/browse/SPARK-42451
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
>
> Spark 3.1 is already EOL and has been deleted from 
> [https://dist.apache.org/repos/dist/release/spark], so we can simplify the 
> filter conditions of `testingVersions`; all remaining versions already 
> support Java 17.
>  






[jira] [Resolved] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`

2023-02-15 Thread Yuming Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuming Wang resolved SPARK-42451.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

Issue resolved by pull request 40039
[https://github.com/apache/spark/pull/40039]

> Remove 3.1 and Java 17 check from  filter condition of `testingVersions` in 
> `HiveExternalCatalogVersionsSuite`
> --
>
> Key: SPARK-42451
> URL: https://issues.apache.org/jira/browse/SPARK-42451
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Assignee: Yang Jie
>Priority: Minor
> Fix For: 3.5.0
>
>
> Spark 3.1 is already EOL and has been deleted from 
> [https://dist.apache.org/repos/dist/release/spark], so we can simplify the 
> filter conditions of `testingVersions`; all remaining versions already 
> support Java 17.
>  






[jira] [Assigned] (SPARK-41817) SparkSession.read support reading with schema

2023-02-15 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-41817:


Assignee: Sandeep Singh

> SparkSession.read support reading with schema
> -
>
> Key: SPARK-41817
> URL: https://issues.apache.org/jira/browse/SPARK-41817
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Sandeep Singh
>Assignee: Sandeep Singh
>Priority: Major
>
> {code:java}
> File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", 
> line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load
> Failed example:
> with tempfile.TemporaryDirectory() as d:
> # Write a DataFrame into a CSV file with a header
> df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}])
> df.write.option("header", 
> True).mode("overwrite").format("csv").save(d)
> # Read the CSV file as a DataFrame with 'nullValue' option set to 
> 'Hyukjin Kwon',
> # and 'header' option set to `True`.
> df = spark.read.load(
> d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", 
> header=True)
> df.printSchema()
> df.show()
> Exception raised:
> Traceback (most recent call last):
>   File 
> "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py",
>  line 1350, in __run
> exec(compile(example.source, filename, "single",
>   File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module>
> df.printSchema()
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1039, in printSchema
> print(self._tree_string())
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1035, in _tree_string
> query = self._plan.to_proto(self._session.client)
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 92, in to_proto
> plan.root.CopyFrom(self.plan(session))
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 245, in plan
> plan.read.data_source.schema = self.schema
> TypeError: bad argument type for built-in operation {code}
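The failing assignment above hands a StructType object to a protobuf string field, which rejects non-string values. A minimal, Spark-free sketch of that failure mode and of the obvious fix direction (serialize the schema before assigning); the class and method names here are stand-ins, not Spark's actual API:

```python
class DataSourceProto:
    """Stand-in for a protobuf message whose `schema` field must be a str."""

    def __init__(self):
        self._schema = ""

    @property
    def schema(self):
        return self._schema

    @schema.setter
    def schema(self, value):
        if not isinstance(value, str):
            # protobuf raises TypeError when a scalar field gets the wrong type
            raise TypeError("bad argument type for built-in operation")
        self._schema = value


class StructType:
    """Stand-in for pyspark.sql.types.StructType."""

    def __init__(self, ddl):
        self._ddl = ddl

    def simpleString(self):
        return self._ddl


proto = DataSourceProto()
schema = StructType("struct<age:bigint,name:string>")

try:
    proto.schema = schema  # assigning the object itself fails, as in the traceback
except TypeError as e:
    print("failed:", e)

proto.schema = schema.simpleString()  # serializing to a string form succeeds
print("ok:", proto.schema)
```

The real fix would live in the Connect plan construction, but the shape of the problem is the same: the schema object must become a string before it reaches the proto.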






[jira] [Resolved] (SPARK-41817) SparkSession.read support reading with schema

2023-02-15 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-41817.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 40046
[https://github.com/apache/spark/pull/40046]

> SparkSession.read support reading with schema
> -
>
> Key: SPARK-41817
> URL: https://issues.apache.org/jira/browse/SPARK-41817
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Sandeep Singh
>Assignee: Sandeep Singh
>Priority: Major
> Fix For: 3.4.0
>
>
> {code:java}
> File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", 
> line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load
> Failed example:
> with tempfile.TemporaryDirectory() as d:
> # Write a DataFrame into a CSV file with a header
> df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}])
> df.write.option("header", 
> True).mode("overwrite").format("csv").save(d)
> # Read the CSV file as a DataFrame with 'nullValue' option set to 
> 'Hyukjin Kwon',
> # and 'header' option set to `True`.
> df = spark.read.load(
> d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", 
> header=True)
> df.printSchema()
> df.show()
> Exception raised:
> Traceback (most recent call last):
>   File 
> "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py",
>  line 1350, in __run
> exec(compile(example.source, filename, "single",
>   File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module>
> df.printSchema()
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1039, in printSchema
> print(self._tree_string())
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1035, in _tree_string
> query = self._plan.to_proto(self._session.client)
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 92, in to_proto
> plan.root.CopyFrom(self.plan(session))
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 245, in plan
> plan.read.data_source.schema = self.schema
> TypeError: bad argument type for built-in operation {code}






[jira] [Resolved] (SPARK-42453) Implement function max in Scala client

2023-02-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-42453.
-
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 40041
[https://github.com/apache/spark/pull/40041]

> Implement function max in Scala client
> --
>
> Key: SPARK-42453
> URL: https://issues.apache.org/jira/browse/SPARK-42453
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Assigned] (SPARK-42453) Implement function max in Scala client

2023-02-15 Thread Wenchen Fan (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan reassigned SPARK-42453:
---

Assignee: Rui Wang

> Implement function max in Scala client
> --
>
> Key: SPARK-42453
> URL: https://issues.apache.org/jira/browse/SPARK-42453
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>







[jira] [Resolved] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read

2023-02-15 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-42456.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 40044
[https://github.com/apache/spark/pull/40044]

> Consolidating the PySpark version upgrade note pages into a single page to 
> make it easier to read
> -
>
> Key: SPARK-42456
> URL: https://issues.apache.org/jira/browse/SPARK-42456
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark
>Affects Versions: 3.4.0
>Reporter: Allan Folting
>Assignee: Allan Folting
>Priority: Major
> Fix For: 3.4.0
>
>
> Creating a new PySpark migration guide sub page and consolidating the 
> existing 9 separate pages into this one new page. This makes it easier to 
> take a look across multiple version upgrades by simply scrolling on the page.
> Also, this is similar to the Spark Core Migration Guide page here:
> [https://spark.apache.org/docs/latest/core-migration-guide.html]
>  
> Updating the existing main Migration Guide page to point to this new sub page 
> and also making some minor language updates to help readers.






[jira] [Assigned] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read

2023-02-15 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-42456:


Assignee: Allan Folting

> Consolidating the PySpark version upgrade note pages into a single page to 
> make it easier to read
> -
>
> Key: SPARK-42456
> URL: https://issues.apache.org/jira/browse/SPARK-42456
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark
>Affects Versions: 3.4.0
>Reporter: Allan Folting
>Assignee: Allan Folting
>Priority: Major
>
> Creating a new PySpark migration guide sub page and consolidating the 
> existing 9 separate pages into this one new page. This makes it easier to 
> take a look across multiple version upgrades by simply scrolling on the page.
> Also, this is similar to the Spark Core Migration Guide page here:
> [https://spark.apache.org/docs/latest/core-migration-guide.html]
>  
> Updating the existing main Migration Guide page to point to this new sub page 
> and also making some minor language updates to help readers.






[jira] [Commented] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689421#comment-17689421
 ] 

Apache Spark commented on SPARK-42459:
--

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/40047

> Create pyspark.sql.connect.utils to keep common codes
> -
>
> Key: SPARK-42459
> URL: https://issues.apache.org/jira/browse/SPARK-42459
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> SPARK-41457 added `require_minimum_grpc_version` to pandas.utils, which is 
> actually unrelated to the connect module. We should move these helpers to a 
> separate utils directory.






[jira] [Assigned] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42459:


Assignee: Apache Spark

> Create pyspark.sql.connect.utils to keep common codes
> -
>
> Key: SPARK-42459
> URL: https://issues.apache.org/jira/browse/SPARK-42459
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Assignee: Apache Spark
>Priority: Major
>
> SPARK-41457 added `require_minimum_grpc_version` to pandas.utils, which is 
> actually unrelated to the connect module. We should move these helpers to a 
> separate utils directory.






[jira] [Assigned] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42459:


Assignee: (was: Apache Spark)

> Create pyspark.sql.connect.utils to keep common codes
> -
>
> Key: SPARK-42459
> URL: https://issues.apache.org/jira/browse/SPARK-42459
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Priority: Major
>
> SPARK-41457 added `require_minimum_grpc_version` to pandas.utils, which is 
> actually unrelated to the connect module. We should move these helpers to a 
> separate utils directory.






[jira] [Assigned] (SPARK-41817) SparkSession.read support reading with schema

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41817:


Assignee: (was: Apache Spark)

> SparkSession.read support reading with schema
> -
>
> Key: SPARK-41817
> URL: https://issues.apache.org/jira/browse/SPARK-41817
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Sandeep Singh
>Priority: Major
>
> {code:java}
> File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", 
> line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load
> Failed example:
> with tempfile.TemporaryDirectory() as d:
> # Write a DataFrame into a CSV file with a header
> df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}])
> df.write.option("header", 
> True).mode("overwrite").format("csv").save(d)
> # Read the CSV file as a DataFrame with 'nullValue' option set to 
> 'Hyukjin Kwon',
> # and 'header' option set to `True`.
> df = spark.read.load(
> d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", 
> header=True)
> df.printSchema()
> df.show()
> Exception raised:
> Traceback (most recent call last):
>   File 
> "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py",
>  line 1350, in __run
> exec(compile(example.source, filename, "single",
>   File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module>
> df.printSchema()
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1039, in printSchema
> print(self._tree_string())
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1035, in _tree_string
> query = self._plan.to_proto(self._session.client)
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 92, in to_proto
> plan.root.CopyFrom(self.plan(session))
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 245, in plan
> plan.read.data_source.schema = self.schema
> TypeError: bad argument type for built-in operation {code}






[jira] [Assigned] (SPARK-41817) SparkSession.read support reading with schema

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-41817:


Assignee: Apache Spark

> SparkSession.read support reading with schema
> -
>
> Key: SPARK-41817
> URL: https://issues.apache.org/jira/browse/SPARK-41817
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Sandeep Singh
>Assignee: Apache Spark
>Priority: Major
>
> {code:java}
> File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", 
> line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load
> Failed example:
> with tempfile.TemporaryDirectory() as d:
> # Write a DataFrame into a CSV file with a header
> df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}])
> df.write.option("header", 
> True).mode("overwrite").format("csv").save(d)
> # Read the CSV file as a DataFrame with 'nullValue' option set to 
> 'Hyukjin Kwon',
> # and 'header' option set to `True`.
> df = spark.read.load(
> d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", 
> header=True)
> df.printSchema()
> df.show()
> Exception raised:
> Traceback (most recent call last):
>   File 
> "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py",
>  line 1350, in __run
> exec(compile(example.source, filename, "single",
>   File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module>
> df.printSchema()
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1039, in printSchema
> print(self._tree_string())
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1035, in _tree_string
> query = self._plan.to_proto(self._session.client)
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 92, in to_proto
> plan.root.CopyFrom(self.plan(session))
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 245, in plan
> plan.read.data_source.schema = self.schema
> TypeError: bad argument type for built-in operation {code}






[jira] [Commented] (SPARK-41817) SparkSession.read support reading with schema

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689420#comment-17689420
 ] 

Apache Spark commented on SPARK-41817:
--

User 'ueshin' has created a pull request for this issue:
https://github.com/apache/spark/pull/40046

> SparkSession.read support reading with schema
> -
>
> Key: SPARK-41817
> URL: https://issues.apache.org/jira/browse/SPARK-41817
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Sandeep Singh
>Priority: Major
>
> {code:java}
> File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/readwriter.py", 
> line 122, in pyspark.sql.connect.readwriter.DataFrameReader.load
> Failed example:
> with tempfile.TemporaryDirectory() as d:
> # Write a DataFrame into a CSV file with a header
> df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}])
> df.write.option("header", 
> True).mode("overwrite").format("csv").save(d)
> # Read the CSV file as a DataFrame with 'nullValue' option set to 
> 'Hyukjin Kwon',
> # and 'header' option set to `True`.
> df = spark.read.load(
> d, schema=df.schema, format="csv", nullValue="Hyukjin Kwon", 
> header=True)
> df.printSchema()
> df.show()
> Exception raised:
> Traceback (most recent call last):
>   File 
> "/usr/local/Cellar/python@3.10/3.10.8/Frameworks/Python.framework/Versions/3.10/lib/python3.10/doctest.py",
>  line 1350, in __run
> exec(compile(example.source, filename, "single",
>   File "<doctest pyspark.sql.connect.readwriter.DataFrameReader.load[1]>", line 10, in <module>
> df.printSchema()
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1039, in printSchema
> print(self._tree_string())
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/dataframe.py", 
> line 1035, in _tree_string
> query = self._plan.to_proto(self._session.client)
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 92, in to_proto
> plan.root.CopyFrom(self.plan(session))
>   File 
> "/Users/s.singh/personal/spark-oss/python/pyspark/sql/connect/plan.py", line 
> 245, in plan
> plan.read.data_source.schema = self.schema
> TypeError: bad argument type for built-in operation {code}






[jira] [Created] (SPARK-42459) Create pyspark.sql.connect.utils to keep common codes

2023-02-15 Thread Hyukjin Kwon (Jira)
Hyukjin Kwon created SPARK-42459:


 Summary: Create pyspark.sql.connect.utils to keep common codes
 Key: SPARK-42459
 URL: https://issues.apache.org/jira/browse/SPARK-42459
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Hyukjin Kwon


SPARK-41457 added `require_minimum_grpc_version` to pandas.utils, which is 
actually unrelated to the connect module. We should move these helpers to a 
separate utils directory.
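For context, the helper in question follows a common dependency-check pattern: compare an installed version against a required floor and raise with an actionable message. A simplified sketch of that pattern (the real pyspark helper also handles a missing package and uses a project-specific minimum version; the threshold below is illustrative):

```python
def require_minimum_version(installed: str, minimum: str, name: str) -> None:
    """Raise ImportError when `installed` is older than `minimum`."""

    def as_tuple(version: str):
        # "1.48.1" -> (1, 48, 1) so versions compare numerically, not lexically
        return tuple(int(part) for part in version.split("."))

    if as_tuple(installed) < as_tuple(minimum):
        raise ImportError(
            f"{name} >= {minimum} must be installed; found {installed}"
        )


require_minimum_version("1.48.1", "1.48.1", "grpcio")  # passes silently
```

Keeping helpers like this in a dedicated connect utils module, rather than in pandas.utils, keeps the dependency boundaries of each module honest.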






[jira] [Created] (SPARK-42458) createDataFrame should support DDL string as schema

2023-02-15 Thread Takuya Ueshin (Jira)
Takuya Ueshin created SPARK-42458:
-

 Summary: createDataFrame should support DDL string as schema
 Key: SPARK-42458
 URL: https://issues.apache.org/jira/browse/SPARK-42458
 Project: Spark
  Issue Type: Sub-task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Takuya Ueshin


{code:python}
File "/.../python/pyspark/sql/connect/readwriter.py", line 393, in 
pyspark.sql.connect.readwriter.DataFrameWriter.option
Failed example:
with tempfile.TemporaryDirectory() as d:
# Write a DataFrame into a CSV file with 'nullValue' option set to 
'Hyukjin Kwon'.
df = spark.createDataFrame([(100, None)], "age INT, name STRING")
df.write.option("nullValue", "Hyukjin 
Kwon").mode("overwrite").format("csv").save(d)

# Read the CSV file as a DataFrame.
spark.read.schema(df.schema).format('csv').load(d).show()
Exception raised:
Traceback (most recent call last):
  File "/.../lib/python3.9/doctest.py", line 1334, in __run
exec(compile(example.source, filename, "single",
  File "", line 3, in 
df = spark.createDataFrame([(100, None)], "age INT, name STRING")
  File "/.../python/pyspark/sql/connect/session.py", line 312, in 
createDataFrame
raise ValueError(
ValueError: Some of types cannot be determined after inferring, a 
StructType Schema is required in this case
{code}
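A DDL schema string such as `"age INT, name STRING"` is just a comma-separated list of `name type` pairs, so supporting it means parsing the string into a schema instead of rejecting it. A toy illustration of the parsing step (pyspark's real parser is a full SQL parser that also handles nested and complex types; this sketch handles only the flat case):

```python
def parse_ddl_schema(ddl: str):
    """Split a flat DDL string into (column_name, type_name) pairs."""
    fields = []
    for part in ddl.split(","):
        name, _, dtype = part.strip().partition(" ")
        fields.append((name, dtype.strip().lower()))
    return fields


print(parse_ddl_schema("age INT, name STRING"))
```

With such a conversion in place, `createDataFrame([(100, None)], "age INT, name STRING")` could resolve the column whose value is `None` from the declared type rather than failing type inference.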






[jira] [Resolved] (SPARK-42426) insertInto fails when the column names are different from the table columns

2023-02-15 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-42426.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 40024
[https://github.com/apache/spark/pull/40024]

> insertInto fails when the column names are different from the table columns
> ---
>
> Key: SPARK-42426
> URL: https://issues.apache.org/jira/browse/SPARK-42426
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Takuya Ueshin
>Assignee: Takuya Ueshin
>Priority: Major
> Fix For: 3.4.0
>
>
> {noformat}
> File "/.../python/pyspark/sql/connect/readwriter.py", line 518, in 
> pyspark.sql.connect.readwriter.DataFrameWriter.insertInto
> Failed example:
> df.selectExpr("age AS col1", "name AS col2").write.insertInto("tblA")
> Exception raised:
> Traceback (most recent call last):
>   File "/.../lib/python3.9/doctest.py", line 1334, in __run
> exec(compile(example.source, filename, "single",
>   File "<doctest pyspark.sql.connect.readwriter.DataFrameWriter.insertInto[3]>", line 1, in <module>
> df.selectExpr("age AS col1", "name AS col2").write.insertInto("tblA")
>   File "/.../python/pyspark/sql/connect/readwriter.py", line 477, in 
> insertInto
> self.saveAsTable(tableName)
>   File "/.../python/pyspark/sql/connect/readwriter.py", line 495, in 
> saveAsTable
> 
> self._spark.client.execute_command(self._write.command(self._spark.client))
>   File "/.../python/pyspark/sql/connect/client.py", line 553, in 
> execute_command
> self._execute(req)
>   File "/.../python/pyspark/sql/connect/client.py", line 648, in _execute
> self._handle_error(rpc_error)
>   File "/.../python/pyspark/sql/connect/client.py", line 718, in 
> _handle_error
> raise convert_exception(info, status.message) from None
> pyspark.errors.exceptions.connect.AnalysisException: Cannot resolve 'age' 
> given input columns: [col1, col2].
> {noformat}
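The error arises because the Connect client routed `insertInto` through `saveAsTable`: in Spark SQL, `insertInto` resolves columns by position, while `saveAsTable` resolves them by name. A pure-Python illustration of the two semantics (the helper names are hypothetical, not Spark's API):

```python
def insert_by_position(table_cols, row_cols, row):
    """insertInto semantics: incoming column names are ignored."""
    return dict(zip(table_cols, row))


def insert_by_name(table_cols, row_cols, row):
    """saveAsTable semantics: every table column must exist by name."""
    mapping = dict(zip(row_cols, row))
    missing = [c for c in table_cols if c not in mapping]
    if missing:
        raise ValueError(f"Cannot resolve {missing} given input columns: {row_cols}")
    return {c: mapping[c] for c in table_cols}


table_columns = ["age", "name"]
row = (100, "Hyukjin Kwon")

print(insert_by_position(table_columns, ["col1", "col2"], row))  # succeeds
try:
    insert_by_name(table_columns, ["col1", "col2"], row)  # mirrors the bug
except ValueError as e:
    print("failed:", e)
```

The name-based path fails exactly like the `AnalysisException` above once the frame's columns are renamed, which is why the fix routes `insertInto` to a positional insert instead.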






[jira] [Assigned] (SPARK-42426) insertInto fails when the column names are different from the table columns

2023-02-15 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon reassigned SPARK-42426:


Assignee: Takuya Ueshin

> insertInto fails when the column names are different from the table columns
> ---
>
> Key: SPARK-42426
> URL: https://issues.apache.org/jira/browse/SPARK-42426
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Takuya Ueshin
>Assignee: Takuya Ueshin
>Priority: Major
>
> {noformat}
> File "/.../python/pyspark/sql/connect/readwriter.py", line 518, in 
> pyspark.sql.connect.readwriter.DataFrameWriter.insertInto
> Failed example:
> df.selectExpr("age AS col1", "name AS col2").write.insertInto("tblA")
> Exception raised:
> Traceback (most recent call last):
>   File "/.../lib/python3.9/doctest.py", line 1334, in __run
> exec(compile(example.source, filename, "single",
>   File "<doctest pyspark.sql.connect.readwriter.DataFrameWriter.insertInto[3]>", line 1, in <module>
> df.selectExpr("age AS col1", "name AS col2").write.insertInto("tblA")
>   File "/.../python/pyspark/sql/connect/readwriter.py", line 477, in 
> insertInto
> self.saveAsTable(tableName)
>   File "/.../python/pyspark/sql/connect/readwriter.py", line 495, in 
> saveAsTable
> 
> self._spark.client.execute_command(self._write.command(self._spark.client))
>   File "/.../python/pyspark/sql/connect/client.py", line 553, in 
> execute_command
> self._execute(req)
>   File "/.../python/pyspark/sql/connect/client.py", line 648, in _execute
> self._handle_error(rpc_error)
>   File "/.../python/pyspark/sql/connect/client.py", line 718, in 
> _handle_error
> raise convert_exception(info, status.message) from None
> pyspark.errors.exceptions.connect.AnalysisException: Cannot resolve 'age' 
> given input columns: [col1, col2].
> {noformat}






[jira] [Resolved] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ

2023-02-15 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-42455.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 40042
[https://github.com/apache/spark/pull/40042]

> Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
> --
>
> Key: SPARK-42455
> URL: https://issues.apache.org/jira/browse/SPARK-42455
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Resolved] (SPARK-42384) Mask function's generated code does not handle null input

2023-02-15 Thread Gengliang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang resolved SPARK-42384.

Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 39945
[https://github.com/apache/spark/pull/39945]

> Mask function's generated code does not handle null input
> -
>
> Key: SPARK-42384
> URL: https://issues.apache.org/jira/browse/SPARK-42384
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Bruce Robbins
>Assignee: Bruce Robbins
>Priority: Major
> Fix For: 3.4.0
>
>
> Example:
> {noformat}
> create or replace temp view v1 as
> select * from values
> (null),
> ('AbCD123-@$#')
> as data(col1);
> cache table v1;
> select mask(col1) from v1;
> {noformat}
> This query results in a {{NullPointerException}}:
> {noformat}
> 23/02/07 16:36:06 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 3)
> java.lang.NullPointerException
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
> {noformat}
> The generated code calls {{UnsafeWriter.write(0, value_0)}} regardless of 
> whether {{Mask.transformInput}} returns null or not. The 
> {{UnsafeWriter.write}} method for {{UTF8String}} does not expect a null 
> pointer.
> {noformat}
> /* 031 */ boolean isNull_1 = i.isNullAt(0);
> /* 032 */ UTF8String value_1 = isNull_1 ?
> /* 033 */ null : (i.getUTF8String(0));
> /* 034 */
> /* 035 */
> /* 036 */
> /* 037 */
> /* 038 */ UTF8String value_0 = null;
> /* 039 */ value_0 = 
> org.apache.spark.sql.catalyst.expressions.Mask.transformInput(value_1, 
> ((UTF8String) references[0] /* literal */), ((UTF8String) references[1] /* 
> literal */), ((UTF8String) references[2] /* literal */), ((UTF8String) 
> references[3] /* literal */));;
> /* 040 */ if (false) {
> /* 041 */   mutableStateArray_0[0].setNullAt(0);
> /* 042 */ } else {
> /* 043 */   mutableStateArray_0[0].write(0, value_0);
> /* 044 */ }
> /* 045 */ return (mutableStateArray_0[0].getRow());
> /* 046 */   }
> {noformat}
> The bug is not exercised by a literal null input value, since there appears 
> to be some optimization that simply replaces the entire function call with a 
> null literal:
> {noformat}
> spark-sql> explain SELECT mask(NULL);
> == Physical Plan ==
> *(1) Project [null AS mask(NULL, X, x, n, NULL)#47]
> +- *(1) Scan OneRowRelation[]
> Time taken: 0.026 seconds, Fetched 1 row(s)
> spark-sql> SELECT mask(NULL);
> NULL
> Time taken: 0.042 seconds, Fetched 1 row(s)
> spark-sql> 
> {noformat}
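The essence of the bug above is that the generated guard is the constant `false`, so the null branch is never taken and the null-intolerant writer receives a null. A hedged plain-Java sketch of the buggy versus fixed guard pattern — `transformInput`, `write`, and the class name are stand-ins, not the real codegen:

```java
// Hedged sketch of the null-guard pattern in the generated code above.
// All names are stand-ins for Mask.transformInput / UnsafeWriter.write.
class MaskNullGuardSketch {
    // Stand-in for Mask.transformInput: returns null on null input.
    static String transformInput(String s) {
        return s == null ? null : s.replaceAll("[A-Z]", "X");
    }

    // Stand-in for UnsafeWriter.write, which does not accept null.
    static String write(String value) {
        if (value == null) throw new NullPointerException();
        return value;
    }

    // Buggy pattern: `if (false)` never takes the null branch, so a null
    // result reaches write() and throws.
    static String buggy(String input) {
        String value = transformInput(input);
        if (false) { return "NULL"; } else { return write(value); }
    }

    // Fixed pattern: derive the guard from the result's actual nullability.
    static String fixed(String input) {
        String value = transformInput(input);
        boolean isNull = (value == null);
        if (isNull) { return "NULL"; } else { return write(value); }
    }
}
```

The fix amounts to replacing the constant `false` guard with a nullability check on the transform's result before the write.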






[jira] [Assigned] (SPARK-42384) Mask function's generated code does not handle null input

2023-02-15 Thread Gengliang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang reassigned SPARK-42384:
--

Assignee: Bruce Robbins

> Mask function's generated code does not handle null input
> -
>
> Key: SPARK-42384
> URL: https://issues.apache.org/jira/browse/SPARK-42384
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0, 3.5.0
>Reporter: Bruce Robbins
>Assignee: Bruce Robbins
>Priority: Major
>
> Example:
> {noformat}
> create or replace temp view v1 as
> select * from values
> (null),
> ('AbCD123-@$#')
> as data(col1);
> cache table v1;
> select mask(col1) from v1;
> {noformat}
> This query results in a {{NullPointerException}}:
> {noformat}
> 23/02/07 16:36:06 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 3)
> java.lang.NullPointerException
>   at 
> org.apache.spark.sql.catalyst.expressions.codegen.UnsafeWriter.write(UnsafeWriter.java:110)
>   at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
>  Source)
>   at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>   at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
> {noformat}
> The generated code calls {{UnsafeWriter.write(0, value_0)}} regardless of 
> whether {{Mask.transformInput}} returns null or not. The 
> {{UnsafeWriter.write}} method for {{UTF8String}} does not expect a null 
> pointer.
> {noformat}
> /* 031 */ boolean isNull_1 = i.isNullAt(0);
> /* 032 */ UTF8String value_1 = isNull_1 ?
> /* 033 */ null : (i.getUTF8String(0));
> /* 034 */
> /* 035 */
> /* 036 */
> /* 037 */
> /* 038 */ UTF8String value_0 = null;
> /* 039 */ value_0 = 
> org.apache.spark.sql.catalyst.expressions.Mask.transformInput(value_1, 
> ((UTF8String) references[0] /* literal */), ((UTF8String) references[1] /* 
> literal */), ((UTF8String) references[2] /* literal */), ((UTF8String) 
> references[3] /* literal */));;
> /* 040 */ if (false) {
> /* 041 */   mutableStateArray_0[0].setNullAt(0);
> /* 042 */ } else {
> /* 043 */   mutableStateArray_0[0].write(0, value_0);
> /* 044 */ }
> /* 045 */ return (mutableStateArray_0[0].getRow());
> /* 046 */   }
> {noformat}
> The bug is not exercised by a literal null input value, since there appears 
> to be some optimization that simply replaces the entire function call with a 
> null literal:
> {noformat}
> spark-sql> explain SELECT mask(NULL);
> == Physical Plan ==
> *(1) Project [null AS mask(NULL, X, x, n, NULL)#47]
> +- *(1) Scan OneRowRelation[]
> Time taken: 0.026 seconds, Fetched 1 row(s)
> spark-sql> SELECT mask(NULL);
> NULL
> Time taken: 0.042 seconds, Fetched 1 row(s)
> spark-sql> 
> {noformat}






[jira] [Commented] (SPARK-41591) Implement functionality for training a PyTorch file locally

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689407#comment-17689407
 ] 

Apache Spark commented on SPARK-41591:
--

User 'rithwik-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/40045

> Implement functionality for training a PyTorch file locally
> ---
>
> Key: SPARK-41591
> URL: https://issues.apache.org/jira/browse/SPARK-41591
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Assignee: Rithwik Ediga Lakhamsani
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Commented] (SPARK-41591) Implement functionality for training a PyTorch file locally

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-41591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689405#comment-17689405
 ] 

Apache Spark commented on SPARK-41591:
--

User 'rithwik-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/40045

> Implement functionality for training a PyTorch file locally
> ---
>
> Key: SPARK-41591
> URL: https://issues.apache.org/jira/browse/SPARK-41591
> Project: Spark
>  Issue Type: Sub-task
>  Components: ML
>Affects Versions: 3.4.0
>Reporter: Rithwik Ediga Lakhamsani
>Assignee: Rithwik Ediga Lakhamsani
>Priority: Major
> Fix For: 3.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-42457) Scala Client Session Read API

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42457:


Assignee: Apache Spark

> Scala Client Session Read API
> -
>
> Key: SPARK-42457
> URL: https://issues.apache.org/jira/browse/SPARK-42457
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Zhen Li
>Assignee: Apache Spark
>Priority: Major
>
> Add SparkSession#read impl to be able to read data.






[jira] [Assigned] (SPARK-42457) Scala Client Session Read API

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42457:


Assignee: (was: Apache Spark)

> Scala Client Session Read API
> -
>
> Key: SPARK-42457
> URL: https://issues.apache.org/jira/browse/SPARK-42457
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Zhen Li
>Priority: Major
>
> Add SparkSession#read impl to be able to read data.






[jira] [Commented] (SPARK-42457) Scala Client Session Read API

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689370#comment-17689370
 ] 

Apache Spark commented on SPARK-42457:
--

User 'zhenlineo' has created a pull request for this issue:
https://github.com/apache/spark/pull/40025

> Scala Client Session Read API
> -
>
> Key: SPARK-42457
> URL: https://issues.apache.org/jira/browse/SPARK-42457
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Zhen Li
>Priority: Major
>
> Add SparkSession#read impl to be able to read data.






[jira] [Created] (SPARK-42457) Scala Client Session Read API

2023-02-15 Thread Zhen Li (Jira)
Zhen Li created SPARK-42457:
---

 Summary: Scala Client Session Read API
 Key: SPARK-42457
 URL: https://issues.apache.org/jira/browse/SPARK-42457
 Project: Spark
  Issue Type: Improvement
  Components: Connect
Affects Versions: 3.4.0
Reporter: Zhen Li


Add SparkSession#read impl to be able to read data.
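For readers unfamiliar with the shape of this API: a DataFrameReader-style read entry point is a small fluent builder hanging off the session. A hedged, self-contained sketch of that shape — a hypothetical minimal builder, not the actual Connect client code; only `format`/`option`/`load` mirror the real API surface:

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of the fluent shape of a SparkSession#read-style API:
// session.read().format(...).option(...).load(path). The class and its
// return value are hypothetical stand-ins for the real reader.
class ReaderSketch {
    private String format = "parquet";
    private final Map<String, String> options = new HashMap<>();

    ReaderSketch format(String f) { this.format = f; return this; }

    ReaderSketch option(String key, String value) {
        options.put(key, value);
        return this;
    }

    // Stands in for building the read plan; returns a description here.
    String load(String path) {
        return format + ":" + path + ":" + options;
    }
}
```

In the real client, `load` would produce a relation in the Connect plan rather than a string; the builder pattern is the point of the sketch.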






[jira] [Assigned] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42456:


Assignee: (was: Apache Spark)

> Consolidating the PySpark version upgrade note pages into a single page to 
> make it easier to read
> -
>
> Key: SPARK-42456
> URL: https://issues.apache.org/jira/browse/SPARK-42456
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark
>Affects Versions: 3.4.0
>Reporter: Allan Folting
>Priority: Major
>
> Creating a new PySpark migration guide sub page and consolidating the 
> existing 9 separate pages into this one new page. This makes it easier to 
> take a look across multiple version upgrades by simply scrolling on the page.
> Also, this is similar to the Spark Core Migration Guide page here:
> [https://spark.apache.org/docs/latest/core-migration-guide.html]
>  
> Updating the existing main Migration Guide page to point to this new sub page 
> and also making some minor language updates to help readers.






[jira] [Assigned] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42456:


Assignee: Apache Spark

> Consolidating the PySpark version upgrade note pages into a single page to 
> make it easier to read
> -
>
> Key: SPARK-42456
> URL: https://issues.apache.org/jira/browse/SPARK-42456
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark
>Affects Versions: 3.4.0
>Reporter: Allan Folting
>Assignee: Apache Spark
>Priority: Major
>
> Creating a new PySpark migration guide sub page and consolidating the 
> existing 9 separate pages into this one new page. This makes it easier to 
> take a look across multiple version upgrades by simply scrolling on the page.
> Also, this is similar to the Spark Core Migration Guide page here:
> [https://spark.apache.org/docs/latest/core-migration-guide.html]
>  
> Updating the existing main Migration Guide page to point to this new sub page 
> and also making some minor language updates to help readers.






[jira] [Commented] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689365#comment-17689365
 ] 

Apache Spark commented on SPARK-42456:
--

User 'allanf-db' has created a pull request for this issue:
https://github.com/apache/spark/pull/40044

> Consolidating the PySpark version upgrade note pages into a single page to 
> make it easier to read
> -
>
> Key: SPARK-42456
> URL: https://issues.apache.org/jira/browse/SPARK-42456
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark
>Affects Versions: 3.4.0
>Reporter: Allan Folting
>Priority: Major
>
> Creating a new PySpark migration guide sub page and consolidating the 
> existing 9 separate pages into this one new page. This makes it easier to 
> take a look across multiple version upgrades by simply scrolling on the page.
> Also, this is similar to the Spark Core Migration Guide page here:
> [https://spark.apache.org/docs/latest/core-migration-guide.html]
>  
> Updating the existing main Migration Guide page to point to this new sub page 
> and also making some minor language updates to help readers.






[jira] [Updated] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read

2023-02-15 Thread Allan Folting (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allan Folting updated SPARK-42456:
--
Description: 
Creating a new PySpark migration guide sub page and consolidating the existing 
9 separate pages into this one new page. This makes it easier to take a look 
across multiple version upgrades by simply scrolling on the page.

Also, this is similar to the Spark Core Migration Guide page here:

[https://spark.apache.org/docs/latest/core-migration-guide.html]

 

Updating the existing main Migration Guide page to point to this new sub page 
and also making some minor language updates to help readers.

  was:
Creating a new PySpark migration guide and consolidating the existing 9 
separate pages into this one new page. This makes it easier to take a look 
across multiple version upgrades by simply scrolling on the page.

Also, this is similar to the Spark Core Migration Guide page here:

[https://spark.apache.org/docs/latest/core-migration-guide.html]

 

Updating the existing main Migration Guide page to point to this new sub page 
and also making some minor language updates to help readers.


> Consolidating the PySpark version upgrade note pages into a single page to 
> make it easier to read
> -
>
> Key: SPARK-42456
> URL: https://issues.apache.org/jira/browse/SPARK-42456
> Project: Spark
>  Issue Type: Documentation
>  Components: PySpark
>Affects Versions: 3.4.0
>Reporter: Allan Folting
>Priority: Major
>
> Creating a new PySpark migration guide sub page and consolidating the 
> existing 9 separate pages into this one new page. This makes it easier to 
> take a look across multiple version upgrades by simply scrolling on the page.
> Also, this is similar to the Spark Core Migration Guide page here:
> [https://spark.apache.org/docs/latest/core-migration-guide.html]
>  
> Updating the existing main Migration Guide page to point to this new sub page 
> and also making some minor language updates to help readers.






[jira] [Created] (SPARK-42456) Consolidating the PySpark version upgrade note pages into a single page to make it easier to read

2023-02-15 Thread Allan Folting (Jira)
Allan Folting created SPARK-42456:
-

 Summary: Consolidating the PySpark version upgrade note pages into 
a single page to make it easier to read
 Key: SPARK-42456
 URL: https://issues.apache.org/jira/browse/SPARK-42456
 Project: Spark
  Issue Type: Documentation
  Components: PySpark
Affects Versions: 3.4.0
Reporter: Allan Folting


Creating a new PySpark migration guide and consolidating the existing 9 
separate pages into this one new page. This makes it easier to take a look 
across multiple version upgrades by simply scrolling on the page.

Also, this is similar to the Spark Core Migration Guide page here:

[https://spark.apache.org/docs/latest/core-migration-guide.html]

 

Updating the existing main Migration Guide page to point to this new sub page 
and also making some minor language updates to help readers.






[jira] [Commented] (SPARK-42452) Remove hadoop-2 profile from Apache Spark

2023-02-15 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689354#comment-17689354
 ] 

Dongjoon Hyun commented on SPARK-42452:
---

Not yet~ I simply removed the broken one because no one will take a look.
Before Apache Spark 3.4 release, we cannot change `master` branch dramatically.
We still need to back-port many bug fixes during RC1 ~ RCx, [~LuciferYang].
So, please hold on your passion a little more. ;)

> Remove hadoop-2 profile from Apache Spark
> -
>
> Key: SPARK-42452
> URL: https://issues.apache.org/jira/browse/SPARK-42452
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Major
>
> SPARK-40651 Drop Hadoop 2 binary distribution from release process and 
> SPARK-42447 Remove Hadoop 2 GitHub Action job
>   






[jira] [Commented] (SPARK-39904) Rename inferDate to preferDate and fix an issue when inferring schema

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-39904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689349#comment-17689349
 ] 

Apache Spark commented on SPARK-39904:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40043

> Rename inferDate to preferDate and fix an issue when inferring schema
> -
>
> Key: SPARK-39904
> URL: https://issues.apache.org/jira/browse/SPARK-39904
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Ivan Sadikov
>Assignee: Ivan Sadikov
>Priority: Major
> Fix For: 3.4.0
>
>
> Follow-up for https://issues.apache.org/jira/browse/SPARK-39469.






[jira] [Assigned] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42455:


Assignee: Gengliang Wang  (was: Apache Spark)

> Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
> --
>
> Key: SPARK-42455
> URL: https://issues.apache.org/jira/browse/SPARK-42455
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Assigned] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42455:


Assignee: Apache Spark  (was: Gengliang Wang)

> Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
> --
>
> Key: SPARK-42455
> URL: https://issues.apache.org/jira/browse/SPARK-42455
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Commented] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689343#comment-17689343
 ] 

Apache Spark commented on SPARK-42455:
--

User 'gengliangwang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40042

> Rename JDBC option inferTimestampNTZType as preferTimestampNTZ
> --
>
> Key: SPARK-42455
> URL: https://issues.apache.org/jira/browse/SPARK-42455
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Gengliang Wang
>Assignee: Gengliang Wang
>Priority: Major
>







[jira] [Created] (SPARK-42455) Rename JDBC option inferTimestampNTZType as preferTimestampNTZ

2023-02-15 Thread Gengliang Wang (Jira)
Gengliang Wang created SPARK-42455:
--

 Summary: Rename JDBC option inferTimestampNTZType as 
preferTimestampNTZ
 Key: SPARK-42455
 URL: https://issues.apache.org/jira/browse/SPARK-42455
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.4.0
Reporter: Gengliang Wang
Assignee: Gengliang Wang









[jira] [Created] (SPARK-42454) SPJ: encapsulate all SPJ related parameters in BatchScanExec

2023-02-15 Thread Chao Sun (Jira)
Chao Sun created SPARK-42454:


 Summary: SPJ: encapsulate all SPJ related parameters in 
BatchScanExec
 Key: SPARK-42454
 URL: https://issues.apache.org/jira/browse/SPARK-42454
 Project: Spark
  Issue Type: Sub-task
  Components: SQL
Affects Versions: 3.3.1
Reporter: Chao Sun


The list of SPJ parameters in {{BatchScanExec}} keeps growing, which is 
annoying since many places pattern-match on {{BatchScanExec}} and each has 
to change accordingly. 

To make this less disruptive, we can introduce a struct that holds all the SPJ 
parameters and use that as a single parameter for {{BatchScanExec}}.
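The proposed refactoring can be sketched as follows. This is a hedged plain-Java record with hypothetical field names; the real change would be a Scala case class inside Spark, and the actual SPJ parameters may differ:

```java
// Hedged sketch: bundle the growing SPJ parameters into one holder so call
// sites that construct or pattern-match a BatchScanExec-like node only touch
// a single field when a parameter is added. All field names are hypothetical.
record SpjParams(
        boolean pushPartValues,      // whether to push down partition values
        boolean partiallyClustered,  // allow partially clustered distribution
        int numOutputPartitions) {   // expected output partitioning

    // Adding a parameter later changes this record and its factory, not
    // every caller's pattern match over the scan node.
    static SpjParams disabled() {
        return new SpjParams(false, false, 1);
    }
}
```

Call sites then destructure one `SpjParams` value instead of tracking an ever-longer constructor signature.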






[jira] [Resolved] (SPARK-40653) Protobuf Support in Structured Streaming

2023-02-15 Thread Raghu Angadi (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-40653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi resolved SPARK-40653.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Protobuf functions have been in use for a couple of months. 

> Protobuf Support in Structured Streaming
> 
>
> Key: SPARK-40653
> URL: https://issues.apache.org/jira/browse/SPARK-40653
> Project: Spark
>  Issue Type: Epic
>  Components: Protobuf, Structured Streaming
>Affects Versions: 3.4.0
>Reporter: Raghu Angadi
>Priority: Major
> Fix For: 3.4.0
>
>
> Add support for Protobuf messages in streaming sources. This would be similar 
> to Avro format support. This includes features like schema-registry, Python 
> support, schema evolution, etc.






[jira] [Assigned] (SPARK-42453) Implement function max in Scala client

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42453:


Assignee: (was: Apache Spark)

> Implement function max in Scala client
> --
>
> Key: SPARK-42453
> URL: https://issues.apache.org/jira/browse/SPARK-42453
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Commented] (SPARK-42453) Implement function max in Scala client

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689307#comment-17689307
 ] 

Apache Spark commented on SPARK-42453:
--

User 'amaliujia' has created a pull request for this issue:
https://github.com/apache/spark/pull/40041

> Implement function max in Scala client
> --
>
> Key: SPARK-42453
> URL: https://issues.apache.org/jira/browse/SPARK-42453
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Priority: Major
>







[jira] [Assigned] (SPARK-42453) Implement function max in Scala client

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42453:


Assignee: Apache Spark

> Implement function max in Scala client
> --
>
> Key: SPARK-42453
> URL: https://issues.apache.org/jira/browse/SPARK-42453
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Rui Wang
>Assignee: Apache Spark
>Priority: Major
>







[jira] [Resolved] (SPARK-42441) Scala Client - Implement Column API

2023-02-15 Thread Jira


 [ 
https://issues.apache.org/jira/browse/SPARK-42441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Herman van Hövell resolved SPARK-42441.
---
Fix Version/s: 3.4.0
 Assignee: Herman van Hövell
   Resolution: Fixed

> Scala Client - Implement Column API
> ---
>
> Key: SPARK-42441
> URL: https://issues.apache.org/jira/browse/SPARK-42441
> Project: Spark
>  Issue Type: Task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Herman van Hövell
>Assignee: Herman van Hövell
>Priority: Major
> Fix For: 3.4.0
>
>







[jira] [Created] (SPARK-42453) Implement function max in Scala client

2023-02-15 Thread Rui Wang (Jira)
Rui Wang created SPARK-42453:


 Summary: Implement function max in Scala client
 Key: SPARK-42453
 URL: https://issues.apache.org/jira/browse/SPARK-42453
 Project: Spark
  Issue Type: Task
  Components: Connect
Affects Versions: 3.4.0
Reporter: Rui Wang









[jira] [Updated] (SPARK-42445) Fix SparkR install.spark function

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-42445:
--
Fix Version/s: 3.3.3

> Fix SparkR install.spark function
> -
>
> Key: SPARK-42445
> URL: https://issues.apache.org/jira/browse/SPARK-42445
> Project: Spark
>  Issue Type: Bug
>  Components: R
>Affects Versions: 3.3.0, 3.3.1, 3.3.2, 3.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.4.0, 3.3.3
>
>
> {code}
> $ R
> R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
> Copyright (C) 2022 The R Foundation for Statistical Computing
> Platform: aarch64-apple-darwin20 (64-bit)
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>   Natural language support but running in an English locale
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
> > library(SparkR)
> Attaching package: ‘SparkR’
> The following objects are masked from ‘package:stats’:
> cov, filter, lag, na.omit, predict, sd, var, window
> The following objects are masked from ‘package:base’:
> as.data.frame, colnames, colnames<-, drop, endsWith, intersect,
> rank, rbind, sample, startsWith, subset, summary, transform, union
> > install.spark()
> Spark not found in the cache directory. Installation will start.
> MirrorUrl not provided.
> Looking for preferred site from apache website...
> Preferred mirror site found: https://dlcdn.apache.org/spark
> Downloading spark-3.3.2 for Hadoop 2.7 from:
> - https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz
> trying URL 
> 'https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz'
> simpleWarning in download.file(remotePath, localPath): downloaded length 0 != 
> reported length 196
> {code}






[jira] [Resolved] (SPARK-42445) Fix SparkR install.spark function

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-42445.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Issue resolved by pull request 40031
[https://github.com/apache/spark/pull/40031]

> Fix SparkR install.spark function
> -
>
> Key: SPARK-42445
> URL: https://issues.apache.org/jira/browse/SPARK-42445
> Project: Spark
>  Issue Type: Bug
>  Components: R
>Affects Versions: 3.3.0, 3.3.1, 3.3.2, 3.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
> Fix For: 3.4.0
>
>
> {code}
> $ R
> R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
> Copyright (C) 2022 The R Foundation for Statistical Computing
> Platform: aarch64-apple-darwin20 (64-bit)
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>   Natural language support but running in an English locale
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
> > library(SparkR)
> Attaching package: ‘SparkR’
> The following objects are masked from ‘package:stats’:
> cov, filter, lag, na.omit, predict, sd, var, window
> The following objects are masked from ‘package:base’:
> as.data.frame, colnames, colnames<-, drop, endsWith, intersect,
> rank, rbind, sample, startsWith, subset, summary, transform, union
> > install.spark()
> Spark not found in the cache directory. Installation will start.
> MirrorUrl not provided.
> Looking for preferred site from apache website...
> Preferred mirror site found: https://dlcdn.apache.org/spark
> Downloading spark-3.3.2 for Hadoop 2.7 from:
> - https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz
> trying URL 
> 'https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz'
> simpleWarning in download.file(remotePath, localPath): downloaded length 0 != 
> reported length 196
> {code}
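The transcript above shows the failure mode: the Hadoop 2.7 tgz no longer exists at the mirror, so download.file fetched a 196-byte error body and installation proceeded anyway. A minimal sketch of a defensive size check (illustrative only, not SparkR's actual fix; the 1 MB threshold is an assumption, any real Spark tgz is far larger):

```python
import os

# Assumption: a genuine Spark binary tgz is hundreds of MB; this threshold
# is illustrative, not the value SparkR uses.
MIN_PLAUSIBLE_ARCHIVE_BYTES = 1_000_000

def looks_like_valid_archive(path: str) -> bool:
    """Reject missing, zero-length, or tiny downloads, e.g. a 196-byte mirror error page."""
    return os.path.exists(path) and os.path.getsize(path) >= MIN_PLAUSIBLE_ARCHIVE_BYTES
```

A check like this, run before untarring, turns the silent warning above into a hard failure.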






[jira] [Assigned] (SPARK-42445) Fix SparkR install.spark function

2023-02-15 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun reassigned SPARK-42445:
-

Assignee: Dongjoon Hyun

> Fix SparkR install.spark function
> -
>
> Key: SPARK-42445
> URL: https://issues.apache.org/jira/browse/SPARK-42445
> Project: Spark
>  Issue Type: Bug
>  Components: R
>Affects Versions: 3.3.0, 3.3.1, 3.3.2, 3.4.0
>Reporter: Dongjoon Hyun
>Assignee: Dongjoon Hyun
>Priority: Major
>
> {code}
> $ R
> R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
> Copyright (C) 2022 The R Foundation for Statistical Computing
> Platform: aarch64-apple-darwin20 (64-bit)
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>   Natural language support but running in an English locale
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
> > library(SparkR)
> Attaching package: ‘SparkR’
> The following objects are masked from ‘package:stats’:
> cov, filter, lag, na.omit, predict, sd, var, window
> The following objects are masked from ‘package:base’:
> as.data.frame, colnames, colnames<-, drop, endsWith, intersect,
> rank, rbind, sample, startsWith, subset, summary, transform, union
> > install.spark()
> Spark not found in the cache directory. Installation will start.
> MirrorUrl not provided.
> Looking for preferred site from apache website...
> Preferred mirror site found: https://dlcdn.apache.org/spark
> Downloading spark-3.3.2 for Hadoop 2.7 from:
> - https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz
> trying URL 
> 'https://dlcdn.apache.org/spark/spark-3.3.2/spark-3.3.2-bin-hadoop2.7.tgz'
> simpleWarning in download.file(remotePath, localPath): downloaded length 0 != 
> reported length 196
> {code}






[jira] [Assigned] (SPARK-42399) CONV() silently overflows returning wrong results

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42399:


Assignee: Apache Spark

> CONV() silently overflows returning wrong results
> -
>
> Key: SPARK-42399
> URL: https://issues.apache.org/jira/browse/SPARK-42399
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Serge Rielau
>Assignee: Apache Spark
>Priority: Critical
>
> spark-sql> SELECT 
> CONV(SUBSTRING('0x',
>  3), 16, 10);
> 18446744073709551615
> Time taken: 2.114 seconds, Fetched 1 row(s)
> spark-sql> set spark.sql.ansi.enabled = true;
> spark.sql.ansi.enabled true
> Time taken: 0.068 seconds, Fetched 1 row(s)
> spark-sql> SELECT 
> CONV(SUBSTRING('0x',
>  3), 16, 10);
> 18446744073709551615
> Time taken: 0.05 seconds, Fetched 1 row(s)
> In ANSI mode we should certainly raise an error.
> In non-ANSI mode either an error or a NULL may be acceptable.
> Alternatively, we could consider supporting arbitrary 
> domains, since the result is a STRING anyway.
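The reported value 18446744073709551615 is 2**64 - 1, i.e. CONV silently clamped at the unsigned 64-bit ceiling. An illustrative sketch (not Spark's implementation) of the overflow-aware behavior the report asks for in ANSI mode:

```python
# Overflow-aware CONV sketch. Spark's CONV silently returns
# 18446744073709551615 (2**64 - 1) on overflow; this version raises instead.
UINT64_MAX = 0xFFFFFFFFFFFFFFFF
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def conv_checked(num: str, from_base: int, to_base: int) -> str:
    value = int(num, from_base)  # Python ints are arbitrary precision, so no silent wrap
    if value > UINT64_MAX:
        raise OverflowError(f"CONV overflow: {num!r} (base {from_base}) exceeds 64 bits")
    if value == 0:
        return "0"
    out = []
    while value:
        value, r = divmod(value, to_base)
        out.append(DIGITS[r])
    return "".join(reversed(out))
```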






[jira] [Commented] (SPARK-42399) CONV() silently overflows returning wrong results

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689226#comment-17689226
 ] 

Apache Spark commented on SPARK-42399:
--

User 'NarekDW' has created a pull request for this issue:
https://github.com/apache/spark/pull/40040

> CONV() silently overflows returning wrong results
> -
>
> Key: SPARK-42399
> URL: https://issues.apache.org/jira/browse/SPARK-42399
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Serge Rielau
>Priority: Critical
>
> spark-sql> SELECT 
> CONV(SUBSTRING('0x',
>  3), 16, 10);
> 18446744073709551615
> Time taken: 2.114 seconds, Fetched 1 row(s)
> spark-sql> set spark.sql.ansi.enabled = true;
> spark.sql.ansi.enabled true
> Time taken: 0.068 seconds, Fetched 1 row(s)
> spark-sql> SELECT 
> CONV(SUBSTRING('0x',
>  3), 16, 10);
> 18446744073709551615
> Time taken: 0.05 seconds, Fetched 1 row(s)
> In ANSI mode we should certainly raise an error.
> In non-ANSI mode either an error or a NULL may be acceptable.
> Alternatively, we could consider supporting arbitrary 
> domains, since the result is a STRING anyway.






[jira] [Assigned] (SPARK-42399) CONV() silently overflows returning wrong results

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42399:


Assignee: (was: Apache Spark)

> CONV() silently overflows returning wrong results
> -
>
> Key: SPARK-42399
> URL: https://issues.apache.org/jira/browse/SPARK-42399
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.4.0
>Reporter: Serge Rielau
>Priority: Critical
>
> spark-sql> SELECT 
> CONV(SUBSTRING('0x',
>  3), 16, 10);
> 18446744073709551615
> Time taken: 2.114 seconds, Fetched 1 row(s)
> spark-sql> set spark.sql.ansi.enabled = true;
> spark.sql.ansi.enabled true
> Time taken: 0.068 seconds, Fetched 1 row(s)
> spark-sql> SELECT 
> CONV(SUBSTRING('0x',
>  3), 16, 10);
> 18446744073709551615
> Time taken: 0.05 seconds, Fetched 1 row(s)
> In ANSI mode we should certainly raise an error.
> In non-ANSI mode either an error or a NULL may be acceptable.
> Alternatively, we could consider supporting arbitrary 
> domains, since the result is a STRING anyway.






[jira] [Commented] (SPARK-42452) Remove hadoop-2 profile from Apache Spark

2023-02-15 Thread Yang Jie (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689222#comment-17689222
 ] 

Yang Jie commented on SPARK-42452:
--

Is it time to clean up the hadoop-2 profile? [~dongjoon] [~gurwls223]

 

> Remove hadoop-2 profile from Apache Spark
> -
>
> Key: SPARK-42452
> URL: https://issues.apache.org/jira/browse/SPARK-42452
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Major
>
> SPARK-40651 Drop Hadoop2 binary distribution from release process and 
> SPARK-42447 Remove Hadoop 2 GitHub Action job
>   






[jira] [Updated] (SPARK-42452) Remove hadoop-2 profile from Apache Spark

2023-02-15 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-42452:
-
Summary: Remove hadoop-2 profile from Apache Spark  (was: Remove hadoop-2 
profile from Spark)

> Remove hadoop-2 profile from Apache Spark
> -
>
> Key: SPARK-42452
> URL: https://issues.apache.org/jira/browse/SPARK-42452
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Major
>
> SPARK-40651 Drop Hadoop2 binary distribution from release process and 
> SPARK-42447 Remove Hadoop 2 GitHub Action job
>   






[jira] [Updated] (SPARK-42452) Remove hadoop-2 profile from Spark

2023-02-15 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-42452:
-
Description: 
SPARK-40651 Drop Hadoop2 binary distribution from release process and 
SPARK-42447 Remove Hadoop 2 GitHub Action job

  

> Remove hadoop-2 profile from Spark
> --
>
> Key: SPARK-42452
> URL: https://issues.apache.org/jira/browse/SPARK-42452
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Major
>
> SPARK-40651 Drop Hadoop2 binary distribution from release process and 
> SPARK-42447 Remove Hadoop 2 GitHub Action job
>   






[jira] [Created] (SPARK-42452) Remove hadoop-2 profile from Spark

2023-02-15 Thread Yang Jie (Jira)
Yang Jie created SPARK-42452:


 Summary: Remove hadoop-2 profile from Spark
 Key: SPARK-42452
 URL: https://issues.apache.org/jira/browse/SPARK-42452
 Project: Spark
  Issue Type: Improvement
  Components: Build
Affects Versions: 3.5.0
Reporter: Yang Jie









[jira] [Commented] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689215#comment-17689215
 ] 

Apache Spark commented on SPARK-42451:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40039

> Remove 3.1 and Java 17 check from  filter condition of `testingVersions` in 
> `HiveExternalCatalogVersionsSuite`
> --
>
> Key: SPARK-42451
> URL: https://issues.apache.org/jira/browse/SPARK-42451
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Minor
>
> Spark 3.1 is already EOL and has been deleted from 
> [https://dist.apache.org/repos/dist/release/spark], so we can simplify the 
> filter conditions of `testingVersions`; all remaining versions already support Java 17.
>  






[jira] [Assigned] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42451:


Assignee: (was: Apache Spark)

> Remove 3.1 and Java 17 check from  filter condition of `testingVersions` in 
> `HiveExternalCatalogVersionsSuite`
> --
>
> Key: SPARK-42451
> URL: https://issues.apache.org/jira/browse/SPARK-42451
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Minor
>
> Spark 3.1 is already EOL and has been deleted from 
> [https://dist.apache.org/repos/dist/release/spark], so we can simplify the 
> filter conditions of `testingVersions`; all remaining versions already support Java 17.
>  






[jira] [Commented] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`

2023-02-15 Thread Apache Spark (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17689214#comment-17689214
 ] 

Apache Spark commented on SPARK-42451:
--

User 'LuciferYang' has created a pull request for this issue:
https://github.com/apache/spark/pull/40039

> Remove 3.1 and Java 17 check from  filter condition of `testingVersions` in 
> `HiveExternalCatalogVersionsSuite`
> --
>
> Key: SPARK-42451
> URL: https://issues.apache.org/jira/browse/SPARK-42451
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Minor
>
> Spark 3.1 is already EOL and has been deleted from 
> [https://dist.apache.org/repos/dist/release/spark], so we can simplify the 
> filter conditions of `testingVersions`; all remaining versions already support Java 17.
>  






[jira] [Assigned] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`

2023-02-15 Thread Apache Spark (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-42451:


Assignee: Apache Spark

> Remove 3.1 and Java 17 check from  filter condition of `testingVersions` in 
> `HiveExternalCatalogVersionsSuite`
> --
>
> Key: SPARK-42451
> URL: https://issues.apache.org/jira/browse/SPARK-42451
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Assignee: Apache Spark
>Priority: Minor
>
> Spark 3.1 is already EOL and has been deleted from 
> [https://dist.apache.org/repos/dist/release/spark], so we can simplify the 
> filter conditions of `testingVersions`; all remaining versions already support Java 17.
>  






[jira] [Updated] (SPARK-42451) Remove 3.1 and Java 17 check from filter condition of `testingVersions` in `HiveExternalCatalogVersionsSuite`

2023-02-15 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-42451:
-
Summary: Remove 3.1 and Java 17 check from  filter condition of 
`testingVersions` in `HiveExternalCatalogVersionsSuite`  (was: Remove 3.1 and 
Java 17 check from `testingVersions` in `HiveExternalCatalogVersionsSuite`)

> Remove 3.1 and Java 17 check from  filter condition of `testingVersions` in 
> `HiveExternalCatalogVersionsSuite`
> --
>
> Key: SPARK-42451
> URL: https://issues.apache.org/jira/browse/SPARK-42451
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Minor
>
> Spark 3.1 is already EOL and has been deleted from 
> [https://dist.apache.org/repos/dist/release/spark], so we can simplify the 
> filter conditions of `testingVersions`; all remaining versions already support Java 17.
>  






[jira] [Updated] (SPARK-42451) Remove 3.1 and Java 17 check from `testingVersions` in `HiveExternalCatalogVersionsSuite`

2023-02-15 Thread Yang Jie (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Jie updated SPARK-42451:
-
Summary: Remove 3.1 and Java 17 check from `testingVersions` in 
`HiveExternalCatalogVersionsSuite`  (was: Remove 3.1 and Java 17 condition 
check from `testingVersions` in `HiveExternalCatalogVersionsSuite`)

> Remove 3.1 and Java 17 check from `testingVersions` in 
> `HiveExternalCatalogVersionsSuite`
> -
>
> Key: SPARK-42451
> URL: https://issues.apache.org/jira/browse/SPARK-42451
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL, Tests
>Affects Versions: 3.5.0
>Reporter: Yang Jie
>Priority: Minor
>
> Spark 3.1 is already EOL and has been deleted from 
> [https://dist.apache.org/repos/dist/release/spark], so we can simplify the 
> filter conditions of `testingVersions`; all remaining versions already support Java 17.
>  






[jira] [Created] (SPARK-42451) Remove 3.1 and Java 17 condition check from `testingVersions` in `HiveExternalCatalogVersionsSuite`

2023-02-15 Thread Yang Jie (Jira)
Yang Jie created SPARK-42451:


 Summary: Remove 3.1 and Java 17 condition check from 
`testingVersions` in `HiveExternalCatalogVersionsSuite`
 Key: SPARK-42451
 URL: https://issues.apache.org/jira/browse/SPARK-42451
 Project: Spark
  Issue Type: Improvement
  Components: SQL, Tests
Affects Versions: 3.5.0
Reporter: Yang Jie


Spark 3.1 is already EOL and has been deleted from 
[https://dist.apache.org/repos/dist/release/spark], so we can simplify the 
filter conditions of `testingVersions`; all remaining versions already support Java 17.

 






[jira] [Updated] (SPARK-42449) Fix `native-image.propertie` in Scala Client

2023-02-15 Thread Zhen Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhen Li updated SPARK-42449:

Description: 
The content of the `native-image.properties` file is not correct. This file is used 
to create a native image using GraalVM; see more info: 
https://docs.oracle.com/en/graalvm/enterprise/20/docs/reference-manual/native-image/BuildConfiguration/
https://www.graalvm.org/22.1/reference-manual/native-image/BuildConfiguration/

e.g.

The content in `META-INF/native-image/io.netty` should also be relocated, just as 
in `grpc-netty-shaded`.

Now, the content of 
`META-INF/native-image/io.netty/netty-codec-http2/native-image.properties` is

```
Args = --initialize-at-build-time=io.netty \
   
--initialize-at-run-time=io.netty.handler.codec.http2.Http2CodecUtil,io.netty.handler.codec.http2.Http2ClientUpgradeCodec,io.netty.handler.codec.http2.Http2ConnectionHandler,io.netty.handler.codec.http2.DefaultHttp2FrameWriter
```

but it should look like
```
Args = --initialize-at-build-time=org.sparkproject.connect.client.io.netty \
   
--initialize-at-run-time=org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2CodecUtil,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ClientUpgradeCodec,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ConnectionHandler,org.sparkproject.connect.client.io.netty.handler.codec.http2.DefaultHttp2FrameWriter
   
```
Other transformers may need to be added.

See more info in this discussion thread 
https://github.com/apache/spark/pull/39866#discussion_r1098833915



  was:
The content of the `native-image.properties` file is not correct. This file is used 
by GraalVM; see 
https://docs.oracle.com/en/graalvm/enterprise/20/docs/reference-manual/native-image/BuildConfiguration/.

e.g.

The content in `META-INF/native-image/io.netty` should also be relocated, just as 
in `grpc-netty-shaded`.

Now, the content of 
`META-INF/native-image/io.netty/netty-codec-http2/native-image.properties` is

```
Args = --initialize-at-build-time=io.netty \
   
--initialize-at-run-time=io.netty.handler.codec.http2.Http2CodecUtil,io.netty.handler.codec.http2.Http2ClientUpgradeCodec,io.netty.handler.codec.http2.Http2ConnectionHandler,io.netty.handler.codec.http2.DefaultHttp2FrameWriter
```

but it should look like
```
Args = --initialize-at-build-time=org.sparkproject.connect.client.io.netty \
   
--initialize-at-run-time=org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2CodecUtil,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ClientUpgradeCodec,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ConnectionHandler,org.sparkproject.connect.client.io.netty.handler.codec.http2.DefaultHttp2FrameWriter
   
```
Other transformers may need to be added.

See more info in this discussion thread 
https://github.com/apache/spark/pull/39866#discussion_r1098833915




> Fix `native-image.propertie` in Scala Client
> 
>
> Key: SPARK-42449
> URL: https://issues.apache.org/jira/browse/SPARK-42449
> Project: Spark
>  Issue Type: Improvement
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Zhen Li
>Priority: Minor
>
> The content of the `native-image.properties` file is not correct. This file is 
> used to create a native image using GraalVM; see more info: 
> https://docs.oracle.com/en/graalvm/enterprise/20/docs/reference-manual/native-image/BuildConfiguration/
> https://www.graalvm.org/22.1/reference-manual/native-image/BuildConfiguration/
> e.g.
> The content in `META-INF/native-image/io.netty` should also be relocated, just 
> as in `grpc-netty-shaded`.
> Now, the content of 
> `META-INF/native-image/io.netty/netty-codec-http2/native-image.properties` is
> ```
> Args = --initialize-at-build-time=io.netty \
>
> --initialize-at-run-time=io.netty.handler.codec.http2.Http2CodecUtil,io.netty.handler.codec.http2.Http2ClientUpgradeCodec,io.netty.handler.codec.http2.Http2ConnectionHandler,io.netty.handler.codec.http2.DefaultHttp2FrameWriter
> ```
> but it should look like
> ```
> Args = --initialize-at-build-time=org.sparkproject.connect.client.io.netty \
>
> --initialize-at-run-time=org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2CodecUtil,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ClientUpgradeCodec,org.sparkproject.connect.client.io.netty.handler.codec.http2.Http2ConnectionHandler,org.sparkproject.connect.client.io.netty.handler.codec.http2.DefaultHttp2FrameWriter
>
> ```
> Other transformers may need to be added.
> See more info in this discussion thread 
> https://github.com/apache/spark/pull/39866#discussion_r1098833915
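A hedged sketch of the relocation described above: prefixing `io.netty` references in the properties `Args` with the shade target package. The real build would do this with a shade-plugin resource transformer; this string rewrite only illustrates the intended output, and the prefix is taken from the report.

```python
# Prefix assumed from the report: the shade target for the Scala client.
SHADE_PREFIX = "org.sparkproject.connect.client."

def relocate_native_image_args(args: str) -> str:
    """Rewrite io.netty class/package names to their shaded locations."""
    return args.replace("io.netty", SHADE_PREFIX + "io.netty")
```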




[jira] [Updated] (SPARK-42450) dataset.where() omit quotes if where IN clause has more than 10 operands

2023-02-15 Thread Vadim (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-42450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim updated SPARK-42450:
--
Description: 
dataset.where()/filter() omits string quotes if the WHERE ... IN clause has more than 10 
operands. With DataSource V1 it works as expected. 
Attached files: java-code.txt, stacktrace.txt, sql.txt
 - Spark version 3.3.0
 - Scala version 2.12
 - DataSource V2
 - Postgres
 - Postgres JDBC Driver: 42+
 - Java 8

  was:
dataset.where()/filter() omits string quotes if the WHERE ... IN clause has more than 10 
operands. With DataSource V1 it works as expected. 
Attached files: java-code.txt, stacktrace.txt
 - Spark version 3.3.0
 - Scala version 2.12
 - DataSource V2
 - Postgres
 - Postgres JDBC Driver: 42+
 - Java 8

*Expected query:*
SELECT
"flight_id",
"flight_no"
FROM
"bookings"."flights"
WHERE
(
"flight_no"IN (
'55',
'826',
'845',
'799',
'561',
'39',
'385',
'549',
'576',
'15',
'857',
'248',
'324',
'569',
'267'
)
)


*actual query:*
SELECT
"flight_id",
"flight_no",
FROM
"bookings"."flights"
WHERE
(
"flight_no"IN (
55,
826,
845,
799,
561,
39,
385,
549,
576,
15,
857,
248,
324,
569,
267
)
)
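The expected vs. actual queries above differ only in the quoting of the IN operands. An illustrative sketch of the expected rendering (not Spark's DS V2 pushdown code): string operands must stay quoted no matter how many there are.

```python
# Render a SQL IN clause; string operands are always single-quoted,
# regardless of list length (the bug dropped quotes past 10 operands).
def render_in_clause(column: str, values: list) -> str:
    def lit(v):
        if isinstance(v, str):
            return "'" + v.replace("'", "''") + "'"  # quote and escape strings
        return str(v)                                # numbers stay bare
    return '"{}" IN ({})'.format(column, ", ".join(lit(v) for v in values))
```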


> dataset.where() omit quotes if where IN clause has more than 10 operands
> 
>
> Key: SPARK-42450
> URL: https://issues.apache.org/jira/browse/SPARK-42450
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.3.0
>Reporter: Vadim
>Priority: Major
> Fix For: 3.3.2, 3.4.0
>
> Attachments: java-code.txt, sql.txt, stacktrace.txt
>
>
> dataset.where()/filter() omits string quotes if the WHERE ... IN clause has more than 
> 10 operands. With DataSource V1 it works as expected. 
> Attached files: java-code.txt, stacktrace.txt, sql.txt
>  - Spark version 3.3.0
>  - Scala version 2.12
>  - DataSource V2
>  - Postgres
>  - Postgres JDBC Driver: 42+
>  - Java 8





