[spark] branch master updated (f105fe82ab0 -> 880312b5ade)

2023-03-29 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from f105fe82ab0 [SPARK-42971][CORE] Change to print `workdir` if `appDirs` 
is null when worker handle `WorkDirCleanup` event
 add 880312b5ade [SPARK-42907][TESTS][FOLLOWUP] Avro functions doctest 
cleanup

No new revisions were added by this update.

Summary of changes:
 python/pyspark/sql/connect/avro/functions.py | 7 ---
 1 file changed, 7 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



svn commit: r60926 - in /dev/spark/v3.4.0-rc5-docs: ./ _site/ _site/api/ _site/api/R/ _site/api/R/articles/ _site/api/R/deps/ _site/api/R/deps/bootstrap-5.2.2/ _site/api/R/deps/jquery-3.6.0/ _site/api

2023-03-29 Thread xinrong
Author: xinrong
Date: Thu Mar 30 04:53:57 2023
New Revision: 60926

Log:
Apache Spark v3.4.0-rc5 docs


[This commit notification would consist of 2789 parts, 
which exceeds the limit of 50, so it was shortened to this summary.]




[spark] branch branch-3.4 updated: [SPARK-42971][CORE] Change to print `workdir` if `appDirs` is null when worker handle `WorkDirCleanup` event

2023-03-29 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new 6e4fcf71851 [SPARK-42971][CORE] Change to print `workdir` if `appDirs` 
is null when worker handle `WorkDirCleanup` event
6e4fcf71851 is described below

commit 6e4fcf71851be64ac75fc3c50cb178e01d71368f
Author: yangjie01 
AuthorDate: Thu Mar 30 13:50:27 2023 +0900

[SPARK-42971][CORE] Change to print `workdir` if `appDirs` is null when 
worker handle `WorkDirCleanup` event

### What changes were proposed in this pull request?
This PR changes the worker to print `workdir` if `appDirs` is null when it handles the `WorkDirCleanup` event.

### Why are the changes needed?
Printing `appDirs` causes an NPE because `appDirs` is null; from the context, what should be printed is `workdir`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

Closes #40597 from LuciferYang/SPARK-39296-FOLLOW.

Authored-by: yangjie01 
Signed-off-by: Hyukjin Kwon 
(cherry picked from commit f105fe82ab01fe787d00c6ad72f1d6dedb5f3a1b)
Signed-off-by: Hyukjin Kwon 
---
 core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala 
b/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
index 04ef9cc0126..9fb66faef39 100755
--- a/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
@@ -516,8 +516,7 @@ private[deploy] class Worker(
 val cleanupFuture: concurrent.Future[Unit] = concurrent.Future {
   val appDirs = workDir.listFiles()
   if (appDirs == null) {
-throw new IOException(
-  s"ERROR: Failed to list files in ${appDirs.mkString("dirs(", ", 
", ")")}")
+throw new IOException(s"ERROR: Failed to list files in $workDir")
   }
   appDirs.filter { dir =>
 // the directory is used by an application - check that the 
application is not running
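
For context, here is a minimal, self-contained Scala sketch of the pattern the fix adopts (illustrative only, not the Worker code itself): when `File.listFiles()` returns null, report the directory rather than dereferencing the null array.

```scala
import java.io.{File, IOException}

// Minimal sketch of the corrected pattern: `listFiles()` returns null when the
// listing fails, so the error message must reference the directory itself.
def listAppDirs(workDir: File): Seq[File] = {
  val appDirs = workDir.listFiles()
  if (appDirs == null) {
    // The old code called `appDirs.mkString(...)` here, which throws an NPE
    // instead of the intended IOException and hides the real failure.
    throw new IOException(s"ERROR: Failed to list files in $workDir")
  }
  appDirs.toSeq
}
```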





[spark] branch master updated: [SPARK-42971][CORE] Change to print `workdir` if `appDirs` is null when worker handle `WorkDirCleanup` event

2023-03-29 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new f105fe82ab0 [SPARK-42971][CORE] Change to print `workdir` if `appDirs` 
is null when worker handle `WorkDirCleanup` event
f105fe82ab0 is described below

commit f105fe82ab01fe787d00c6ad72f1d6dedb5f3a1b
Author: yangjie01 
AuthorDate: Thu Mar 30 13:50:27 2023 +0900

[SPARK-42971][CORE] Change to print `workdir` if `appDirs` is null when 
worker handle `WorkDirCleanup` event

### What changes were proposed in this pull request?
This PR changes the worker to print `workdir` if `appDirs` is null when it handles the `WorkDirCleanup` event.

### Why are the changes needed?
Printing `appDirs` causes an NPE because `appDirs` is null; from the context, what should be printed is `workdir`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

Closes #40597 from LuciferYang/SPARK-39296-FOLLOW.

Authored-by: yangjie01 
Signed-off-by: Hyukjin Kwon 
---
 core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala 
b/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
index 04ef9cc0126..9fb66faef39 100755
--- a/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
@@ -516,8 +516,7 @@ private[deploy] class Worker(
 val cleanupFuture: concurrent.Future[Unit] = concurrent.Future {
   val appDirs = workDir.listFiles()
   if (appDirs == null) {
-throw new IOException(
-  s"ERROR: Failed to list files in ${appDirs.mkString("dirs(", ", 
", ")")}")
+throw new IOException(s"ERROR: Failed to list files in $workDir")
   }
   appDirs.filter { dir =>
 // the directory is used by an application - check that the 
application is not running





svn commit: r60925 - /dev/spark/v3.4.0-rc5-bin/

2023-03-29 Thread xinrong
Author: xinrong
Date: Thu Mar 30 03:39:03 2023
New Revision: 60925

Log:
Apache Spark v3.4.0-rc5

Added:
dev/spark/v3.4.0-rc5-bin/
dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz   (with props)
dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz.asc
dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz.sha512
dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz   (with props)
dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz.asc
dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz.sha512
dev/spark/v3.4.0-rc5-bin/spark-3.4.0-bin-hadoop3-scala2.13.tgz   (with 
props)
dev/spark/v3.4.0-rc5-bin/spark-3.4.0-bin-hadoop3-scala2.13.tgz.asc
dev/spark/v3.4.0-rc5-bin/spark-3.4.0-bin-hadoop3-scala2.13.tgz.sha512
dev/spark/v3.4.0-rc5-bin/spark-3.4.0-bin-hadoop3.tgz   (with props)
dev/spark/v3.4.0-rc5-bin/spark-3.4.0-bin-hadoop3.tgz.asc
dev/spark/v3.4.0-rc5-bin/spark-3.4.0-bin-hadoop3.tgz.sha512
dev/spark/v3.4.0-rc5-bin/spark-3.4.0-bin-without-hadoop.tgz   (with props)
dev/spark/v3.4.0-rc5-bin/spark-3.4.0-bin-without-hadoop.tgz.asc
dev/spark/v3.4.0-rc5-bin/spark-3.4.0-bin-without-hadoop.tgz.sha512
dev/spark/v3.4.0-rc5-bin/spark-3.4.0.tgz   (with props)
dev/spark/v3.4.0-rc5-bin/spark-3.4.0.tgz.asc
dev/spark/v3.4.0-rc5-bin/spark-3.4.0.tgz.sha512

Added: dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz
==
Binary file - no diff available.

Propchange: dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz
--
svn:mime-type = application/octet-stream

Added: dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz.asc
==
--- dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz.asc (added)
+++ dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz.asc Thu Mar 30 03:39:03 2023
@@ -0,0 +1,17 @@
+-BEGIN PGP SIGNATURE-
+
+iQJHBAABCgAxFiEEzGiz0W/jOnZnBRYLp+V5CMek4bEFAmQlA+wTHHhpbnJvbmdA
+YXBhY2hlLm9yZwAKCRCn5XkIx6Thsdo/D/9CLT5v+RVNTX0mmZq501F205cDUan+
+tiC/G2ddtGfSLcRAWeWqoDFWOkeupwEqtKMoqQGnElXM7qVF2miBfcohBxm3151l
+UBJD6paLgSrI2omxxqBNTB265BbojbmQcZx5UjHzO/opVahllET/7RXI6I8k/gsC
+hpoSJe77SHPXsLQpSFPaxct7Qy6IwwLq8yvVZIFlrYgjqvWBa3zsnqb4T6W859lb
+uiAAWJTJ0xQPF/u9TmXM8a9vFRfo3rXuttW8W7wKlHQjZgDJpNSJyQCaVmWYUssM
+2nzrfiwy7/E5wGzFsdxzO8lOlyeA6Cdmhwo8G5xcZnjNt9032DrAYFdo5rIoim9v
+irsqWyOJ5XclUOWpxKpXdYPcQGpEW74vUBymAW5P6jt0Yi2/3qvZSiwh1qceJ8Fo
+nut0HUWIFkohDoattkCjoA1yconcJd4+FuoDxrCX+QWAlchgR4eijMWfYCyH/7LX
+SucOJOK80psdGnZGuecuRjCzhvnbPjjNjS3dYMrudLlgxHyb2ahjeHXpVyDjI/O6
+AwUmJtUEGHk0Ypa8OHlgzB8UUaZRQDIiwL8j8tlIHYMt+VdQLUtvyK+hqe45It6F
+OAlocOnign7Ej/9EGyJfKXX0gZr6NmkuANWggPRIrIs1NSnqz4bDWQRGwVOkpb7x
+NOdLdMoi6QMC0A==
+=H+Kf
+-END PGP SIGNATURE-

Added: dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz.sha512
==
--- dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz.sha512 (added)
+++ dev/spark/v3.4.0-rc5-bin/SparkR_3.4.0.tar.gz.sha512 Thu Mar 30 03:39:03 2023
@@ -0,0 +1 @@
+c3086edefab6656535e234fd11d0a2a4d4c6ede97b85f94801d06064bd89c6f58196714e335e92ffd2ac83c82714ad8a9a51165621ecff194af290c1eb537ef2
  SparkR_3.4.0.tar.gz

Added: dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz
==
Binary file - no diff available.

Propchange: dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz
--
svn:mime-type = application/octet-stream

Added: dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz.asc
==
--- dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz.asc (added)
+++ dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz.asc Thu Mar 30 03:39:03 2023
@@ -0,0 +1,17 @@
+-BEGIN PGP SIGNATURE-
+
+iQJHBAABCgAxFiEEzGiz0W/jOnZnBRYLp+V5CMek4bEFAmQlA+4THHhpbnJvbmdA
+YXBhY2hlLm9yZwAKCRCn5XkIx6Thsb0pEACXvrvU/8Xh7ns7J8RtV/Wmf4oMu9Mk
+i6G8JwBUTS1kqRe9Xb1g3GJxNil8HTta1yNKgjvkTDc6EXIYrtQD4PpL6cuumckW
+0+itx9dih22OcvfN6sJNizAtRoTcpXx7UHq00dAjzHHbOv0dwGqnjKRU3UUQ/XnY
+RjT3kM4isf95TzAmEFwsXNSzkUY0+EzDgfhnDAwb60nzTzZ2bEiZnLP1JC2iScDI
+jSXMoWtZTaJz51bssKzzXpVmrwBxLDgSPlDM5KVmeD+WQMqS7Hk51bSikSEW1X39
+CO7hEXw+SYLQB5yKaqu03diErTOWmP6aJ8tbHCPWNrs3JMJkm4/Cj6Sc2JOktixO
+Ns8Pc82kpnvG0eWCMXwihZa7pxnq59ByZsxYAfmcIdf4q02VJNetFjplgXAs2jjy
+n9UZ6l8ZrCjUW2/AB3TSSibXLXMvuI6PLSYnKY9IP0t0dqxnBIKkACTx8qBA/o+I
+0n02LBJCD8ZPJvHpI2MGlaFGftbQx4LUXX4CFlAz+RI9iizCbpjrDYFzvXBEY7ri
+46i5uL+sHkP6Uj/8fNJ3QRhggb19i0NajzofSs5vNsVk2qHjHokIjG/kOkpCfBzC
+6rM5zd/OyQNZmbHThlOjAdEvTSgasXb/5uHpwWDHbTlPGJYMZOWzuBdDSfBlHW/t
+56VKCDfYO11shA==
+=a3bs
+-END PGP SIGNATURE-

Added: dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz.sha512
==
--- dev/spark/v3.4.0-rc5-bin/pyspark-3.4.0.tar.gz.sha512 (added)
+++ dev/spark/v

[spark] 01/01: Preparing development version 3.4.1-SNAPSHOT

2023-03-29 Thread xinrong
This is an automated email from the ASF dual-hosted git repository.

xinrong pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 6a6f50444d43af24773ecc158aa127027f088288
Author: Xinrong Meng 
AuthorDate: Thu Mar 30 02:18:32 2023 +

Preparing development version 3.4.1-SNAPSHOT
---
 R/pkg/DESCRIPTION  | 2 +-
 assembly/pom.xml   | 2 +-
 common/kvstore/pom.xml | 2 +-
 common/network-common/pom.xml  | 2 +-
 common/network-shuffle/pom.xml | 2 +-
 common/network-yarn/pom.xml| 2 +-
 common/sketch/pom.xml  | 2 +-
 common/tags/pom.xml| 2 +-
 common/unsafe/pom.xml  | 2 +-
 connector/avro/pom.xml | 2 +-
 connector/connect/client/jvm/pom.xml   | 2 +-
 connector/connect/common/pom.xml   | 2 +-
 connector/connect/server/pom.xml   | 2 +-
 connector/docker-integration-tests/pom.xml | 2 +-
 connector/kafka-0-10-assembly/pom.xml  | 2 +-
 connector/kafka-0-10-sql/pom.xml   | 2 +-
 connector/kafka-0-10-token-provider/pom.xml| 2 +-
 connector/kafka-0-10/pom.xml   | 2 +-
 connector/kinesis-asl-assembly/pom.xml | 2 +-
 connector/kinesis-asl/pom.xml  | 2 +-
 connector/protobuf/pom.xml | 2 +-
 connector/spark-ganglia-lgpl/pom.xml   | 2 +-
 core/pom.xml   | 2 +-
 docs/_config.yml   | 6 +++---
 examples/pom.xml   | 2 +-
 graphx/pom.xml | 2 +-
 hadoop-cloud/pom.xml   | 2 +-
 launcher/pom.xml   | 2 +-
 mllib-local/pom.xml| 2 +-
 mllib/pom.xml  | 2 +-
 pom.xml| 2 +-
 python/pyspark/version.py  | 2 +-
 repl/pom.xml   | 2 +-
 resource-managers/kubernetes/core/pom.xml  | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml| 2 +-
 resource-managers/yarn/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive-thriftserver/pom.xml  | 2 +-
 sql/hive/pom.xml   | 2 +-
 streaming/pom.xml  | 2 +-
 tools/pom.xml  | 2 +-
 43 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index 4a32762b34c..fa7028630a8 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 3.4.0
+Version: 3.4.1
 Title: R Front End for 'Apache Spark'
 Description: Provides an R Front end for 'Apache Spark' 
.
 Authors@R:
diff --git a/assembly/pom.xml b/assembly/pom.xml
index c58da7aa112..b86fee4bceb 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.0
+3.4.1-SNAPSHOT
 ../pom.xml
   
 
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index 95ea15552da..f9ecfb3d692 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.0
+3.4.1-SNAPSHOT
 ../../pom.xml
   
 
diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index e4d98471bf9..22ee65b7d25 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.0
+3.4.1-SNAPSHOT
 ../../pom.xml
   
 
diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml
index 7a6d5aedf65..2c67da81ca4 100644
--- a/common/network-shuffle/pom.xml
+++ b/common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.0
+3.4.1-SNAPSHOT
 ../../pom.xml
   
 
diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml
index 1c421754083..219682e047d 100644
--- a/common/network-yarn/pom.xml
+++ b/common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.0
+3.4.1-SNAPSHOT
 ../../pom.xml
   
 
diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml
index 2ee25ebfffc..22ce7

[spark] branch branch-3.4 updated (ce36692eeee -> 6a6f50444d4)

2023-03-29 Thread xinrong
This is an automated email from the ASF dual-hosted git repository.

xinrong pushed a change to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


from ce36692 [SPARK-42631][CONNECT][FOLLOW-UP] Expose Column.expr to 
extensions
 add f39ad617d32 Preparing Spark release v3.4.0-rc5
 new 6a6f50444d4 Preparing development version 3.4.1-SNAPSHOT

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:





[spark] tag v3.4.0-rc5 created (now f39ad617d32)

2023-03-29 Thread xinrong
This is an automated email from the ASF dual-hosted git repository.

xinrong pushed a change to tag v3.4.0-rc5
in repository https://gitbox.apache.org/repos/asf/spark.git


  at f39ad617d32 (commit)
This tag includes the following new commits:

 new f39ad617d32 Preparing Spark release v3.4.0-rc5

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[spark] 01/01: Preparing Spark release v3.4.0-rc5

2023-03-29 Thread xinrong
This is an automated email from the ASF dual-hosted git repository.

xinrong pushed a commit to tag v3.4.0-rc5
in repository https://gitbox.apache.org/repos/asf/spark.git

commit f39ad617d32a671e120464e4a75986241d72c487
Author: Xinrong Meng 
AuthorDate: Thu Mar 30 02:18:27 2023 +

Preparing Spark release v3.4.0-rc5
---
 R/pkg/DESCRIPTION  | 2 +-
 assembly/pom.xml   | 2 +-
 common/kvstore/pom.xml | 2 +-
 common/network-common/pom.xml  | 2 +-
 common/network-shuffle/pom.xml | 2 +-
 common/network-yarn/pom.xml| 2 +-
 common/sketch/pom.xml  | 2 +-
 common/tags/pom.xml| 2 +-
 common/unsafe/pom.xml  | 2 +-
 connector/avro/pom.xml | 2 +-
 connector/connect/client/jvm/pom.xml   | 2 +-
 connector/connect/common/pom.xml   | 2 +-
 connector/connect/server/pom.xml   | 2 +-
 connector/docker-integration-tests/pom.xml | 2 +-
 connector/kafka-0-10-assembly/pom.xml  | 2 +-
 connector/kafka-0-10-sql/pom.xml   | 2 +-
 connector/kafka-0-10-token-provider/pom.xml| 2 +-
 connector/kafka-0-10/pom.xml   | 2 +-
 connector/kinesis-asl-assembly/pom.xml | 2 +-
 connector/kinesis-asl/pom.xml  | 2 +-
 connector/protobuf/pom.xml | 2 +-
 connector/spark-ganglia-lgpl/pom.xml   | 2 +-
 core/pom.xml   | 2 +-
 docs/_config.yml   | 6 +++---
 examples/pom.xml   | 2 +-
 graphx/pom.xml | 2 +-
 hadoop-cloud/pom.xml   | 2 +-
 launcher/pom.xml   | 2 +-
 mllib-local/pom.xml| 2 +-
 mllib/pom.xml  | 2 +-
 pom.xml| 2 +-
 python/pyspark/version.py  | 2 +-
 repl/pom.xml   | 2 +-
 resource-managers/kubernetes/core/pom.xml  | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml| 2 +-
 resource-managers/yarn/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive-thriftserver/pom.xml  | 2 +-
 sql/hive/pom.xml   | 2 +-
 streaming/pom.xml  | 2 +-
 tools/pom.xml  | 2 +-
 43 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index fa7028630a8..4a32762b34c 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 3.4.1
+Version: 3.4.0
 Title: R Front End for 'Apache Spark'
 Description: Provides an R Front end for 'Apache Spark' 
.
 Authors@R:
diff --git a/assembly/pom.xml b/assembly/pom.xml
index b86fee4bceb..c58da7aa112 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.1-SNAPSHOT
+3.4.0
 ../pom.xml
   
 
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index f9ecfb3d692..95ea15552da 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.1-SNAPSHOT
+3.4.0
 ../../pom.xml
   
 
diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index 22ee65b7d25..e4d98471bf9 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.1-SNAPSHOT
+3.4.0
 ../../pom.xml
   
 
diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml
index 2c67da81ca4..7a6d5aedf65 100644
--- a/common/network-shuffle/pom.xml
+++ b/common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.1-SNAPSHOT
+3.4.0
 ../../pom.xml
   
 
diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml
index 219682e047d..1c421754083 100644
--- a/common/network-yarn/pom.xml
+++ b/common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.spark
 spark-parent_2.12
-3.4.1-SNAPSHOT
+3.4.0
 ../../pom.xml
   
 
diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml
index 22ce78c6fd2..2ee25ebfffc 100644

[spark] branch master updated: [SPARK-42970][CONNECT][PYTHON][TESTS] Reuse pyspark.sql.tests.test_arrow test cases

2023-03-29 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 12e7991b5b3 [SPARK-42970][CONNECT][PYTHON][TESTS] Reuse 
pyspark.sql.tests.test_arrow test cases
12e7991b5b3 is described below

commit 12e7991b5b38302e6496307c7263ad729c82a6cf
Author: Takuya UESHIN 
AuthorDate: Thu Mar 30 09:34:09 2023 +0900

[SPARK-42970][CONNECT][PYTHON][TESTS] Reuse pyspark.sql.tests.test_arrow 
test cases

### What changes were proposed in this pull request?

Reuses `pyspark.sql.tests.test_arrow` test cases.

### Why are the changes needed?

`test_arrow` is also helpful because it contains many tests for 
`createDataFrame` with pandas or `toPandas`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added the tests.

Closes #40594 from ueshin/issues/SPARK-42970/test_arrow.

Authored-by: Takuya UESHIN 
Signed-off-by: Hyukjin Kwon 
---
 dev/sparktestsupport/modules.py|   1 +
 .../pyspark/sql/tests/connect/test_parity_arrow.py | 110 +
 python/pyspark/sql/tests/test_arrow.py |  65 +++-
 3 files changed, 149 insertions(+), 27 deletions(-)

diff --git a/dev/sparktestsupport/modules.py b/dev/sparktestsupport/modules.py
index f65ef7e3ac0..1a28a644e55 100644
--- a/dev/sparktestsupport/modules.py
+++ b/dev/sparktestsupport/modules.py
@@ -755,6 +755,7 @@ pyspark_connect = Module(
 "pyspark.sql.tests.connect.test_connect_basic",
 "pyspark.sql.tests.connect.test_connect_function",
 "pyspark.sql.tests.connect.test_connect_column",
+"pyspark.sql.tests.connect.test_parity_arrow",
 "pyspark.sql.tests.connect.test_parity_datasources",
 "pyspark.sql.tests.connect.test_parity_errors",
 "pyspark.sql.tests.connect.test_parity_catalog",
diff --git a/python/pyspark/sql/tests/connect/test_parity_arrow.py 
b/python/pyspark/sql/tests/connect/test_parity_arrow.py
new file mode 100644
index 000..f8180d661db
--- /dev/null
+++ b/python/pyspark/sql/tests/connect/test_parity_arrow.py
@@ -0,0 +1,110 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+import unittest
+
+from pyspark.sql.tests.test_arrow import ArrowTestsMixin
+from pyspark.testing.connectutils import ReusedConnectTestCase
+
+
+class ArrowParityTests(ArrowTestsMixin, ReusedConnectTestCase):
+@unittest.skip("Spark Connect does not support Spark Context but the test 
depends on that.")
+def test_createDataFrame_empty_partition(self):
+super().test_createDataFrame_empty_partition()
+
+@unittest.skip("Spark Connect does not support fallback.")
+def test_createDataFrame_fallback_disabled(self):
+super().test_createDataFrame_fallback_disabled()
+
+@unittest.skip("Spark Connect does not support fallback.")
+def test_createDataFrame_fallback_enabled(self):
+super().test_createDataFrame_fallback_enabled()
+
+def test_createDataFrame_with_incorrect_schema(self):
+self.check_createDataFrame_with_incorrect_schema()
+
+# TODO(SPARK-42969): Fix the comparison the result with Arrow optimization 
enabled/disabled.
+@unittest.skip("Fails in Spark Connect, should enable.")
+def test_createDataFrame_with_map_type(self):
+super().test_createDataFrame_with_map_type()
+
+# TODO(SPARK-42969): Fix the comparison the result with Arrow optimization 
enabled/disabled.
+@unittest.skip("Fails in Spark Connect, should enable.")
+def test_createDataFrame_with_ndarray(self):
+super().test_createDataFrame_with_ndarray()
+
+@unittest.skip("Fails in Spark Connect, should enable.")
+def test_createDataFrame_with_single_data_type(self):
+super().test_createDataFrame_with_single_data_type()
+
+@unittest.skip("Spark Connect does not support RDD but the tests depend on 
them.")
+def test_no_partition_frame(self):
+super().test_no_partition_frame()
+
+@unittest.skip("Spark Connect does not support RDD but the tests depend on

[spark] branch branch-3.4 updated: [SPARK-42631][CONNECT][FOLLOW-UP] Expose Column.expr to extensions

2023-03-29 Thread hvanhovell
This is an automated email from the ASF dual-hosted git repository.

hvanhovell pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new ce36692 [SPARK-42631][CONNECT][FOLLOW-UP] Expose Column.expr to 
extensions
ce36692 is described below

commit ce366927c2e459cc0a109112b29572cea52b
Author: Tom van Bussel 
AuthorDate: Wed Mar 29 15:28:09 2023 -0400

[SPARK-42631][CONNECT][FOLLOW-UP] Expose Column.expr to extensions

### What changes were proposed in this pull request?
This PR is a follow-up to https://github.com/apache/spark/pull/40234, which 
makes it possible for extensions to create custom `Dataset`s and `Column`s. It 
exposes `Dataset.plan`, but unfortunately it does not expose `Column.expr`. 
This means that extensions cannot build custom `Column`s that take a user-provided `Column` as input.

### Why are the changes needed?
See above.

### Does this PR introduce _any_ user-facing change?
No. This only adds a change for a Developer API.

### How was this patch tested?
Existing tests to make sure nothing breaks.

Closes #40590 from tomvanbussel/SPARK-42631.

Authored-by: Tom van Bussel 
Signed-off-by: Herman van Hovell 
(cherry picked from commit c3716c4ec68c2dea07e8cd896d79bd7175517a31)
Signed-off-by: Herman van Hovell 
---
 .../connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala 
b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala
index 4212747f57a..6a660a7482e 100644
--- 
a/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala
+++ 
b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala
@@ -51,7 +51,7 @@ import org.apache.spark.sql.types._
  *
  * @since 3.4.0
  */
-class Column private[sql] (private[sql] val expr: proto.Expression) extends 
Logging {
+class Column private[sql] (@DeveloperApi val expr: proto.Expression) extends 
Logging {
 
   private[sql] def this(name: String, planId: Option[Long]) =
 this(Column.nameToExpression(name, planId))
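
As a hedged illustration (not from the PR) of what the exposed `expr` enables: an extension can now read the proto expression out of a user-provided `Column` and embed it in its own expression. The function name and builder wiring below are illustrative assumptions; turning the result back into a `Column` is left to the extension's own constructor path (enabled by the earlier change in apache/spark#40234).

```scala
import org.apache.spark.connect.proto
import org.apache.spark.sql.Column

// Illustrative sketch: wrap a user-provided Column's proto expression in a
// hypothetical extension function named "my_extension_fn".
def wrapInExtensionFunction(input: Column): proto.Expression = {
  // `expr` is now readable by extensions thanks to the @DeveloperApi annotation.
  val underlying: proto.Expression = input.expr
  proto.Expression
    .newBuilder()
    .setUnresolvedFunction(
      proto.Expression.UnresolvedFunction
        .newBuilder()
        .setFunctionName("my_extension_fn") // hypothetical function name
        .addArguments(underlying))
    .build()
}
```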





[spark] branch master updated: [SPARK-42631][CONNECT][FOLLOW-UP] Expose Column.expr to extensions

2023-03-29 Thread hvanhovell
This is an automated email from the ASF dual-hosted git repository.

hvanhovell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new c3716c4ec68 [SPARK-42631][CONNECT][FOLLOW-UP] Expose Column.expr to 
extensions
c3716c4ec68 is described below

commit c3716c4ec68c2dea07e8cd896d79bd7175517a31
Author: Tom van Bussel 
AuthorDate: Wed Mar 29 15:28:09 2023 -0400

[SPARK-42631][CONNECT][FOLLOW-UP] Expose Column.expr to extensions

### What changes were proposed in this pull request?
This PR is a follow-up to https://github.com/apache/spark/pull/40234, which 
makes it possible for extensions to create custom `Dataset`s and `Column`s. It 
exposes `Dataset.plan`, but unfortunately it does not expose `Column.expr`. 
This means that extensions cannot build custom `Column`s that take a user-provided `Column` as input.

### Why are the changes needed?
See above.

### Does this PR introduce _any_ user-facing change?
No. This only adds a change for a Developer API.

### How was this patch tested?
Existing tests to make sure nothing breaks.

Closes #40590 from tomvanbussel/SPARK-42631.

Authored-by: Tom van Bussel 
Signed-off-by: Herman van Hovell 
---
 .../connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala 
b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala
index 4212747f57a..6a660a7482e 100644
--- 
a/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala
+++ 
b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Column.scala
@@ -51,7 +51,7 @@ import org.apache.spark.sql.types._
  *
  * @since 3.4.0
  */
-class Column private[sql] (private[sql] val expr: proto.Expression) extends 
Logging {
+class Column private[sql] (@DeveloperApi val expr: proto.Expression) extends 
Logging {
 
   private[sql] def this(name: String, planId: Option[Long]) =
 this(Column.nameToExpression(name, planId))





[spark] branch master updated: [SPARK-42873][SQL] Define Spark SQL types as keywords

2023-03-29 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 907cefeebb5 [SPARK-42873][SQL] Define Spark SQL types as keywords
907cefeebb5 is described below

commit 907cefeebb5e15ab6b4970ca8b6e42a8410a7c46
Author: Max Gekk 
AuthorDate: Wed Mar 29 18:52:06 2023 +0300

[SPARK-42873][SQL] Define Spark SQL types as keywords

### What changes were proposed in this pull request?
In the PR, I propose to define Spark SQL types as keywords.

### Why are the changes needed?
The non-keyword types cause some inconvenience while analysing/transforming the lexer tree, for example when forming stable column aliases; see https://github.com/apache/spark/pull/40126.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By running the modified test suites:
```
$ build/sbt "test:testOnly *.ResolveAliasesSuite"
$ build/sbt "test:testOnly *.ParserUtilsSuite"
```
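
A small, hypothetical spark-shell sketch of the motivation (stable column aliases, cf. #40126): with SQL types available as keywords, the alias generated for an un-aliased cast can be derived consistently from the parse tree. The printed column name is illustrative, and an active `spark` session is assumed.

```scala
// Assumes a running spark-shell where `spark` is the active SparkSession.
// The generated alias text is illustrative, not an exact guarantee.
val df = spark.sql("SELECT CAST(1 AS BIGINT)")
df.columns.foreach(println) // e.g. prints: CAST(1 AS BIGINT)
```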

Closes #40565 from MaxGekk/datatype-keywords.

Authored-by: Max Gekk 
Signed-off-by: Max Gekk 
---
 docs/sql-ref-ansi-compliance.md| 24 +++
 .../spark/sql/catalyst/parser/SqlBaseLexer.g4  | 26 +++-
 .../spark/sql/catalyst/parser/SqlBaseParser.g4 | 74 +-
 .../spark/sql/catalyst/parser/AstBuilder.scala | 61 +-
 .../spark/sql/errors/QueryParsingErrors.scala  |  6 +-
 .../catalyst/analysis/ResolveAliasesSuite.scala|  6 +-
 .../sql/catalyst/parser/ParserUtilsSuite.scala | 10 +--
 7 files changed, 164 insertions(+), 43 deletions(-)

diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index 4124e958e39..36d1f8f73eb 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -366,10 +366,14 @@ Below is a list of all the keywords in Spark SQL.
 |AT|non-reserved|non-reserved|reserved|
 |AUTHORIZATION|reserved|non-reserved|reserved|
 |BETWEEN|non-reserved|non-reserved|reserved|
+|BIGINT|non-reserved|non-reserved|reserved|
+|BINARY|non-reserved|non-reserved|reserved|
+|BOOLEAN|non-reserved|non-reserved|reserved|
 |BOTH|reserved|non-reserved|reserved|
 |BUCKET|non-reserved|non-reserved|non-reserved|
 |BUCKETS|non-reserved|non-reserved|non-reserved|
 |BY|non-reserved|non-reserved|reserved|
+|BYTE|non-reserved|non-reserved|non-reserved|
 |CACHE|non-reserved|non-reserved|non-reserved|
 |CASCADE|non-reserved|non-reserved|non-reserved|
 |CASE|reserved|non-reserved|reserved|
@@ -377,6 +381,8 @@ Below is a list of all the keywords in Spark SQL.
 |CATALOG|non-reserved|non-reserved|non-reserved|
 |CATALOGS|non-reserved|non-reserved|non-reserved|
 |CHANGE|non-reserved|non-reserved|non-reserved|
+|CHAR|non-reserved|non-reserved|reserved|
+|CHARACTER|non-reserved|non-reserved|reserved|
 |CHECK|reserved|non-reserved|reserved|
 |CLEAR|non-reserved|non-reserved|non-reserved|
 |CLUSTER|non-reserved|non-reserved|non-reserved|
@@ -403,6 +409,7 @@ Below is a list of all the keywords in Spark SQL.
 |CURRENT_TIMESTAMP|reserved|non-reserved|reserved|
 |CURRENT_USER|reserved|non-reserved|reserved|
 |DATA|non-reserved|non-reserved|non-reserved|
+|DATE|non-reserved|non-reserved|reserved|
 |DATABASE|non-reserved|non-reserved|non-reserved|
 |DATABASES|non-reserved|non-reserved|non-reserved|
 |DATEADD|non-reserved|non-reserved|non-reserved|
@@ -411,6 +418,8 @@ Below is a list of all the keywords in Spark SQL.
 |DAYS|non-reserved|non-reserved|non-reserved|
 |DAYOFYEAR|non-reserved|non-reserved|non-reserved|
 |DBPROPERTIES|non-reserved|non-reserved|non-reserved|
+|DEC|non-reserved|non-reserved|reserved|
+|DECIMAL|non-reserved|non-reserved|reserved|
 |DEFAULT|non-reserved|non-reserved|non-reserved|
 |DEFINED|non-reserved|non-reserved|non-reserved|
 |DELETE|non-reserved|non-reserved|reserved|
@@ -423,6 +432,7 @@ Below is a list of all the keywords in Spark SQL.
 |DISTINCT|reserved|non-reserved|reserved|
 |DISTRIBUTE|non-reserved|non-reserved|non-reserved|
 |DIV|non-reserved|non-reserved|not a keyword|
+|DOUBLE|non-reserved|non-reserved|reserved|
 |DROP|non-reserved|non-reserved|reserved|
 |ELSE|reserved|non-reserved|reserved|
 |END|reserved|non-reserved|reserved|
@@ -443,6 +453,7 @@ Below is a list of all the keywords in Spark SQL.
 |FILTER|reserved|non-reserved|reserved|
 |FILEFORMAT|non-reserved|non-reserved|non-reserved|
 |FIRST|non-reserved|non-reserved|non-reserved|
+|FLOAT|non-reserved|non-reserved|reserved|
 |FOLLOWING|non-reserved|non-reserved|non-reserved|
 |FOR|reserved|non-reserved|reserved|
 |FOREIGN|reserved|non-reserved|reserved|
@@ -471,6 +482,8 @@ Below is a list of all the keywords in Spark SQL.
 |INPATH|non-reserved|non-reserved|non-reserved|
 |INPUTFORMAT|non-reserved|non-reserved|non-reserved|
 |INSERT|non-reserved|non-reserved|reserved

[spark] branch master updated: [SPARK-42954][PYTHON][CONNECT] Add `YearMonthIntervalType` to PySpark and Spark Connect Python Client

2023-03-29 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new f532d222321 [SPARK-42954][PYTHON][CONNECT] Add `YearMonthIntervalType` 
to PySpark and Spark Connect Python Client
f532d222321 is described below

commit f532d222321aeec6d736ccf69f71b94fe07d4cd8
Author: Ruifeng Zheng 
AuthorDate: Wed Mar 29 17:39:54 2023 +0800

[SPARK-42954][PYTHON][CONNECT] Add `YearMonthIntervalType` to PySpark and 
Spark Connect Python Client

### What changes were proposed in this pull request?
Add `YearMonthIntervalType` to PySpark and Spark Connect Python Client

### Why are the changes needed?
function parity

**Note**
The added `YearMonthIntervalType` is not supported in `collect`/`createDataFrame`, since I could not find a Python built-in type for `YearMonthIntervalType` (unlike `datetime.timedelta` for `DayTimeIntervalType`); this needs further discussion.

### Does this PR introduce _any_ user-facing change?
yes, new data type in python

before this PR
```
In [1]: spark.sql("SELECT INTERVAL '10-8' YEAR TO MONTH AS interval")
Out[1]: 
---
ValueErrorTraceback (most recent call last)
File ~/Dev/spark/python/pyspark/sql/dataframe.py:570, in 
DataFrame.schema(self)
568 try:
569 self._schema = cast(
--> 570 StructType, 
_parse_datatype_json_string(self._jdf.schema().json())
571 )
572 except Exception as e:

...

ValueError: Unable to parse datatype from schema. Could not parse datatype: 
interval year to month
```

after this PR
```
In [3]: spark.sql("SELECT INTERVAL '10-8' YEAR TO MONTH AS interval")
Out[3]: DataFrame[interval: interval year to month]
```

### How was this patch tested?
added UT

Closes #40582 from zhengruifeng/py_y_m.

Authored-by: Ruifeng Zheng 
Signed-off-by: Ruifeng Zheng 
---
 .../source/reference/pyspark.sql/data_types.rst|  1 +
 python/pyspark/sql/connect/types.py| 16 +
 python/pyspark/sql/tests/test_types.py | 32 ++
 python/pyspark/sql/types.py| 72 --
 4 files changed, 116 insertions(+), 5 deletions(-)

diff --git a/python/docs/source/reference/pyspark.sql/data_types.rst 
b/python/docs/source/reference/pyspark.sql/data_types.rst
index 53417e43419..60c6b92590d 100644
--- a/python/docs/source/reference/pyspark.sql/data_types.rst
+++ b/python/docs/source/reference/pyspark.sql/data_types.rst
@@ -47,3 +47,4 @@ Data Types
 TimestampType
 TimestampNTZType
 DayTimeIntervalType
+YearMonthIntervalType
diff --git a/python/pyspark/sql/connect/types.py 
b/python/pyspark/sql/connect/types.py
index dfb0fb5303f..3afac8dc5b9 100644
--- a/python/pyspark/sql/connect/types.py
+++ b/python/pyspark/sql/connect/types.py
@@ -34,6 +34,7 @@ from pyspark.sql.types import (
 TimestampType,
 TimestampNTZType,
 DayTimeIntervalType,
+YearMonthIntervalType,
 MapType,
 StringType,
 CharType,
@@ -154,6 +155,9 @@ def pyspark_types_to_proto_types(data_type: DataType) -> 
pb2.DataType:
 elif isinstance(data_type, DayTimeIntervalType):
 ret.day_time_interval.start_field = data_type.startField
 ret.day_time_interval.end_field = data_type.endField
+elif isinstance(data_type, YearMonthIntervalType):
+ret.year_month_interval.start_field = data_type.startField
+ret.year_month_interval.end_field = data_type.endField
 elif isinstance(data_type, StructType):
 for field in data_type.fields:
 struct_field = pb2.DataType.StructField()
@@ -236,6 +240,18 @@ def proto_schema_to_pyspark_data_type(schema: 
pb2.DataType) -> DataType:
 else None
 )
 return DayTimeIntervalType(startField=start, endField=end)
+elif schema.HasField("year_month_interval"):
+start: Optional[int] = (  # type: ignore[no-redef]
+schema.year_month_interval.start_field
+if schema.year_month_interval.HasField("start_field")
+else None
+)
+end: Optional[int] = (  # type: ignore[no-redef]
+schema.year_month_interval.end_field
+if schema.year_month_interval.HasField("end_field")
+else None
+)
+return YearMonthIntervalType(startField=start, endField=end)
 elif schema.HasField("array"):
 return ArrayType(
 proto_schema_to_pyspark_data_type(schema.array.element_type),
diff --git a/python/pyspark/sql/tests/test_types.py 
b/python/pyspark/sql/tests/test_types.py
index 5d6476b47f4..dd2abda4620 100644
--- a/python/pyspark/sql/tests/test_types.py

[spark] branch branch-3 created (now d6a6af51d64)

2023-03-29 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a change to branch branch-3
in repository https://gitbox.apache.org/repos/asf/spark.git


  at d6a6af51d64 [SPARK-42952][SQL] Simplify the parameter of analyzer rule 
PreprocessTableCreation and DataSourceAnalysis

No new revisions were added by this update.





[spark] branch master updated: [SPARK-42952][SQL] Simplify the parameter of analyzer rule PreprocessTableCreation and DataSourceAnalysis

2023-03-29 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new d6a6af51d64 [SPARK-42952][SQL] Simplify the parameter of analyzer rule 
PreprocessTableCreation and DataSourceAnalysis
d6a6af51d64 is described below

commit d6a6af51d64e809c7413fde05fdebdd035d89f2e
Author: Gengliang Wang 
AuthorDate: Wed Mar 29 01:15:57 2023 -0700

[SPARK-42952][SQL] Simplify the parameter of analyzer rule 
PreprocessTableCreation and DataSourceAnalysis

### What changes were proposed in this pull request?

Simplify the parameters of the following analyzer rules:
* PreprocessTableCreation: use a SessionCatalog instead of passing the SparkSession
* DataSourceAnalysis: remove the unused `Analyzer` parameter and turn it from a class into an object.

### Why are the changes needed?

Code cleanup

### Does this PR introduce _any_ user-facing change?

No
### How was this patch tested?

Existing tests

Closes #40580 from gengliangwang/catalog.

Authored-by: Gengliang Wang 
Signed-off-by: Gengliang Wang 
---
 .../apache/spark/sql/execution/datasources/DataSourceStrategy.scala   | 2 +-
 .../main/scala/org/apache/spark/sql/execution/datasources/rules.scala | 4 +---
 .../scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala | 4 ++--
 .../apache/spark/sql/connector/V2CommandsCaseSensitivitySuite.scala   | 2 +-
 .../scala/org/apache/spark/sql/sources/DataSourceAnalysisSuite.scala  | 3 +--
 .../scala/org/apache/spark/sql/hive/HiveSessionStateBuilder.scala | 4 ++--
 6 files changed, 8 insertions(+), 11 deletions(-)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
index e3a1f6f6b68..69c7624605b 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
@@ -61,7 +61,7 @@ import org.apache.spark.unsafe.types.UTF8String
  * Note that, this rule must be run after `PreprocessTableCreation` and
  * `PreprocessTableInsertion`.
  */
-case class DataSourceAnalysis(analyzer: Analyzer) extends Rule[LogicalPlan] {
+object DataSourceAnalysis extends Rule[LogicalPlan] {
 
   def resolver: Resolver = conf.resolver
 
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala
index 635562ab54d..2564d7e50a2 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala
@@ -89,9 +89,7 @@ class ResolveSQLOnFile(sparkSession: SparkSession) extends 
Rule[LogicalPlan] {
 /**
  * Preprocess [[CreateTable]], to do some normalization and checking.
  */
-case class PreprocessTableCreation(sparkSession: SparkSession) extends 
Rule[LogicalPlan] {
-  // catalog is a def and not a val/lazy val as the latter would introduce a 
circular reference
-  private def catalog = sparkSession.sessionState.catalog
+case class PreprocessTableCreation(catalog: SessionCatalog) extends 
Rule[LogicalPlan] {
 
   def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
 // When we CREATE TABLE without specifying the table schema, we should 
fail the query if
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala
index f17d0c3dd2e..8585808d762 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala
@@ -193,9 +193,9 @@ abstract class BaseSessionStateBuilder(
 
 override val postHocResolutionRules: Seq[Rule[LogicalPlan]] =
   DetectAmbiguousSelfJoin +:
-PreprocessTableCreation(session) +:
+PreprocessTableCreation(catalog) +:
 PreprocessTableInsertion +:
-DataSourceAnalysis(this) +:
+DataSourceAnalysis +:
 ApplyCharTypePadding +:
 ReplaceCharWithVarchar +:
 customPostHocResolutionRules
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/connector/V2CommandsCaseSensitivitySuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/connector/V2CommandsCaseSensitivitySuite.scala
index 44cd4f0f9b3..a51ac78fdb2 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/connector/V2CommandsCaseSensitivitySuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/connector/V2CommandsCaseSensitivitySuite.scala
@@ -44,7 +44,7 @@ class 

[spark] branch master updated: [SPARK-42957][INFRA][FOLLOWUP] Use 'cyclonedx' instead of file extensions

2023-03-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 7aa0195e467 [SPARK-42957][INFRA][FOLLOWUP] Use 'cyclonedx' instead of 
file extensions
7aa0195e467 is described below

commit 7aa0195e46792ddfd2ec295ad439dab70f8dbe1d
Author: Dongjoon Hyun 
AuthorDate: Wed Mar 29 01:09:42 2023 -0700

[SPARK-42957][INFRA][FOLLOWUP] Use 'cyclonedx' instead of file extensions

### What changes were proposed in this pull request?

This PR is a follow-up of #40585 which aims to use `cyclonedx` instead of 
file extension.

### Why are the changes needed?

When we filter on the file extensions `xml` and `json`, the `maven-metadata-local.xml` files are missed.
```

spark-rm1ea0f8a3e397:/opt/spark-rm/output/spark/spark-repo-glCsK/org/apache/spark$
 find . | grep xml
./spark-core_2.13/3.4.1-SNAPSHOT/maven-metadata-local.xml

./spark-core_2.13/3.4.1-SNAPSHOT/spark-core_2.13-3.4.1-SNAPSHOT-cyclonedx.xml
./spark-core_2.13/maven-metadata-local.xml
```

We need to use `cyclonedx` specifically.
```

spark-rm1ea0f8a3e397:/opt/spark-rm/output/spark/spark-repo-glCsK/org/apache/spark$
 find . -type f |grep -v \.jar |grep -v \.pom

./spark-catalyst_2.13/3.4.1-SNAPSHOT/spark-catalyst_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-catalyst_2.13/3.4.1-SNAPSHOT/spark-catalyst_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-core_2.13/3.4.1-SNAPSHOT/spark-core_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-core_2.13/3.4.1-SNAPSHOT/spark-core_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-graphx_2.13/3.4.1-SNAPSHOT/spark-graphx_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-graphx_2.13/3.4.1-SNAPSHOT/spark-graphx_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-kvstore_2.13/3.4.1-SNAPSHOT/spark-kvstore_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-kvstore_2.13/3.4.1-SNAPSHOT/spark-kvstore_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-launcher_2.13/3.4.1-SNAPSHOT/spark-launcher_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-launcher_2.13/3.4.1-SNAPSHOT/spark-launcher_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-mllib-local_2.13/3.4.1-SNAPSHOT/spark-mllib-local_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-mllib-local_2.13/3.4.1-SNAPSHOT/spark-mllib-local_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-network-common_2.13/3.4.1-SNAPSHOT/spark-network-common_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-network-common_2.13/3.4.1-SNAPSHOT/spark-network-common_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-network-shuffle_2.13/3.4.1-SNAPSHOT/spark-network-shuffle_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-network-shuffle_2.13/3.4.1-SNAPSHOT/spark-network-shuffle_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-parent_2.13/3.4.1-SNAPSHOT/spark-parent_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-parent_2.13/3.4.1-SNAPSHOT/spark-parent_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-sketch_2.13/3.4.1-SNAPSHOT/spark-sketch_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-sketch_2.13/3.4.1-SNAPSHOT/spark-sketch_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-streaming_2.13/3.4.1-SNAPSHOT/spark-streaming_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-streaming_2.13/3.4.1-SNAPSHOT/spark-streaming_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-tags_2.13/3.4.1-SNAPSHOT/spark-tags_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-tags_2.13/3.4.1-SNAPSHOT/spark-tags_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-unsafe_2.13/3.4.1-SNAPSHOT/spark-unsafe_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-unsafe_2.13/3.4.1-SNAPSHOT/spark-unsafe_2.13-3.4.1-SNAPSHOT-cyclonedx.xml
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual test.

Closes #40587 from dongjoon-hyun/SPARK-42957-2.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 dev/create-release/release-build.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/create-release/release-build.sh 
b/dev/create-release/release-build.sh
index 0107e1cb2c0..e0588ae934c 100755
--- a/dev/create-release/release-build.sh
+++ b/dev/create-release/release-build.sh
@@ -473,7 +473,7 @@ if [[ "$1" == "publish-release" ]]; then
   pushd $tmp_repo/org/apache/spark
 
   # Remove any extra files generated during install
-  find . -type f |grep -v \.jar |grep -v \.pom |grep -v \.xml |grep -v \.json 
| xargs rm
+  find . -type f |grep -v \.jar |grep -v \.pom |grep -v cyclonedx | xargs rm
 
   echo "Creating hash and signature files"
   # this must have .asc, .md5 and .sha1 - it really doesn't like anything else 
there





[spark] branch branch-3.4 updated: [SPARK-42957][INFRA][FOLLOWUP] Use 'cyclonedx' instead of file extensions

2023-03-29 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new 4e2046729be [SPARK-42957][INFRA][FOLLOWUP] Use 'cyclonedx' instead of 
file extensions
4e2046729be is described below

commit 4e2046729be4e5b0223e3f183733d182864e1a1c
Author: Dongjoon Hyun 
AuthorDate: Wed Mar 29 01:09:42 2023 -0700

[SPARK-42957][INFRA][FOLLOWUP] Use 'cyclonedx' instead of file extensions

### What changes were proposed in this pull request?

This PR is a follow-up of #40585 which aims to use `cyclonedx` instead of 
file extension.

### Why are the changes needed?

When we filter on the file extensions `xml` and `json`, the `maven-metadata-local.xml` files are missed.
```

spark-rm1ea0f8a3e397:/opt/spark-rm/output/spark/spark-repo-glCsK/org/apache/spark$
 find . | grep xml
./spark-core_2.13/3.4.1-SNAPSHOT/maven-metadata-local.xml

./spark-core_2.13/3.4.1-SNAPSHOT/spark-core_2.13-3.4.1-SNAPSHOT-cyclonedx.xml
./spark-core_2.13/maven-metadata-local.xml
```

We need to use `cyclonedx` specifically.
```

spark-rm1ea0f8a3e397:/opt/spark-rm/output/spark/spark-repo-glCsK/org/apache/spark$
 find . -type f |grep -v \.jar |grep -v \.pom

./spark-catalyst_2.13/3.4.1-SNAPSHOT/spark-catalyst_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-catalyst_2.13/3.4.1-SNAPSHOT/spark-catalyst_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-core_2.13/3.4.1-SNAPSHOT/spark-core_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-core_2.13/3.4.1-SNAPSHOT/spark-core_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-graphx_2.13/3.4.1-SNAPSHOT/spark-graphx_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-graphx_2.13/3.4.1-SNAPSHOT/spark-graphx_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-kvstore_2.13/3.4.1-SNAPSHOT/spark-kvstore_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-kvstore_2.13/3.4.1-SNAPSHOT/spark-kvstore_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-launcher_2.13/3.4.1-SNAPSHOT/spark-launcher_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-launcher_2.13/3.4.1-SNAPSHOT/spark-launcher_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-mllib-local_2.13/3.4.1-SNAPSHOT/spark-mllib-local_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-mllib-local_2.13/3.4.1-SNAPSHOT/spark-mllib-local_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-network-common_2.13/3.4.1-SNAPSHOT/spark-network-common_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-network-common_2.13/3.4.1-SNAPSHOT/spark-network-common_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-network-shuffle_2.13/3.4.1-SNAPSHOT/spark-network-shuffle_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-network-shuffle_2.13/3.4.1-SNAPSHOT/spark-network-shuffle_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-parent_2.13/3.4.1-SNAPSHOT/spark-parent_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-parent_2.13/3.4.1-SNAPSHOT/spark-parent_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-sketch_2.13/3.4.1-SNAPSHOT/spark-sketch_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-sketch_2.13/3.4.1-SNAPSHOT/spark-sketch_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-streaming_2.13/3.4.1-SNAPSHOT/spark-streaming_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-streaming_2.13/3.4.1-SNAPSHOT/spark-streaming_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-tags_2.13/3.4.1-SNAPSHOT/spark-tags_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-tags_2.13/3.4.1-SNAPSHOT/spark-tags_2.13-3.4.1-SNAPSHOT-cyclonedx.xml

./spark-unsafe_2.13/3.4.1-SNAPSHOT/spark-unsafe_2.13-3.4.1-SNAPSHOT-cyclonedx.json

./spark-unsafe_2.13/3.4.1-SNAPSHOT/spark-unsafe_2.13-3.4.1-SNAPSHOT-cyclonedx.xml
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual test.

Closes #40587 from dongjoon-hyun/SPARK-42957-2.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 7aa0195e46792ddfd2ec295ad439dab70f8dbe1d)
Signed-off-by: Dongjoon Hyun 
---
 dev/create-release/release-build.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/create-release/release-build.sh 
b/dev/create-release/release-build.sh
index 0107e1cb2c0..e0588ae934c 100755
--- a/dev/create-release/release-build.sh
+++ b/dev/create-release/release-build.sh
@@ -473,7 +473,7 @@ if [[ "$1" == "publish-release" ]]; then
   pushd $tmp_repo/org/apache/spark
 
   # Remove any extra files generated during install
-  find . -type f |grep -v \.jar |grep -v \.pom |grep -v \.xml |grep -v \.json 
| xargs rm
+  find . -type f |grep -v \.jar |grep -v \.pom |grep -v cyclonedx | xargs rm
 
   echo "Creating hash and signature files"
   # this must have .asc, .md5 and .sha1 - it really doesn't like anything else 
there



[spark] branch branch-3.4 updated: [SPARK-42895][CONNECT] Improve error messages for stopped Spark sessions

2023-03-29 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new dc834d4ba97 [SPARK-42895][CONNECT] Improve error messages for stopped 
Spark sessions
dc834d4ba97 is described below

commit dc834d4ba977a27ccd7dfff262b96008476d0574
Author: allisonwang-db 
AuthorDate: Wed Mar 29 17:02:10 2023 +0900

[SPARK-42895][CONNECT] Improve error messages for stopped Spark sessions

### What changes were proposed in this pull request?

This PR improves error messages when users attempt to invoke session 
operations on a stopped Spark session.

### Why are the changes needed?
To make the error messages more user-friendly.

For example:
```python
spark.stop()
spark.sql("select 1")
```
Before this PR, this code will throw two exceptions:
```
ValueError: Cannot invoke RPC: Channel closed!

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  ...
return e.code() == grpc.StatusCode.UNAVAILABLE
AttributeError: 'ValueError' object has no attribute 'code'
```

After this PR, it will show this exception:
```
[NO_ACTIVE_SESSION] No active Spark session found. Please create a new 
Spark session before running the code.
```

### Does this PR introduce _any_ user-facing change?
Yes. This PR modifies the error messages.

### How was this patch tested?

New unit test.

Closes #40536 from allisonwang-db/spark-42895-stopped-session.

Authored-by: allisonwang-db 
Signed-off-by: Hyukjin Kwon 
(cherry picked from commit e9a87825f737211c2ab0fa6d02c1c6f2a47b0024)
Signed-off-by: Hyukjin Kwon 
---
 python/pyspark/errors/error_classes.py |  5 +++
 python/pyspark/sql/connect/client.py   | 47 +-
 .../sql/tests/connect/test_connect_basic.py| 42 ++-
 3 files changed, 82 insertions(+), 12 deletions(-)

diff --git a/python/pyspark/errors/error_classes.py 
b/python/pyspark/errors/error_classes.py
index dda1f5a1f84..85d3c0d7dfb 100644
--- a/python/pyspark/errors/error_classes.py
+++ b/python/pyspark/errors/error_classes.py
@@ -164,6 +164,11 @@ ERROR_CLASSES_JSON = """
   "Argument `` should be a WindowSpec, got ."
 ]
   },
+  "NO_ACTIVE_SESSION" : {
+"message" : [
+  "No active Spark session found. Please create a new Spark session before 
running the code."
+]
+  },
   "UNSUPPORTED_NUMPY_ARRAY_SCALAR" : {
 "message" : [
   "The type of array scalar '' is not supported."
diff --git a/python/pyspark/sql/connect/client.py 
b/python/pyspark/sql/connect/client.py
index 84889e76103..1ba7dba957d 100644
--- a/python/pyspark/sql/connect/client.py
+++ b/python/pyspark/sql/connect/client.py
@@ -514,8 +514,11 @@ class SparkConnectClient(object):
 """
 
 @classmethod
-def retry_exception(cls, e: grpc.RpcError) -> bool:
-return e.code() == grpc.StatusCode.UNAVAILABLE
+def retry_exception(cls, e: Exception) -> bool:
+if isinstance(e, grpc.RpcError):
+return e.code() == grpc.StatusCode.UNAVAILABLE
+else:
+return False
 
 def __init__(
 self,
@@ -880,8 +883,8 @@ class SparkConnectClient(object):
 )
 return AnalyzeResult.fromProto(resp)
 raise SparkConnectException("Invalid state during retry exception 
handling.")
-except grpc.RpcError as rpc_error:
-self._handle_error(rpc_error)
+except Exception as error:
+self._handle_error(error)
 
 def _execute(self, req: pb2.ExecutePlanRequest) -> None:
 """
@@ -905,8 +908,8 @@ class SparkConnectClient(object):
 "Received incorrect session identifier for 
request: "
 f"{b.session_id} != {self._session_id}"
 )
-except grpc.RpcError as rpc_error:
-self._handle_error(rpc_error)
+except Exception as error:
+self._handle_error(error)
 
 def _execute_and_fetch_as_iterator(
 self, req: pb2.ExecutePlanRequest
@@ -956,8 +959,8 @@ class SparkConnectClient(object):
 for batch in reader:
 assert isinstance(batch, pa.RecordBatch)
 yield batch
-except grpc.RpcError as rpc_error:
-self._handle_error(rpc_error)
+except Exception as error:
+self._handle_error(error)
 
 def _execute_and_fetch(
 self, req: pb2.ExecutePlanRequest
@@ -1032,10 +1035,32 @@ class SparkConnectClient(object):
 )
 return ConfigResult.fromProto(

[spark] branch master updated (e586e212ca2 -> e9a87825f73)

2023-03-29 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from e586e212ca2 [SPARK-42956][CONNECT] Support avro functions for Scala 
client
 add e9a87825f73 [SPARK-42895][CONNECT] Improve error messages for stopped 
Spark sessions

No new revisions were added by this update.

Summary of changes:
 python/pyspark/errors/error_classes.py |  5 +++
 python/pyspark/sql/connect/client.py   | 47 +-
 .../sql/tests/connect/test_connect_basic.py| 42 ++-
 3 files changed, 82 insertions(+), 12 deletions(-)





[spark] branch master updated: [SPARK-42956][CONNECT] Support avro functions for Scala client

2023-03-29 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e586e212ca2 [SPARK-42956][CONNECT] Support avro functions for Scala 
client
e586e212ca2 is described below

commit e586e212ca2eb1d5f30443b6ee97bae8f6498cdb
Author: yangjie01 
AuthorDate: Wed Mar 29 16:20:08 2023 +0900

[SPARK-42956][CONNECT] Support avro functions for Scala client

### What changes were proposed in this pull request?
This PR aims to support Avro functions for the Scala client.

### Why are the changes needed?
Add Spark Connect JVM client API coverage.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Add new test
- Checked Scala 2.13

Closes #40584 from LuciferYang/SPARK-42956.

Authored-by: yangjie01 
Signed-off-by: Hyukjin Kwon 
---
 .../org/apache/spark/sql/avro/functions.scala  |  97 +
 .../scala/org/apache/spark/sql/functions.scala |   2 +-
 .../org/apache/spark/sql/FunctionTestSuite.scala   |   8 ++
 .../apache/spark/sql/PlanGenerationTestSuite.scala |  23 +
 .../explain-results/from_avro_with_options.explain |   2 +
 .../from_avro_without_options.explain  |   2 +
 .../explain-results/to_avro_with_schema.explain|   2 +
 .../explain-results/to_avro_without_schema.explain |   2 +
 .../queries/from_avro_with_options.json|  50 +++
 .../queries/from_avro_with_options.proto.bin   | Bin 0 -> 173 bytes
 .../queries/from_avro_without_options.json |  29 ++
 .../queries/from_avro_without_options.proto.bin| Bin 0 -> 112 bytes
 .../query-tests/queries/to_avro_with_schema.json   |  29 ++
 .../queries/to_avro_with_schema.proto.bin  | Bin 0 -> 103 bytes
 .../queries/to_avro_without_schema.json|  25 ++
 .../queries/to_avro_without_schema.proto.bin   | Bin 0 -> 69 bytes
 16 files changed, 270 insertions(+), 1 deletion(-)
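
As a usage illustration (not part of the commit), a hedged Spark Connect Scala client sketch of the new `from_avro`/`to_avro` functions; the connection string, table name, column names, and Avro schema below are assumptions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.avro.functions.{from_avro, to_avro}

object AvroConnectExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical Spark Connect endpoint.
    val spark = SparkSession.builder().remote("sc://localhost").getOrCreate()

    // Illustrative Avro schema with a single string field.
    val jsonFormatSchema =
      """{"type":"record","name":"User","fields":[{"name":"name","type":"string"}]}"""

    // `events` is assumed to be a table whose binary `value` column holds Avro records.
    val df = spark.read.table("events")

    val decoded = df.select(from_avro(col("value"), jsonFormatSchema).as("user"))
    val reencoded = decoded.select(to_avro(col("user")).as("value"))

    decoded.show()
    reencoded.printSchema()
    spark.stop()
  }
}
```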

diff --git 
a/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/avro/functions.scala
 
b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/avro/functions.scala
new file mode 100644
index 000..c4b16ca0d5e
--- /dev/null
+++ 
b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/avro/functions.scala
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.avro
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.annotation.Experimental
+import org.apache.spark.sql.Column
+import org.apache.spark.sql.functions.{fnWithOptions, lit}
+
+// scalastyle:off: object.name
+object functions {
+// scalastyle:on: object.name
+
+  /**
+   * Converts a binary column of avro format into its corresponding catalyst 
value. The specified
+   * schema must match the read data, otherwise the behavior is undefined: it 
may fail or return
+   * arbitrary result.
+   *
+   * @param data
+   *   the binary column.
+   * @param jsonFormatSchema
+   *   the avro schema in JSON string format.
+   *
+   * @since 3.5.0
+   */
+  @Experimental
+  def from_avro(data: Column, jsonFormatSchema: String): Column = {
+Column.fn("from_avro", data, lit(jsonFormatSchema))
+  }
+
+  /**
+   * Converts a binary column of Avro format into its corresponding catalyst 
value. The specified
+   * schema must match actual schema of the read data, otherwise the behavior 
is undefined: it may
+   * fail or return arbitrary result. To deserialize the data with a 
compatible and evolved
+   * schema, the expected Avro schema can be set via the option avroSchema.
+   *
+   * @param data
+   *   the binary column.
+   * @param jsonFormatSchema
+   *   the avro schema in JSON string format.
+   * @param options
+   *   options to control how the Avro record is parsed.
+   *
+   * @since 3.5.0
+   */
+  @Experimental
+  def from_avro(
+  data: Column,
+  jsonFormatSchema: String,
+  options: java.util.Map[String, String]): Column = {
+fnWithOptions("from_avro", options.asScala.i