[spark] branch master updated: [SPARK-44640][PYTHON][FOLLOW-UP] Update UDTF error messages to include method name

2023-09-02 Thread ueshin
This is an automated email from the ASF dual-hosted git repository.

ueshin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3e22c8653d7 [SPARK-44640][PYTHON][FOLLOW-UP] Update UDTF error 
messages to include method name
3e22c8653d7 is described below

commit 3e22c8653d728a6b8523051faddcca437accfc22
Author: allisonwang-db 
AuthorDate: Sat Sep 2 16:07:09 2023 -0700

[SPARK-44640][PYTHON][FOLLOW-UP] Update UDTF error messages to include 
method name

### What changes were proposed in this pull request?

This PR is a follow-up for SPARK-44640 to make the error message of a few 
UDTF errors more informative by including the method name in the error message 
(`eval` or `terminate`).
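
As a rough illustration of the behavior this follow-up improves, the worker-side check can be sketched as follows. This is a hypothetical standalone sketch, not the actual `pyspark/worker.py` code: the function name `check_return_value` and the plain `RuntimeError` are illustrative stand-ins, but it shows how threading the method name (`eval` or `terminate`) into the message tells users which UDTF method misbehaved.

```python
# Hedged sketch of the kind of validation pyspark/worker.py performs.
# `check_return_value` and RuntimeError are illustrative, not Spark internals.

def check_return_value(res, method_name):
    """Raise if a UDTF method returned a non-iterable value."""
    if res is not None and not hasattr(res, "__iter__"):
        raise RuntimeError(
            f"[UDTF_RETURN_NOT_ITERABLE] The return value of the "
            f"'{method_name}' method of the UDTF is invalid. It should be an "
            f"iterable (e.g., generator or list), but got "
            f"'{type(res).__name__}'."
        )
    return res

check_return_value([(1,)], "eval")      # ok: a list of row tuples
try:
    check_return_value(1, "terminate")  # invalid: an int is not iterable
except RuntimeError as e:
    print(e)
```

With the method name included, a UDTF whose `terminate` returns a bare `1` now produces a message pointing at `'terminate'` rather than a generic "the UDTF" wording.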

### Why are the changes needed?

To improve error messages.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #42726 from allisonwang-db/SPARK-44640-follow-up.

Authored-by: allisonwang-db 
Signed-off-by: Takuya UESHIN 
---
 python/pyspark/errors/error_classes.py |  8 
 python/pyspark/sql/tests/test_udtf.py  | 21 +++
 python/pyspark/worker.py   | 37 +-
 3 files changed, 52 insertions(+), 14 deletions(-)

diff --git a/python/pyspark/errors/error_classes.py 
b/python/pyspark/errors/error_classes.py
index ca448a169e8..74f52c416e9 100644
--- a/python/pyspark/errors/error_classes.py
+++ b/python/pyspark/errors/error_classes.py
@@ -244,7 +244,7 @@ ERROR_CLASSES_JSON = """
   },
   "INVALID_ARROW_UDTF_RETURN_TYPE" : {
     "message" : [
-      "The return type of the arrow-optimized Python UDTF should be of type 'pandas.DataFrame', but the function returned a value of type <return_type> with value: <value>."
+      "The return type of the arrow-optimized Python UDTF should be of type 'pandas.DataFrame', but the '<func>' method returned a value of type <return_type> with value: <value>."
     ]
   },
   "INVALID_BROADCAST_OPERATION": {
   "INVALID_BROADCAST_OPERATION": {
@@ -745,17 +745,17 @@ ERROR_CLASSES_JSON = """
   },
   "UDTF_INVALID_OUTPUT_ROW_TYPE" : {
     "message" : [
-      "The type of an individual output row in the UDTF is invalid. Each row should be a tuple, list, or dict, but got '<type>'. Please make sure that the output rows are of the correct type."
+      "The type of an individual output row in the '<func>' method of the UDTF is invalid. Each row should be a tuple, list, or dict, but got '<type>'. Please make sure that the output rows are of the correct type."
     ]
   },
   "UDTF_RETURN_NOT_ITERABLE" : {
     "message" : [
-      "The return value of the UDTF is invalid. It should be an iterable (e.g., generator or list), but got '<type>'. Please make sure that the UDTF returns one of these types."
+      "The return value of the '<func>' method of the UDTF is invalid. It should be an iterable (e.g., generator or list), but got '<type>'. Please make sure that the UDTF returns one of these types."
     ]
   },
   "UDTF_RETURN_SCHEMA_MISMATCH" : {
     "message" : [
-      "The number of columns in the result does not match the specified schema. Expected column count: <expected>, Actual column count: <actual>. Please make sure the values returned by the function have the same number of columns as specified in the output schema."
+      "The number of columns in the result does not match the specified schema. Expected column count: <expected>, Actual column count: <actual>. Please make sure the values returned by the '<func>' method have the same number of columns as specified in the output schema."
     ]
   },
   "UDTF_RETURN_TYPE_MISMATCH" : {
   "UDTF_RETURN_TYPE_MISMATCH" : {
diff --git a/python/pyspark/sql/tests/test_udtf.py 
b/python/pyspark/sql/tests/test_udtf.py
index c5f8b7693c2..97d5190a506 100644
--- a/python/pyspark/sql/tests/test_udtf.py
+++ b/python/pyspark/sql/tests/test_udtf.py
@@ -190,6 +190,27 @@ class BaseUDTFTestsMixin:
         with self.assertRaisesRegex(PythonException, "UDTF_RETURN_NOT_ITERABLE"):
             TestUDTF(lit(1)).collect()
 
+    def test_udtf_with_zero_arg_and_invalid_return_value(self):
+        @udtf(returnType="x: int")
+        class TestUDTF:
+            def eval(self):
+                return 1
+
+        with self.assertRaisesRegex(PythonException, "UDTF_RETURN_NOT_ITERABLE"):
+            TestUDTF().collect()
+
+    def test_udtf_with_invalid_return_value_in_terminate(self):
+        @udtf(returnType="x: int")
+        class TestUDTF:
+            def eval(self, a):
+                ...
+
+            def terminate(self):
+                return 1
+
+        with self.assertRaisesRegex(PythonException, "UDTF_RETURN_NOT_ITERABLE"):
+            TestUDTF(lit(1)).collect()
+
     def test_udtf_eval_with_no_return(self):
         @udtf(returnType="a: int")
         class TestUDTF:
diff --git 

[spark] branch master updated: [SPARK-44956][BUILD] Upgrade Jekyll to 4.3.2 & Webrick to 1.8.1

2023-09-02 Thread srowen

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 967aac1171a [SPARK-44956][BUILD] Upgrade Jekyll to 4.3.2 & Webrick to 
1.8.1
967aac1171a is described below

commit 967aac1171a49c8e98c992512487d77c2b1c4565
Author: panbingkun 
AuthorDate: Sat Sep 2 08:19:38 2023 -0500

[SPARK-44956][BUILD] Upgrade Jekyll to 4.3.2 & Webrick to 1.8.1

### What changes were proposed in this pull request?
The pr aims to upgrade
- Jekyll  from 4.2.1 to 4.3.2.
- Webrick from 1.7 to 1.8.1.

### Why are the changes needed?
1. The `4.2.1` version was released on Sep 27, 2021, nearly two years ago.

2. Jekyll 4.3.2 was released on `Jan 21, 2023` and includes the fix of a regression bug.
- https://github.com/jekyll/jekyll/releases/tag/v4.3.2
- https://github.com/jekyll/jekyll/releases/tag/v4.3.1
- https://github.com/jekyll/jekyll/releases/tag/v4.3.0
   Fix regression in Convertible module from v4.2.0 (https://github.com/jekyll/jekyll/pull/8786)
- https://github.com/jekyll/jekyll/releases/tag/v4.2.2

3. The newest Webrick versions include some bug fixes.
https://github.com/ruby/webrick/releases/tag/v1.8.1
https://github.com/ruby/webrick/releases/tag/v1.8.0

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Pass GA.
- Manually test.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #42669 from panbingkun/SPARK-44956.

Authored-by: panbingkun 
Signed-off-by: Sean Owen 
---
 docs/Gemfile  |  4 ++--
 docs/Gemfile.lock | 62 +--
 2 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/docs/Gemfile b/docs/Gemfile
index 6c352012964..6c676037116 100644
--- a/docs/Gemfile
+++ b/docs/Gemfile
@@ -18,7 +18,7 @@
source "https://rubygems.org"
 
 gem "ffi", "1.15.5"
-gem "jekyll", "4.2.1"
+gem "jekyll", "4.3.2"
 gem "rouge", "3.26.0"
 gem "jekyll-redirect-from", "0.16.0"
-gem "webrick", "1.7"
+gem "webrick", "1.8.1"
diff --git a/docs/Gemfile.lock b/docs/Gemfile.lock
index 6654e6c47c6..eda31f85747 100644
--- a/docs/Gemfile.lock
+++ b/docs/Gemfile.lock
@@ -1,74 +1,78 @@
 GEM
   remote: https://rubygems.org/
   specs:
-addressable (2.8.0)
-  public_suffix (>= 2.0.2, < 5.0)
+addressable (2.8.5)
+  public_suffix (>= 2.0.2, < 6.0)
 colorator (1.1.0)
-concurrent-ruby (1.1.9)
-em-websocket (0.5.2)
+concurrent-ruby (1.2.2)
+em-websocket (0.5.3)
   eventmachine (>= 0.12.9)
-  http_parser.rb (~> 0.6.0)
+  http_parser.rb (~> 0)
 eventmachine (1.2.7)
 ffi (1.15.5)
 forwardable-extended (2.6.0)
-http_parser.rb (0.6.0)
-i18n (1.8.11)
+google-protobuf (3.24.2)
+http_parser.rb (0.8.0)
+i18n (1.14.1)
   concurrent-ruby (~> 1.0)
-jekyll (4.2.1)
+jekyll (4.3.2)
   addressable (~> 2.4)
   colorator (~> 1.0)
   em-websocket (~> 0.5)
   i18n (~> 1.0)
-  jekyll-sass-converter (~> 2.0)
+  jekyll-sass-converter (>= 2.0, < 4.0)
   jekyll-watch (~> 2.0)
-  kramdown (~> 2.3)
+  kramdown (~> 2.3, >= 2.3.1)
   kramdown-parser-gfm (~> 1.0)
   liquid (~> 4.0)
-  mercenary (~> 0.4.0)
+  mercenary (>= 0.3.6, < 0.5)
   pathutil (~> 0.9)
-  rouge (~> 3.0)
+  rouge (>= 3.0, < 5.0)
   safe_yaml (~> 1.0)
-  terminal-table (~> 2.0)
+  terminal-table (>= 1.8, < 4.0)
+  webrick (~> 1.7)
 jekyll-redirect-from (0.16.0)
   jekyll (>= 3.3, < 5.0)
-jekyll-sass-converter (2.1.0)
-  sassc (> 2.0.1, < 3.0)
+jekyll-sass-converter (3.0.0)
+  sass-embedded (~> 1.54)
 jekyll-watch (2.2.1)
   listen (~> 3.0)
-kramdown (2.3.1)
+kramdown (2.4.0)
   rexml
 kramdown-parser-gfm (1.1.0)
   kramdown (~> 2.0)
-liquid (4.0.3)
-listen (3.7.0)
+liquid (4.0.4)
+listen (3.8.0)
   rb-fsevent (~> 0.10, >= 0.10.3)
   rb-inotify (~> 0.9, >= 0.9.10)
 mercenary (0.4.0)
 pathutil (0.16.2)
   forwardable-extended (~> 2.6)
-public_suffix (4.0.6)
-rb-fsevent (0.11.0)
+public_suffix (5.0.3)
+rake (13.0.6)
+rb-fsevent (0.11.2)
 rb-inotify (0.10.1)
   ffi (~> 1.0)
-rexml (3.2.5)
+rexml (3.2.6)
 rouge (3.26.0)
 safe_yaml (1.0.5)
-sassc (2.4.0)
-  ffi (~> 1.9)
-terminal-table (2.0.0)
-  unicode-display_width (~> 1.1, >= 1.1.1)
-unicode-display_width (1.8.0)
-webrick (1.7.0)
+sass-embedded (1.63.6)
+  google-protobuf (~> 3.23)
+  rake (>= 13.0.0)
+terminal-table (3.0.2)
+  unicode-display_width (>= 1.1.1, < 3)
+unicode-display_width (2.4.2)
+webrick (1.8.1)
 
 PLATFORMS
   ruby
 
 DEPENDENCIES
   

[spark] branch master updated: [SPARK-45043][BUILD] Upgrade `scalafmt` to 3.7.13

2023-09-02 Thread srowen

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 82d54fc8924 [SPARK-45043][BUILD] Upgrade `scalafmt` to 3.7.13
82d54fc8924 is described below

commit 82d54fc8924618777992ee9a4d939b1fb336f20d
Author: panbingkun 
AuthorDate: Sat Sep 2 08:18:43 2023 -0500

[SPARK-45043][BUILD] Upgrade `scalafmt` to 3.7.13

### What changes were proposed in this pull request?
The pr aims to upgrade `scalafmt` from 3.7.5 to 3.7.13.

### Why are the changes needed?
1. The newest version includes some bug fixes, e.g.:
- FormatWriter: accumulate align shift correctly (https://github.com/scalameta/scalafmt/pull/3615)
- Indents: ignore fewerBraces if indentation is 1 (https://github.com/scalameta/scalafmt/pull/3592)
- RemoveScala3OptionalBraces: handle infix on rbrace (https://github.com/scalameta/scalafmt/pull/3576)

2. The full release notes:
https://github.com/scalameta/scalafmt/releases/tag/v3.7.13
https://github.com/scalameta/scalafmt/releases/tag/v3.7.12
https://github.com/scalameta/scalafmt/releases/tag/v3.7.11
https://github.com/scalameta/scalafmt/releases/tag/v3.7.10
https://github.com/scalameta/scalafmt/releases/tag/v3.7.9
https://github.com/scalameta/scalafmt/releases/tag/v3.7.8
https://github.com/scalameta/scalafmt/releases/tag/v3.7.7
https://github.com/scalameta/scalafmt/releases/tag/v3.7.6

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #42764 from panbingkun/SPARK-45043.

Authored-by: panbingkun 
Signed-off-by: Sean Owen 
---
 dev/.scalafmt.conf | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/.scalafmt.conf b/dev/.scalafmt.conf
index c3b26002a76..721dec28990 100644
--- a/dev/.scalafmt.conf
+++ b/dev/.scalafmt.conf
@@ -32,4 +32,4 @@ fileOverride {
 runner.dialect = scala213
   }
 }
-version = 3.7.5
+version = 3.7.13


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-45026][CONNECT][FOLLOW-UP] Code cleanup

2023-09-02 Thread ruifengz

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new f0fb434c268 [SPARK-45026][CONNECT][FOLLOW-UP] Code cleanup
f0fb434c268 is described below

commit f0fb434c268f69e6845ba97e3256d3c1b873fc95
Author: Ruifeng Zheng 
AuthorDate: Sat Sep 2 17:20:22 2023 +0800

[SPARK-45026][CONNECT][FOLLOW-UP] Code cleanup

### What changes were proposed in this pull request?
Move 3 variables into the `isCommand` branch.

### Why are the changes needed?
They are not used in the other branches.
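
The cleanup pattern can be sketched in a few lines. This is an illustrative Python analogue (names and values are hypothetical, not the Scala code from the commit): values consumed only inside one branch are computed inside that branch, so the other path skips work it never needs.

```python
# Hedged sketch of the scoping cleanup: move branch-local setup into the
# branch that uses it, mirroring the move of `schema`, `maxBatchSize`, and
# `timeZoneId` into the `isCommand` branch. All names here are illustrative.

def handle(is_command, rows):
    if is_command:
        # Only computed when actually needed by the command path.
        max_batch_size = int(4 * 1024 * 1024 * 0.7)
        return ("arrow-batch", len(rows), max_batch_size)
    # The non-command path never touches the Arrow conversion setup.
    return ("relation", None, None)
```

The behavior is unchanged for the command path; the non-command path simply avoids evaluating setup it never reads.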

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI

### Was this patch authored or co-authored using generative AI tooling?
NO

Closes #42765 from zhengruifeng/SPARK-45026-followup.

Authored-by: Ruifeng Zheng 
Signed-off-by: Ruifeng Zheng 
---
 .../apache/spark/sql/connect/planner/SparkConnectPlanner.scala | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git 
a/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala
 
b/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala
index 547b6a9fb40..11300631491 100644
--- 
a/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala
+++ 
b/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala
@@ -2464,15 +2464,15 @@ class SparkConnectPlanner(val sessionHolder: SessionHolder) extends Logging {
       case _ => Seq.empty
     }
 
-    // Convert the results to Arrow.
-    val schema = df.schema
-    val maxBatchSize = (SparkEnv.get.conf.get(CONNECT_GRPC_ARROW_MAX_BATCH_SIZE) * 0.7).toLong
-    val timeZoneId = session.sessionState.conf.sessionLocalTimeZone
-
     // To avoid explicit handling of the result on the client, we build the expected input
     // of the relation on the server. The client has to simply forward the result.
     val result = SqlCommandResult.newBuilder()
     if (isCommand) {
+      // Convert the results to Arrow.
+      val schema = df.schema
+      val maxBatchSize = (SparkEnv.get.conf.get(CONNECT_GRPC_ARROW_MAX_BATCH_SIZE) * 0.7).toLong
+      val timeZoneId = session.sessionState.conf.sessionLocalTimeZone
+
       // Convert the data.
       val bytes = if (rows.isEmpty) {
         ArrowConverters.createEmptyArrowBatch(

