[spark] branch master updated: [SPARK-43386][SQL] Improve list of suggested column/attributes in `UNRESOLVED_COLUMN.WITH_SUGGESTION` error class

2023-05-11 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1782d0ed9d3 [SPARK-43386][SQL] Improve list of suggested 
column/attributes in `UNRESOLVED_COLUMN.WITH_SUGGESTION` error class
1782d0ed9d3 is described below

commit 1782d0ed9d3c82caee8e57b94229184e308d8b84
Author: Vitalii Li 
AuthorDate: Fri May 12 08:38:49 2023 +0300

[SPARK-43386][SQL] Improve list of suggested column/attributes in 
`UNRESOLVED_COLUMN.WITH_SUGGESTION` error class

### What changes were proposed in this pull request?

In this change we determine whether the unresolved identifier is multipart 
and remap the list of suggested columns to match the same pattern.

- The main change is in `StringUtils.scala`. The rest is a method rename and 
test fixes.

### Why are the changes needed?

When determining the list of suggested columns/attributes for 
`UNRESOLVED_COLUMN.WITH_SUGGESTION`, we sort candidates by the Levenshtein 
distance between the unresolved identifier and the fully qualified column 
names of the target table. For a single-part identifier this can lead to a 
poor experience, especially for short(ish) identifiers. For example, given 
identifier `a` and a table with columns `m` and `aa` in the default Spark 
catalog, the suggested list would be (`spark_catalog.default.m`, 
`spark_catalog.default.aa`): the shared qualifier dominates the distance, so 
the shorter name `m` ranks ahead of `aa` even though `aa` is the closer 
match (see the sketch below).
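A minimal, self-contained Scala sketch of the failure mode and the remapping 
idea. The names here are hypothetical; this is not the actual 
`StringUtils.scala` implementation.

```scala
object SuggestionSketch {
  /** Classic dynamic-programming Levenshtein edit distance. */
  def levenshtein(a: String, b: String): Int = {
    val d = Array.tabulate(a.length + 1, b.length + 1) { (i, j) =>
      if (i == 0) j else if (j == 0) i else 0
    }
    for (i <- 1 to a.length; j <- 1 to b.length) {
      val cost = if (a(i - 1) == b(j - 1)) 0 else 1
      d(i)(j) = (d(i - 1)(j) + 1) min (d(i)(j - 1) + 1) min (d(i - 1)(j - 1) + cost)
    }
    d(a.length)(b.length)
  }

  def main(args: Array[String]): Unit = {
    val id = "a" // single-part unresolved identifier
    val candidates = Seq("spark_catalog.default.m", "spark_catalog.default.aa")
    // Fully qualified candidates: the shared "spark_catalog.default." prefix
    // contributes 22 edits to every distance, so the ranking mostly reflects
    // candidate length (m: 22, aa: 23) rather than similarity to `a`.
    println(candidates.map(c => c -> levenshtein(id, c)))
    // Remapped to the identifier's shape (single part, so last name part
    // only), distances compare like with like and suggestions stay readable.
    println(candidates.map(_.split('.').last).map(c => c -> levenshtein(id, c)))
  }
}
```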

### Does this PR introduce _any_ user-facing change?

No, since we don't document the internals of how the suggestion list for 
this error is generated.

### How was this patch tested?

Existing tests.

Closes #41038 from vitaliili-db/missing_col.

Authored-by: Vitalii Li 
Signed-off-by: Max Gekk 
---
 .../spark/sql/catalyst/analysis/Analyzer.scala |  2 +-
 .../sql/catalyst/analysis/CheckAnalysis.scala  |  3 +-
 .../plans/logical/basicLogicalOperators.scala  |  2 +-
 .../spark/sql/catalyst/util/StringUtils.scala  | 33 --
 .../sql/catalyst/analysis/AnalysisSuite.scala  |  2 +-
 .../columnresolution-negative.sql.out  |  2 +-
 .../sql-tests/analyzer-results/cte.sql.out |  2 +-
 .../analyzer-results/group-by-all-duckdb.sql.out   |  2 +-
 .../analyzer-results/group-by-all.sql.out  |  2 +-
 .../sql-tests/analyzer-results/group-by.sql.out|  4 +--
 .../analyzer-results/order-by-all.sql.out  |  2 +-
 .../sql-tests/analyzer-results/pivot.sql.out   |  4 +--
 .../analyzer-results/postgreSQL/join.sql.out   |  2 +-
 .../analyzer-results/postgreSQL/union.sql.out  |  2 +-
 .../analyzer-results/query_regex_column.sql.out| 14 -
 .../negative-cases/invalid-correlation.sql.out |  2 +-
 .../analyzer-results/table-aliases.sql.out |  2 +-
 .../udf/postgreSQL/udf-join.sql.out|  2 +-
 .../analyzer-results/udf/udf-group-by.sql.out  |  2 +-
 .../analyzer-results/udf/udf-pivot.sql.out |  4 +--
 .../results/columnresolution-negative.sql.out  |  2 +-
 .../test/resources/sql-tests/results/cte.sql.out   |  2 +-
 .../sql-tests/results/group-by-all-duckdb.sql.out  |  2 +-
 .../sql-tests/results/group-by-all.sql.out |  2 +-
 .../resources/sql-tests/results/group-by.sql.out   |  4 +--
 .../sql-tests/results/order-by-all.sql.out |  2 +-
 .../test/resources/sql-tests/results/pivot.sql.out |  4 +--
 .../sql-tests/results/postgreSQL/join.sql.out  |  2 +-
 .../sql-tests/results/postgreSQL/union.sql.out |  2 +-
 .../sql-tests/results/query_regex_column.sql.out   | 14 -
 .../negative-cases/invalid-correlation.sql.out |  2 +-
 .../sql-tests/results/table-aliases.sql.out|  2 +-
 .../sql-tests/results/udaf/udaf-group-by.sql.out   |  4 +--
 .../results/udf/postgreSQL/udf-join.sql.out|  2 +-
 .../sql-tests/results/udf/udf-group-by.sql.out |  2 +-
 .../sql-tests/results/udf/udf-pivot.sql.out|  4 +--
 .../scala/org/apache/spark/sql/SubquerySuite.scala |  2 +-
 .../spark/sql/connector/DataSourceV2SQLSuite.scala |  8 ++
 .../sql/errors/QueryCompilationErrorsSuite.scala   |  6 ++--
 .../apache/spark/sql/execution/SQLViewSuite.scala  |  4 +--
 .../execution/command/v2/DescribeTableSuite.scala  |  4 +--
 .../org/apache/spark/sql/sources/InsertSuite.scala |  3 +-
 .../apache/spark/sql/hive/HiveParquetSuite.scala   |  4 +--
 43 files changed, 98 insertions(+), 75 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index c3480c35680..dbc9da1ea22 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -3366,7 +3366,7 @@ class Analyzer(override val catalogManager: 

[spark] branch master updated: [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new f05b658443c [SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace 
by default
f05b658443c is described below

commit f05b658443c59cf886aed0ea8ad8c75f502d18ac
Author: Takuya UESHIN 
AuthorDate: Thu May 11 21:23:15 2023 -0700

[SPARK-42945][CONNECT][FOLLOWUP] Disable JVM stack trace by default

### What changes were proposed in this pull request?

This is a follow-up of #40575.

This disables the JVM stack trace by default.

```py
% ./bin/pyspark --remote local
...
>>> spark.conf.set("spark.sql.ansi.enabled", True)
>>> spark.sql('select 1/0').show()
...
Traceback (most recent call last):
...
pyspark.errors.exceptions.connect.ArithmeticException: [DIVIDE_BY_ZERO] 
Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL 
instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this 
error.
== SQL(line 1, position 8) ==
select 1/0
   ^^^

>>>
>>> spark.conf.set("spark.sql.pyspark.jvmStacktrace.enabled", True)
>>> spark.sql('select 1/0').show()
...
Traceback (most recent call last):
...
pyspark.errors.exceptions.connect.ArithmeticException: [DIVIDE_BY_ZERO] 
Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL 
instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this 
error.
== SQL(line 1, position 8) ==
select 1/0
   ^^^

JVM stacktrace:
org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by 
zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If 
necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
select 1/0
   ^^^

at 
org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:226)
at 
org.apache.spark.sql.catalyst.expressions.DivModLike.eval(arithmetic.scala:674)
...
```

### Why are the changes needed?

Currently, the JVM stack trace is enabled by default.

```py
% ./bin/pyspark --remote local
...
>>> spark.conf.set("spark.sql.ansi.enabled", True)
>>> spark.sql('select 1/0').show()
...
Traceback (most recent call last):
...
pyspark.errors.exceptions.connect.ArithmeticException: [DIVIDE_BY_ZERO] 
Division by zero. Use `try_divide` to tolerate divisor being 0 and return NULL 
instead. If necessary set "spark.sql.ansi.enabled" to "false" to bypass this 
error.
== SQL(line 1, position 8) ==
select 1/0
   ^^^

JVM stacktrace:
org.apache.spark.SparkArithmeticException: [DIVIDE_BY_ZERO] Division by 
zero. Use `try_divide` to tolerate divisor being 0 and return NULL instead. If 
necessary set "spark.sql.ansi.enabled" to "false" to bypass this error.
== SQL(line 1, position 8) ==
select 1/0
   ^^^

at 
org.apache.spark.sql.errors.QueryExecutionErrors$.divideByZeroError(QueryExecutionErrors.scala:226)
at 
org.apache.spark.sql.catalyst.expressions.DivModLike.eval(arithmetic.scala:674)
...
```

### Does this PR introduce _any_ user-facing change?

Users won't see the JVM stack trace by default.

### How was this patch tested?

Existing tests.

Closes #41148 from ueshin/issues/SPARK-42945/default.

Authored-by: Takuya UESHIN 
Signed-off-by: Dongjoon Hyun 
---
 .../org/apache/spark/sql/connect/service/SparkConnectService.scala  | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala
 
b/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala
index b444fc67ce1..c1647fd85a0 100644
--- 
a/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala
+++ 
b/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/service/SparkConnectService.scala
@@ -125,7 +125,7 @@ class SparkConnectService(debug: Boolean)
   SparkConnectService
 .getOrCreateIsolatedSession(userId, sessionId)
 .session
-    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED.key, "true").toBoolean
+    val stackTraceEnabled = session.conf.get(PYSPARK_JVM_STACKTRACE_ENABLED)
 
 {
   case se: SparkException if isPythonExecutionException(se) =>
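
Why this one-line diff flips the default: the old call supplied a hard-coded 
"true" string fallback, which shadowed the config entry's own declared 
default, while the typed read honors it. A minimal model of the difference 
(hypothetical classes, not Spark's actual `ConfigEntry` API):

```scala
final case class ConfEntry(key: String, default: Boolean)

class Conf(settings: Map[String, String]) {
  // String read: the caller's fallback wins whenever the key is unset,
  // regardless of the entry's declared default.
  def get(key: String, fallback: String): String = settings.getOrElse(key, fallback)
  // Typed read: the entry's declared default wins when the key is unset.
  def get(entry: ConfEntry): Boolean =
    settings.get(entry.key).map(_.toBoolean).getOrElse(entry.default)
}

object StackTraceDefaultSketch extends App {
  val entry = ConfEntry("spark.sql.pyspark.jvmStacktrace.enabled", default = false)
  val conf = new Conf(Map.empty) // the user never set the key
  assert(conf.get(entry.key, "true").toBoolean) // old code path: traces on
  assert(!conf.get(entry))                      // new code path: traces off
}
```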





[GitHub] [spark-website] yaooqinn commented on pull request #461: Update Apache Spark 3.5 Release Window

2023-05-11 Thread via GitHub


yaooqinn commented on PR #461:
URL: https://github.com/apache/spark-website/pull/461#issuecomment-1544984825

   thanks @xinrong-meng 





[spark-website] branch asf-site updated: Update Apache Spark 3.5 Release Window

2023-05-11 Thread xinrong
This is an automated email from the ASF dual-hosted git repository.

xinrong pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new 18ca078b23 Update Apache Spark 3.5 Release Window
18ca078b23 is described below

commit 18ca078b23f826c24bed32df1dc89854a91cb580
Author: Xinrong Meng 
AuthorDate: Thu May 11 17:42:37 2023 -0700

Update Apache Spark 3.5 Release Window

Update Apache Spark 3.5 Release Window, with proposed dates:

```
| July 16th 2023 | Code freeze. Release branch cut.|
| Late July 2023 | QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged.|
| August 2023 | Release candidates (RC), voting, etc. until final release passes|
```

Author: Xinrong Meng 

Closes #461 from xinrong-meng/3.5release_window.
---
 site/versioning-policy.html | 8 
 versioning-policy.md| 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/site/versioning-policy.html b/site/versioning-policy.html
index d25bd676c7..74b559d5e8 100644
--- a/site/versioning-policy.html
+++ b/site/versioning-policy.html
@@ -250,7 +250,7 @@ available APIs.
 Hence, Spark 2.3.0 would generally be released about 6 months after 2.2.0. Maintenance releases happen as needed
 in between feature releases. Major releases do not happen according to a fixed schedule.
 
-<h3>Spark 3.4 release window</h3>
+<h3>Spark 3.5 release window</h3>
 
 <table>
   <thead>
@@ -261,15 +261,15 @@ in between feature releases. Major releases do not happen according to a fixed s
   </thead>
   <tbody>
     <tr>
-      <td>January 16th 2023</td>
+      <td>July 16th 2023</td>
      <td>Code freeze. Release branch cut.</td>
    </tr>
    <tr>
-      <td>Late January 2023</td>
+      <td>Late July 2023</td>
      <td>QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged.</td>
    </tr>
    <tr>
-      <td>February 2023</td>
+      <td>August 2023</td>
      <td>Release candidates (RC), voting, etc. until final release passes</td>
    </tr>
  </tbody>
diff --git a/versioning-policy.md b/versioning-policy.md
index 153085259f..0f3892e8a2 100644
--- a/versioning-policy.md
+++ b/versioning-policy.md
@@ -103,13 +103,13 @@ The branch is cut every January and July, so feature ("minor") releases occur ab
 Hence, Spark 2.3.0 would generally be released about 6 months after 2.2.0. Maintenance releases happen as needed
 in between feature releases. Major releases do not happen according to a fixed schedule.
 
-<h3>Spark 3.4 release window</h3>
+<h3>Spark 3.5 release window</h3>
 
 | Date  | Event |
 | - | - |
-| January 16th 2023 | Code freeze. Release branch cut.|
-| Late January 2023 | QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged.|
-| February 2023 | Release candidates (RC), voting, etc. until final release passes|
+| July 16th 2023 | Code freeze. Release branch cut.|
+| Late July 2023 | QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged.|
+| August 2023 | Release candidates (RC), voting, etc. until final release passes|
 
 <h3>Maintenance releases and EOL</h3>
 





[GitHub] [spark-website] xinrong-meng closed pull request #461: Update Apache Spark 3.5 Release Window

2023-05-11 Thread via GitHub


xinrong-meng closed pull request #461: Update Apache Spark 3.5 Release Window
URL: https://github.com/apache/spark-website/pull/461





[GitHub] [spark-website] xinrong-meng commented on pull request #461: Update Apache Spark 3.5 Release Window

2023-05-11 Thread via GitHub


xinrong-meng commented on PR #461:
URL: https://github.com/apache/spark-website/pull/461#issuecomment-1544941845

   @yaooqinn Certainly, I'll send an email now!





[spark] branch branch-3.4 updated: [SPARK-43471][CORE] Handle missing hadoopProperties and metricsProperties

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.4 by this push:
 new 6e443658991 [SPARK-43471][CORE] Handle missing hadoopProperties and 
metricsProperties
6e443658991 is described below

commit 6e443658991b2596466a92433cca4bb6010861e4
Author: Dongjoon Hyun 
AuthorDate: Thu May 11 15:30:04 2023 -0700

[SPARK-43471][CORE] Handle missing hadoopProperties and metricsProperties

### What changes were proposed in this pull request?

This PR aims to handle a corner case where `hadoopProperties` and 
`metricsProperties` are null, meaning they were never loaded.

### Why are the changes needed?

To prevent NPE.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs with the newly added test suite.

Closes #41145 from TQJADE/SPARK-43471.

Lead-authored-by: Dongjoon Hyun 
Co-authored-by: Qi Tan 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 1dba7b803ecffd09c544009b79a6a3219f56d4e0)
Signed-off-by: Dongjoon Hyun 
---
 .../org/apache/spark/ui/env/EnvironmentPage.scala  |  9 ++--
 .../apache/spark/ui/env/EnvironmentPageSuite.scala | 53 ++
 2 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala 
b/core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala
index c6e224732cb..4aaa04019cc 100644
--- a/core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala
+++ b/core/src/main/scala/org/apache/spark/ui/env/EnvironmentPage.scala
@@ -77,15 +77,16 @@ private[ui] class EnvironmentPage(
     val sparkPropertiesTable = UIUtils.listingTable(propertyHeader, propertyRow,
       Utils.redact(conf, appEnv.sparkProperties.sorted), fixedWidth = true,
       headerClasses = headerClasses)
+    val emptyProperties = collection.Seq.empty[(String, String)]
     val hadoopPropertiesTable = UIUtils.listingTable(propertyHeader, propertyRow,
-      Utils.redact(conf, appEnv.hadoopProperties.sorted), fixedWidth = true,
-      headerClasses = headerClasses)
+      Utils.redact(conf, Option(appEnv.hadoopProperties).getOrElse(emptyProperties).sorted),
+      fixedWidth = true, headerClasses = headerClasses)
     val systemPropertiesTable = UIUtils.listingTable(propertyHeader, propertyRow,
       Utils.redact(conf, appEnv.systemProperties.sorted), fixedWidth = true,
       headerClasses = headerClasses)
     val metricsPropertiesTable = UIUtils.listingTable(propertyHeader, propertyRow,
-      Utils.redact(conf, appEnv.metricsProperties.sorted), fixedWidth = true,
-      headerClasses = headerClasses)
+      Utils.redact(conf, Option(appEnv.metricsProperties).getOrElse(emptyProperties).sorted),
+      fixedWidth = true, headerClasses = headerClasses)
     val classpathEntriesTable = UIUtils.listingTable(
       classPathHeader, classPathRow, appEnv.classpathEntries.sorted, fixedWidth = true,
       headerClasses = headerClasses)
diff --git 
a/core/src/test/scala/org/apache/spark/ui/env/EnvironmentPageSuite.scala 
b/core/src/test/scala/org/apache/spark/ui/env/EnvironmentPageSuite.scala
new file mode 100644
index 000..92791874668
--- /dev/null
+++ b/core/src/test/scala/org/apache/spark/ui/env/EnvironmentPageSuite.scala
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.ui.env
+
+import javax.servlet.http.HttpServletRequest
+
+import org.mockito.Mockito._
+
+import org.apache.spark.SparkConf
+import org.apache.spark.SparkFunSuite
+import org.apache.spark.status.AppStatusStore
+import org.apache.spark.status.api.v1.{ApplicationEnvironmentInfo, RuntimeInfo}
+
+class EnvironmentPageSuite extends SparkFunSuite {
+
+  test("SPARK-43471: Handle missing hadoopProperties and metricsProperties") {
+    val environmentTab = mock(classOf[EnvironmentTab])
+    when(environmentTab.appName).thenReturn("Environment")
+    when(environmentTab.basePath).thenReturn("http://localhost:4040")
+
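
The guard used in `EnvironmentPage.scala` above, in a nutshell: `Option(_)` 
lifts a possibly-null reference into an `Option`, so `.getOrElse` can 
substitute an empty sequence before `.sorted` gets a chance to throw. A tiny 
illustrative sketch (not the suite's code):

```scala
object NullGuardSketch extends App {
  // e.g. properties that were never loaded into the environment info
  val hadoopProperties: Seq[(String, String)] = null
  // Unguarded, hadoopProperties.sorted would throw a NullPointerException.
  val safe = Option(hadoopProperties).getOrElse(Seq.empty[(String, String)])
  assert(safe.sorted.isEmpty) // renders as an empty table instead of an NPE
}
```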

[spark] branch master updated (2bcfebf8a91 -> 1dba7b803ec)

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 2bcfebf8a91 [SPARK-43455][BUILD][K8S] Bump kubernetes-client 6.6.1
 add 1dba7b803ec [SPARK-43471][CORE] Handle missing hadoopProperties and 
metricsProperties

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/ui/env/EnvironmentPage.scala  |  9 ++--
 .../apache/spark/ui/env/EnvironmentPageSuite.scala | 53 ++
 2 files changed, 58 insertions(+), 4 deletions(-)
 create mode 100644 
core/src/test/scala/org/apache/spark/ui/env/EnvironmentPageSuite.scala





[spark] branch master updated: [SPARK-43455][BUILD][K8S] Bump kubernetes-client 6.6.1

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 2bcfebf8a91 [SPARK-43455][BUILD][K8S] Bump kubernetes-client 6.6.1
2bcfebf8a91 is described below

commit 2bcfebf8a91ffcdbd27f9b6cb70bea56960d3e60
Author: Cheng Pan 
AuthorDate: Thu May 11 08:44:42 2023 -0700

[SPARK-43455][BUILD][K8S] Bump kubernetes-client 6.6.1

### What changes were proposed in this pull request?

Release Notes: 
https://github.com/fabric8io/kubernetes-client/releases/tag/v6.6.1

### Why are the changes needed?

It's basically a routine to keep the third-party libs up-to-date.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

Closes #41137 from pan3793/SPARK-43455.

Authored-by: Cheng Pan 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 50 +--
 pom.xml   |  2 +-
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 
b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 0ea96381a3b..c23bb89c983 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -141,31 +141,31 @@ jsr305/3.0.0//jsr305-3.0.0.jar
 jta/1.1//jta-1.1.jar
 jul-to-slf4j/2.0.7//jul-to-slf4j-2.0.7.jar
 kryo-shaded/4.0.2//kryo-shaded-4.0.2.jar
-kubernetes-client-api/6.6.0//kubernetes-client-api-6.6.0.jar
-kubernetes-client/6.6.0//kubernetes-client-6.6.0.jar
-kubernetes-httpclient-okhttp/6.6.0//kubernetes-httpclient-okhttp-6.6.0.jar
-kubernetes-model-admissionregistration/6.6.0//kubernetes-model-admissionregistration-6.6.0.jar
-kubernetes-model-apiextensions/6.6.0//kubernetes-model-apiextensions-6.6.0.jar
-kubernetes-model-apps/6.6.0//kubernetes-model-apps-6.6.0.jar
-kubernetes-model-autoscaling/6.6.0//kubernetes-model-autoscaling-6.6.0.jar
-kubernetes-model-batch/6.6.0//kubernetes-model-batch-6.6.0.jar
-kubernetes-model-certificates/6.6.0//kubernetes-model-certificates-6.6.0.jar
-kubernetes-model-common/6.6.0//kubernetes-model-common-6.6.0.jar
-kubernetes-model-coordination/6.6.0//kubernetes-model-coordination-6.6.0.jar
-kubernetes-model-core/6.6.0//kubernetes-model-core-6.6.0.jar
-kubernetes-model-discovery/6.6.0//kubernetes-model-discovery-6.6.0.jar
-kubernetes-model-events/6.6.0//kubernetes-model-events-6.6.0.jar
-kubernetes-model-extensions/6.6.0//kubernetes-model-extensions-6.6.0.jar
-kubernetes-model-flowcontrol/6.6.0//kubernetes-model-flowcontrol-6.6.0.jar
-kubernetes-model-gatewayapi/6.6.0//kubernetes-model-gatewayapi-6.6.0.jar
-kubernetes-model-metrics/6.6.0//kubernetes-model-metrics-6.6.0.jar
-kubernetes-model-networking/6.6.0//kubernetes-model-networking-6.6.0.jar
-kubernetes-model-node/6.6.0//kubernetes-model-node-6.6.0.jar
-kubernetes-model-policy/6.6.0//kubernetes-model-policy-6.6.0.jar
-kubernetes-model-rbac/6.6.0//kubernetes-model-rbac-6.6.0.jar
-kubernetes-model-resource/6.6.0//kubernetes-model-resource-6.6.0.jar
-kubernetes-model-scheduling/6.6.0//kubernetes-model-scheduling-6.6.0.jar
-kubernetes-model-storageclass/6.6.0//kubernetes-model-storageclass-6.6.0.jar
+kubernetes-client-api/6.6.1//kubernetes-client-api-6.6.1.jar
+kubernetes-client/6.6.1//kubernetes-client-6.6.1.jar
+kubernetes-httpclient-okhttp/6.6.1//kubernetes-httpclient-okhttp-6.6.1.jar
+kubernetes-model-admissionregistration/6.6.1//kubernetes-model-admissionregistration-6.6.1.jar
+kubernetes-model-apiextensions/6.6.1//kubernetes-model-apiextensions-6.6.1.jar
+kubernetes-model-apps/6.6.1//kubernetes-model-apps-6.6.1.jar
+kubernetes-model-autoscaling/6.6.1//kubernetes-model-autoscaling-6.6.1.jar
+kubernetes-model-batch/6.6.1//kubernetes-model-batch-6.6.1.jar
+kubernetes-model-certificates/6.6.1//kubernetes-model-certificates-6.6.1.jar
+kubernetes-model-common/6.6.1//kubernetes-model-common-6.6.1.jar
+kubernetes-model-coordination/6.6.1//kubernetes-model-coordination-6.6.1.jar
+kubernetes-model-core/6.6.1//kubernetes-model-core-6.6.1.jar
+kubernetes-model-discovery/6.6.1//kubernetes-model-discovery-6.6.1.jar
+kubernetes-model-events/6.6.1//kubernetes-model-events-6.6.1.jar
+kubernetes-model-extensions/6.6.1//kubernetes-model-extensions-6.6.1.jar
+kubernetes-model-flowcontrol/6.6.1//kubernetes-model-flowcontrol-6.6.1.jar
+kubernetes-model-gatewayapi/6.6.1//kubernetes-model-gatewayapi-6.6.1.jar
+kubernetes-model-metrics/6.6.1//kubernetes-model-metrics-6.6.1.jar
+kubernetes-model-networking/6.6.1//kubernetes-model-networking-6.6.1.jar
+kubernetes-model-node/6.6.1//kubernetes-model-node-6.6.1.jar
+kubernetes-model-policy/6.6.1//kubernetes-model-policy-6.6.1.jar
+kubernetes-model-rbac/6.6.1//kubernetes-model-rbac-6.6.1.jar
+kubernetes-model-resource/6.6.1//kubernetes-model-resource-6.6.1.jar

[spark] branch master updated: [SPARK-43138][CORE] Fix ClassNotFoundException during migration

2023-05-11 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 37a0ae3511c [SPARK-43138][CORE] Fix ClassNotFoundException during 
migration
37a0ae3511c is described below

commit 37a0ae3511c9f153537d5928e9938f72763f5464
Author: Emil Ejbyfeldt 
AuthorDate: Thu May 11 08:25:45 2023 -0500

[SPARK-43138][CORE] Fix ClassNotFoundException during migration

### What changes were proposed in this pull request?

This PR fixes an unhandled ClassNotFoundException during RDD block 
decommission migrations.
```
2023-04-08 04:15:11,791 ERROR server.TransportRequestHandler: Error while invoking RpcHandler#receive() on RPC id 6425687122551756860
java.lang.ClassNotFoundException: com.class.from.user.jar.ClassName
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:398)
        at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:71)
        at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2003)
        at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1870)
        at java.base/java.io.ObjectInputStream.readClass(ObjectInputStream.java:1833)
        at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1658)
        at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2496)
        at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2390)
        at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2228)
        at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1687)
        at java.base/java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2496)
        at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2390)
        at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2228)
        at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1687)
        at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:489)
        at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:447)
        at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:87)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:123)
        at org.apache.spark.network.netty.NettyBlockRpcServer.deserializeMetadata(NettyBlockRpcServer.scala:180)
        at org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:119)
        at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:163)
        at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:109)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:140)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:53)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
        at ...
```
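
The failure mode in the trace is generic to `java.io.ObjectInputStream`: 
`resolveClass` consults a fixed classloader, so classes that exist only in 
the user's jar cannot be found. A hypothetical Scala sketch of the general 
remedy, resolving classes through an explicitly supplied loader (illustrative 
only, not this PR's exact patch):

```scala
import java.io.{InputStream, ObjectInputStream, ObjectStreamClass}

// Resolve deserialized classes through a caller-chosen loader, e.g. one that
// can see the user's jar, instead of ObjectInputStream's default loader.
class LoaderAwareObjectInputStream(in: InputStream, loader: ClassLoader)
    extends ObjectInputStream(in) {
  override protected def resolveClass(desc: ObjectStreamClass): Class[_] =
    // initialize = false: only locate the class; defer static initializers.
    Class.forName(desc.getName, false, loader)
}
```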

[spark] branch master updated (bd669a927f0 -> 46251f00b85)

2023-05-11 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from bd669a927f0 [SPARK-43443][SQL] Add benchmark for Timestamp type 
inference when use invalid value
 add 46251f00b85 [SPARK-38467][CORE] Use error classes in 
org.apache.spark.memory

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json | 18 ++
 .../org/apache/spark/memory/StorageMemoryPool.scala  |  3 ++-
 .../apache/spark/memory/UnifiedMemoryManager.scala   | 20 +---
 3 files changed, 33 insertions(+), 8 deletions(-)





[spark] branch master updated: [SPARK-43443][SQL] Add benchmark for Timestamp type inference when use invalid value

2023-05-11 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new bd669a927f0 [SPARK-43443][SQL] Add benchmark for Timestamp type 
inference when use invalid value
bd669a927f0 is described below

commit bd669a927f09f271afd6a1058493c23d8a0e3c04
Author: Hisoka 
AuthorDate: Thu May 11 13:35:25 2023 +0300

[SPARK-43443][SQL] Add benchmark for Timestamp type inference when use 
invalid value

### What changes were proposed in this pull request?

While trying to speed up Timestamp type inference with a format (PRs 
#36562, #41078, #41091), there was no way to judge whether a change actually 
improved inference speed.

So we need a benchmark to measure whether our optimizations of Timestamp 
type inference are useful. We already have a benchmark for valid Timestamp 
values, but no benchmark for invalid values during Timestamp type inference.

### Why are the changes needed?

Add a new benchmark for Timestamp type inference with invalid values, to 
make sure our speed-up PRs work as intended.
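
For intuition, a tiny standalone Scala measurement of the path the new 
benchmark cases exercise: inference over values that always fail to parse, 
which repeatedly hits the exception/fallback path. Illustrative only; this 
does not use Spark's `Benchmark` framework:

```scala
import java.time.LocalDateTime
import java.time.format.{DateTimeFormatter, DateTimeParseException}

object InvalidTimestampSketch extends App {
  val fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
  val rows = Array.fill(100000)("not-a-timestamp") // invalid on every row

  // Inference must attempt a parse (and catch the failure) for each value.
  def looksLikeTimestamp(s: String): Boolean =
    try { LocalDateTime.parse(s, fmt); true }
    catch { case _: DateTimeParseException => false }

  val start = System.nanoTime()
  val matches = rows.count(looksLikeTimestamp)
  println(s"$matches matches in ${(System.nanoTime() - start) / 1e6} ms")
}
```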

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?
Benchmarks are themselves test code.

Closes #41131 from Hisoka-X/add_banchmarks.

Authored-by: Hisoka 
Signed-off-by: Max Gekk 
---
 sql/core/benchmarks/CSVBenchmark-jdk11-results.txt |  95 ++---
 sql/core/benchmarks/CSVBenchmark-jdk17-results.txt |  95 ++---
 sql/core/benchmarks/CSVBenchmark-results.txt   |  95 ++---
 .../benchmarks/JsonBenchmark-jdk11-results.txt | 123 -
 .../benchmarks/JsonBenchmark-jdk17-results.txt | 123 -
 sql/core/benchmarks/JsonBenchmark-results.txt  | 147 +++--
 .../execution/datasources/csv/CSVBenchmark.scala   |  30 +
 .../execution/datasources/json/JsonBenchmark.scala |  26 
 8 files changed, 404 insertions(+), 330 deletions(-)

diff --git a/sql/core/benchmarks/CSVBenchmark-jdk11-results.txt 
b/sql/core/benchmarks/CSVBenchmark-jdk11-results.txt
index ca33c059b3a..0185251877e 100644
--- a/sql/core/benchmarks/CSVBenchmark-jdk11-results.txt
+++ b/sql/core/benchmarks/CSVBenchmark-jdk11-results.txt
@@ -2,66 +2,69 @@
 Benchmark to measure CSV read/write performance
 
-OpenJDK 64-Bit Server VM 11.0.18+10 on Linux 5.15.0-1031-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.18+10 on Linux 5.15.0-1036-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
 Parsing quoted values:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 --------------------------------------------------------------------------------------------------------------------------------
-One quoted string                                         36620          36718         168          0.0      732395.8       1.0X
+One quoted string                                         30782          30948         229          0.0      615635.9       1.0X
 
-OpenJDK 64-Bit Server VM 11.0.18+10 on Linux 5.15.0-1031-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.18+10 on Linux 5.15.0-1036-azure
+Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
 Wide rows with 1000 columns:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 --------------------------------------------------------------------------------------------------------------------------------
-Select 1000 columns                                       86305          86907        1033          0.0       86305.2       1.0X
-Select 100 columns                                        38778          38792          15          0.0       38778.3       2.2X
-Select one column                                         31901          31913          12          0.0       31901.0       2.7X
-count()                                                    6971           7033          61          0.1        6970.9      12.4X
-Select 100 columns, one bad input field                   51175          51195          26          0.0       51174.8       1.7X
-Select 100 columns, corrupt record field                  56219          56283          60          0.0       56219.3       1.5X
+Select 1000 columns                                       74038          74677        1024          0.0       74038.3       1.0X
+Select 100 columns                                        33611          33625          12          0.0       33611.1       2.2X
+Select one column                                         29350          29428          73          0.0       29349.7       2.5X
+count()                                                    4909           4934          26          0.2        4908.8      15.1X

[spark] branch master updated: [SPARK-43449][INFRA] Remove branch-3.2 daily GitHub Action job and conditions

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 8c601ea2cf6 [SPARK-43449][INFRA] Remove branch-3.2 daily GitHub Action 
job and conditions
8c601ea2cf6 is described below

commit 8c601ea2cf6a89e9e879f65ad8ab9ba96f73c616
Author: Dongjoon Hyun 
AuthorDate: Thu May 11 02:38:27 2023 -0700

[SPARK-43449][INFRA] Remove branch-3.2 daily GitHub Action job and 
conditions

### What changes were proposed in this pull request?

This PR aims to do the following.
- Remove Daily GitHub Action job on branch-3.2 to save the community 
resource
  - https://github.com/apache/spark/actions/workflows/build_branch32.yml
- Simplify `build_and_test.yml` by removing `branch-3.2` specific code.

### Why are the changes needed?

Apache Spark 3.2 is EOL.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #41134 from dongjoon-hyun/SPARK-43449.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 .github/workflows/build_and_test.yml | 19 ++
 .github/workflows/build_branch32.yml | 49 
 2 files changed, 8 insertions(+), 60 deletions(-)

diff --git a/.github/workflows/build_and_test.yml 
b/.github/workflows/build_and_test.yml
index d3b634ffa26..4aff1bc9753 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -59,7 +59,7 @@ jobs:
   required: ${{ steps.set-outputs.outputs.required }}
   image_url: >-
    ${{
-      ((inputs.branch == 'branch-3.2' || inputs.branch == 'branch-3.3') && 'dongjoon/apache-spark-github-action-image:20220207')
+      (inputs.branch == 'branch-3.3' && 'dongjoon/apache-spark-github-action-image:20220207')
       || steps.infra-image-outputs.outputs.image_url
     }}
 steps:
@@ -80,15 +80,12 @@ jobs:
   id: set-outputs
   run: |
     if [ -z "${{ inputs.jobs }}" ]; then
-      # is-changed.py is missing in branch-3.2, and it might run in scheduled build, see also SPARK-39517
       pyspark=true; sparkr=true; tpcds=true; docker=true;
-      if [ -f "./dev/is-changed.py" ]; then
-        pyspark_modules=`cd dev && python -c "import sparktestsupport.modules as m; print(','.join(m.name for m in m.all_modules if m.name.startswith('pyspark')))"`
-        pyspark=`./dev/is-changed.py -m $pyspark_modules`
-        sparkr=`./dev/is-changed.py -m sparkr`
-        tpcds=`./dev/is-changed.py -m sql`
-        docker=`./dev/is-changed.py -m docker-integration-tests`
-      fi
+      pyspark_modules=`cd dev && python -c "import sparktestsupport.modules as m; print(','.join(m.name for m in m.all_modules if m.name.startswith('pyspark')))"`
+      pyspark=`./dev/is-changed.py -m $pyspark_modules`
+      sparkr=`./dev/is-changed.py -m sparkr`
+      tpcds=`./dev/is-changed.py -m sql`
+      docker=`./dev/is-changed.py -m docker-integration-tests`
       # 'build', 'scala-213', and 'java-11-17' are always true for now.
       # It does not save significant time and most of PRs trigger the build.
       precondition="
@@ -278,7 +275,7 @@ jobs:
   (fromJson(needs.precondition.outputs.required).pyspark == 'true' ||
   fromJson(needs.precondition.outputs.required).lint == 'true' ||
   fromJson(needs.precondition.outputs.required).sparkr == 'true') &&
-  (inputs.branch != 'branch-3.2' && inputs.branch != 'branch-3.3')
+  (inputs.branch != 'branch-3.3')
 runs-on: ubuntu-latest
 permissions:
   packages: write
@@ -602,7 +599,7 @@ jobs:
 - name: Java linter
   run: ./dev/lint-java
 - name: Spark connect jvm client mima check
-  if: inputs.branch != 'branch-3.2' && inputs.branch != 'branch-3.3'
+  if: inputs.branch != 'branch-3.3'
   run: ./dev/connect-jvm-client-mima-check
 - name: Install Python linter dependencies
   run: |
diff --git a/.github/workflows/build_branch32.yml 
b/.github/workflows/build_branch32.yml
deleted file mode 100644
index 723db45ca37..000
--- a/.github/workflows/build_branch32.yml
+++ /dev/null
@@ -1,49 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-#   http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing,
-# software distributed under the 

[spark] branch master updated: [SPARK-43448][BUILD] Remove dummy dependency `hadoop-openstack`

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e62aab2846e [SPARK-43448][BUILD] Remove dummy dependency 
`hadoop-openstack`
e62aab2846e is described below

commit e62aab2846e591219d058efb1a50f921425f09b2
Author: Cheng Pan 
AuthorDate: Thu May 11 02:03:27 2023 -0700

[SPARK-43448][BUILD] Remove dummy dependency `hadoop-openstack`

### What changes were proposed in this pull request?

Remove the dummy dependency `hadoop-openstack` from Spark binary artifacts.

### Why are the changes needed?

[HADOOP-18442](https://issues.apache.org/jira/browse/HADOOP-18442) removed 
`hadoop-openstack` and temporarily retained a dummy jar for downstream 
projects that consume it.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

Closes #41133 from pan3793/SPARK-43448.

Authored-by: Cheng Pan 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 |  1 -
 hadoop-cloud/pom.xml  | 24 
 2 files changed, 25 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 
b/dev/deps/spark-deps-hadoop-3-hive-2.3
index a0ec39fbf68..0ea96381a3b 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -74,7 +74,6 @@ hadoop-azure/3.3.5//hadoop-azure-3.3.5.jar
 hadoop-client-api/3.3.5//hadoop-client-api-3.3.5.jar
 hadoop-client-runtime/3.3.5//hadoop-client-runtime-3.3.5.jar
 hadoop-cloud-storage/3.3.5//hadoop-cloud-storage-3.3.5.jar
-hadoop-openstack/3.3.5//hadoop-openstack-3.3.5.jar
 hadoop-shaded-guava/1.1.1//hadoop-shaded-guava-1.1.1.jar
 hadoop-yarn-server-web-proxy/3.3.5//hadoop-yarn-server-web-proxy-3.3.5.jar
 hive-beeline/2.3.9//hive-beeline-2.3.9.jar
diff --git a/hadoop-cloud/pom.xml b/hadoop-cloud/pom.xml
index 3ac8c0cec68..e213052dbc1 100644
--- a/hadoop-cloud/pom.xml
+++ b/hadoop-cloud/pom.xml
@@ -111,30 +111,6 @@
     </dependency>
 
-    <dependency>
-      <groupId>org.apache.hadoop</groupId>
-      <artifactId>hadoop-openstack</artifactId>
-      <version>${hadoop.version}</version>
-      <scope>${hadoop.deps.scope}</scope>
-      <exclusions>
-        <exclusion>
-          <groupId>org.apache.hadoop</groupId>
-          <artifactId>hadoop-common</artifactId>
-        </exclusion>
-        <exclusion>
-          <groupId>commons-logging</groupId>
-          <artifactId>commons-logging</artifactId>
-        </exclusion>
-        <exclusion>
-          <groupId>junit</groupId>
-          <artifactId>junit</artifactId>
-        </exclusion>
-        <exclusion>
-          <groupId>org.mockito</groupId>
-          <artifactId>mockito-all</artifactId>
-        </exclusion>
-      </exclusions>
-    </dependency>
-
     <dependency>
       <groupId>com.google.cloud.bigdataoss</groupId>
       <artifactId>gcs-connector</artifactId>





[spark] branch master updated (ebb20a3871e -> 8fad3715ae6)

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from ebb20a3871e [SPARK-43294][BUILD] Upgrade zstd-jni to 1.5.5-2
 add 8fad3715ae6 [SPARK-43446][BUILD] Upgrade Apache Arrow to 12.0.0

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 8 
 pom.xml   | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)





[spark] branch master updated: [SPARK-43294][BUILD] Upgrade zstd-jni to 1.5.5-2

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ebb20a3871e [SPARK-43294][BUILD] Upgrade zstd-jni to 1.5.5-2
ebb20a3871e is described below

commit ebb20a3871e111df141634a8516799d8e52ddb03
Author: yangjie01 
AuthorDate: Thu May 11 01:20:20 2023 -0700

[SPARK-43294][BUILD] Upgrade zstd-jni to 1.5.5-2

### What changes were proposed in this pull request?
This PR aims to upgrade `zstd-jni` from 1.5.5-1 to 1.5.5-2.

### Why are the changes needed?
The new version includes one improvement, `Added support for initialising a 
"dict" from a direct ByteBuffer`:
- https://github.com/luben/zstd-jni/pull/255

Other changes as follows:
- https://github.com/luben/zstd-jni/compare/v1.5.5-1...v1.5.5-2

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass GitHub Actions

Closes #41135 from LuciferYang/SPARK-43294-2.

Authored-by: yangjie01 
Signed-off-by: Dongjoon Hyun 
---
 .../ZStandardBenchmark-jdk11-results.txt   | 32 +++---
 .../ZStandardBenchmark-jdk17-results.txt   | 32 +++---
 core/benchmarks/ZStandardBenchmark-results.txt | 32 +++---
 dev/deps/spark-deps-hadoop-3-hive-2.3  |  2 +-
 pom.xml|  2 +-
 5 files changed, 50 insertions(+), 50 deletions(-)

diff --git a/core/benchmarks/ZStandardBenchmark-jdk11-results.txt 
b/core/benchmarks/ZStandardBenchmark-jdk11-results.txt
index da037d631a2..96aa17e2138 100644
--- a/core/benchmarks/ZStandardBenchmark-jdk11-results.txt
+++ b/core/benchmarks/ZStandardBenchmark-jdk11-results.txt
@@ -2,26 +2,26 @@
 Benchmark ZStandardCompressionCodec
 
-OpenJDK 64-Bit Server VM 11.0.18+10 on Linux 5.15.0-1035-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.18+10 on Linux 5.15.0-1036-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Benchmark ZStandardCompressionCodec:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 --------------------------------------------------------------------------------------------------------------------------------------
-Compression 1 times at level 1 without buffer pool            523            608          96          0.0       52324.2       1.0X
-Compression 1 times at level 2 without buffer pool            597            598           1          0.0       59668.8       0.9X
-Compression 1 times at level 3 without buffer pool            792            794           2          0.0       79185.9       0.7X
-Compression 1 times at level 1 with buffer pool               332            333           1          0.0       33188.3       1.6X
-Compression 1 times at level 2 with buffer pool               398            399           1          0.0       39798.4       1.3X
-Compression 1 times at level 3 with buffer pool               589            590           1          0.0       58927.7       0.9X
+Compression 1 times at level 1 without buffer pool           1074           1076           3          0.0      107421.2       1.0X
+Compression 1 times at level 2 without buffer pool           1056           1056           0          0.0      105604.6       1.0X
+Compression 1 times at level 3 without buffer pool           1297           1298           1          0.0      129748.3       0.8X
+Compression 1 times at level 1 with buffer pool               305            317          13          0.0       30513.4       3.5X
+Compression 1 times at level 2 with buffer pool               386            408          17          0.0       38550.2       2.8X
+Compression 1 times at level 3 with buffer pool               603            605           2          0.0       60250.7       1.8X
 
-OpenJDK 64-Bit Server VM 11.0.18+10 on Linux 5.15.0-1035-azure
-Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
+OpenJDK 64-Bit Server VM 11.0.18+10 on Linux 5.15.0-1036-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Benchmark ZStandardCompressionCodec:                        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------------------------
-Decompression 1 times from level 1 without buffer pool            722            722           1          0.0       72153.7       1.0X
-Decompression 1 times from level 2 without buffer pool            723            724           2          0.0       72298.9       1.0X
-Decompression 1 

[spark] branch master updated: [SPARK-43447][R] Support R 4.3.0

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 96658310cfd [SPARK-43447][R] Support R 4.3.0
96658310cfd is described below

commit 96658310cfde90a517cb6c5c4db211011d8728b0
Author: Dongjoon Hyun 
AuthorDate: Thu May 11 01:01:08 2023 -0700

[SPARK-43447][R] Support R 4.3.0

### What changes were proposed in this pull request?

This PR aims to support R 4.3.0 officially in Apache Spark 3.5.0 by 
upgrading AppVeyor to 4.3.0.

### Why are the changes needed?

R 4.3.0 was released on April 21st, 2023.
- https://stat.ethz.ch/pipermail/r-announce/2023/000691.html

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

I verified locally.
```
$ R --version
R version 4.3.0 (2023-04-21) -- "Already Tomorrow"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

$ build/sbt package -Psparkr

$ R/install-dev.sh; R/run-tests.sh
...
Tests passed.
```

Closes #41132 from dongjoon-hyun/SPARK-43447.

Authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 dev/appveyor-install-dependencies.ps1 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dev/appveyor-install-dependencies.ps1 
b/dev/appveyor-install-dependencies.ps1
index 88090149f5c..4fe44e58e97 100644
--- a/dev/appveyor-install-dependencies.ps1
+++ b/dev/appveyor-install-dependencies.ps1
@@ -129,7 +129,7 @@ $env:PATH = "$env:HADOOP_HOME\bin;" + $env:PATH
 Pop-Location
 
 # == R
-$rVer = "4.2.0"
+$rVer = "4.3.0"
 $rToolsVer = "4.0.2"
 
 InstallR





[spark] branch master updated: [SPARK-43286][SQL][TESTS][FOLLOWUP] Moving `ExpressionImplUtilsSuite` to `catalyst` package

2023-05-11 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 262b5ae6fc8 [SPARK-43286][SQL][TESTS][FOLLOWUP] Moving 
`ExpressionImplUtilsSuite` to `catalyst` package
262b5ae6fc8 is described below

commit 262b5ae6fc8611dae0111f37762aa6180f65ae17
Author: Steve Weis 
AuthorDate: Wed May 10 23:12:12 2023 -0700

[SPARK-43286][SQL][TESTS][FOLLOWUP] Moving `ExpressionImplUtilsSuite` to 
`catalyst` package

### What changes were proposed in this pull request?

`ExpressionImplUtilsSuite` was mistakenly placed in the `core` module, while 
`ExpressionImplUtils` lives under `catalyst`. This change moves the file into 
the same package as `ExpressionImplUtils`.

### Why are the changes needed?

The test suite was committed to the wrong location by mistake.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

This command runs the unit tests:
```
$ build/sbt "catalyst/testOnly 
org.apache.spark.sql.catalyst.expressions.ExpressionImplUtilsSuite"
```

Closes #41128 from sweisdb/SPARK-43286.

Authored-by: Steve Weis 
Signed-off-by: Dongjoon Hyun 
---
 .../apache/spark/sql/catalyst/expressions/ExpressionImplUtilsSuite.scala  | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionImplUtilsSuite.scala
 
b/sql/catalyst/src/test/java/org/apache/spark/sql/catalyst/expressions/ExpressionImplUtilsSuite.scala
similarity index 100%
rename from 
sql/core/src/test/scala/org/apache/spark/sql/catalyst/expressions/ExpressionImplUtilsSuite.scala
rename to 
sql/catalyst/src/test/java/org/apache/spark/sql/catalyst/expressions/ExpressionImplUtilsSuite.scala

