[spark] branch master updated: [SPARK-44268][CORE][TEST][FOLLOWUP] Add test to generate `sql-error-conditions` doc automatic

2023-07-06 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new d4277b8e783 [SPARK-44268][CORE][TEST][FOLLOWUP] Add test to generate 
`sql-error-conditions` doc automatic
d4277b8e783 is described below

commit d4277b8e78347fd4e3163c6218edc4675ebb6db2
Author: Jia Fan 
AuthorDate: Fri Jul 7 11:58:23 2023 +0900

[SPARK-44268][CORE][TEST][FOLLOWUP] Add test to generate 
`sql-error-conditions` doc automatic

### What changes were proposed in this pull request?
This is a follow-up PR for #41813; it changes the test to automatically generate the 
`sql-error-conditions` doc. Like other GOLDEN_FILES tests, the test reports an error 
during the run if the generated doc does not match the checked-in one.
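
For readers unfamiliar with the golden-file convention, the compare-or-regenerate pattern 
looks roughly like the sketch below (illustrative only; the path, helper name, and assertion 
message are not taken from `SparkThrowableSuite`):

```scala
// Minimal sketch of a golden-file check: regenerate the artifact when
// SPARK_GENERATE_GOLDEN_FILES=1 is set, otherwise compare it against the
// checked-in copy and fail on any mismatch.
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object GoldenFileCheckSketch {
  private val regenerate: Boolean =
    sys.env.get("SPARK_GENERATE_GOLDEN_FILES").contains("1")

  def checkOrRegenerate(goldenPath: String, generated: String): Unit = {
    val path = Paths.get(goldenPath)
    if (regenerate) {
      Files.write(path, generated.getBytes(StandardCharsets.UTF_8))
    } else {
      val golden = new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
      assert(golden == generated,
        s"$goldenPath is out of date; regenerate it with SPARK_GENERATE_GOLDEN_FILES=1")
    }
  }
}
```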

### Why are the changes needed?
1. Make sure `error-classes.json` stays in sync with the doc.
2. Make it easier for developers to keep the doc in sync.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added a new test.

Closes #41865 from Hisoka-X/SPARK-44268_error_json_follow_up.

Authored-by: Jia Fan 
Signed-off-by: Hyukjin Kwon 
---
 .../src/main/resources/error/error-classes.json|   2 +-
 .../org/apache/spark/SparkThrowableSuite.scala | 219 +++
 ...sql-error-conditions-as-of-join-error-class.md} |   6 +-
 ...ror-conditions-datatype-mismatch-error-class.md |   5 -
 ...onsistent-behavior-cross-version-error-class.md |  19 --
 ...ns-insert-column-arity-mismatch-error-class.md} |   2 +
 ...ons-insufficient-table-property-error-class.md} |   4 +
 ...ror-conditions-invalid-boundary-error-class.md} |   4 +
 ...onditions-invalid-default-value-error-class.md} |   4 +
 ...-error-conditions-invalid-format-error-class.md |   2 +-
 ...conditions-invalid-inline-table-error-class.md} |   4 +
 ...ns-invalid-lambda-function-call-error-class.md} |   8 +-
 ...s-invalid-limit-like-expression-error-class.md} |   4 +
 ...ditions-invalid-observed-metrics-error-class.md |   2 +
 ...ons-invalid-partition-operation-error-class.md} |   4 +
 ...-error-conditions-invalid-schema-error-class.md |   1 -
 ...r-conditions-invalid-sql-syntax-error-class.md} |  58 ++---
 ...alid-time-travel-timestamp-expr-error-class.md} |   4 +
 ...ions-invalid-write-distribution-error-class.md} |   4 +
 ...ons-malformed-record-in-parsing-error-class.md} |   3 +
 ...r-conditions-missing-attributes-error-class.md} |   3 +
 ...conditions-not-a-constant-string-error-class.md |   2 +
 ...-conditions-not-allowed-in-from-error-class.md} |   4 +
 ...s-not-supported-in-jdbc-catalog-error-class.md} |   2 +
 ...conditions-unsupported-add-file-error-class.md} |   2 +
 ...tions-unsupported-default-value-error-class.md} |   6 +-
 ...r-conditions-unsupported-feature-error-class.md |  10 +-
 ...r-conditions-unsupported-insert-error-class.md} |   2 +
 ...ons-unsupported-merge-condition-error-class.md} |   2 +
 ...onditions-unsupported-overwrite-error-class.md} |   2 +
 docs/sql-error-conditions.md   | 239 ++---
 31 files changed, 406 insertions(+), 227 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-classes.json 
b/common/utils/src/main/resources/error/error-classes.json
index 55fa2878e37..0afd103b565 100644
--- a/common/utils/src/main/resources/error/error-classes.json
+++ b/common/utils/src/main/resources/error/error-classes.json
@@ -1214,7 +1214,7 @@
   },
   "UNEXPECTED_TOKEN" : {
 "message" : [
-  "Found the unexpected  in the format string; the structure of 
the format string must match: [MI|S] [$] [0|9|G|,]* [.|D] [0|9]* [$] [PR|MI|S]."
+  "Found the unexpected  in the format string; the structure of 
the format string must match: `[MI|S]` `[$]` `[0|9|G|,]*` `[.|D]` `[0|9]*` 
`[$]` `[PR|MI|S]`."
 ]
   },
   "WRONG_NUM_DIGIT" : {
diff --git a/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala 
b/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
index 034a782e533..0249cde5488 100644
--- a/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
+++ b/core/src/test/scala/org/apache/spark/SparkThrowableSuite.scala
@@ -20,8 +20,10 @@ package org.apache.spark
 import java.io.File
 import java.nio.charset.StandardCharsets
 import java.nio.file.Files
+import java.util.Locale
 
 import scala.util.Properties.lineSeparator
+import scala.util.matching.Regex
 
 import com.fasterxml.jackson.annotation.JsonInclude.Include
 import com.fasterxml.jackson.core.JsonParser.Feature.STRICT_DUPLICATE_DETECTION
@@ -45,6 +47,12 @@ class SparkThrowableSuite extends SparkFunSuite {
   SPARK_GENERATE_GOLDEN_FILES=1 build/sbt \
 "core/testOnly *SparkThrowableSuite -- -t \"Error classes are 
correctly formatted\""
}}}
+
+   To regenerate the error class document. Run:
+   {{{
+  SPARK_GENERATE_GOLDEN_FILES=1 build/sbt \

[spark] branch master updated: [SPARK-44312][CONNECT][PYTHON] Allow to set a user agent with an environment variable

2023-07-06 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 485e8bc44b4 [SPARK-44312][CONNECT][PYTHON] Allow to set a user agent 
with an environment variable
485e8bc44b4 is described below

commit 485e8bc44b4b688030aba5128f72aa7fd66e080d
Author: Robert Dillitz 
AuthorDate: Fri Jul 7 11:57:35 2023 +0900

[SPARK-44312][CONNECT][PYTHON] Allow to set a user agent with an 
environment variable

### What changes were proposed in this pull request?
Use the `SPARK_CONNECT_USER_AGENT` environment variable as a fallback for the 
prepended user agent string when the user agent is not set explicitly in 
`ChannelBuilder`.

### Why are the changes needed?
Currently one has to specify a custom user agent string in 
`ChannelBuilder`. It would be useful to be able to set this string with an 
environment variable.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manual testing + existing tests.

Closes #41866 from dillitz/SPARK-44312-user-agent-environment.

Authored-by: Robert Dillitz 
Signed-off-by: Hyukjin Kwon 
---
 python/pyspark/sql/connect/client/core.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/python/pyspark/sql/connect/client/core.py 
b/python/pyspark/sql/connect/client/core.py
index f8d304e9ccc..537ab0a6140 100644
--- a/python/pyspark/sql/connect/client/core.py
+++ b/python/pyspark/sql/connect/client/core.py
@@ -296,7 +296,10 @@ class ChannelBuilder:
 or "_SPARK_CONNECT_PYTHON" when not specified.
 The returned value will be percent encoded.
 """
-user_agent = self.params.get(ChannelBuilder.PARAM_USER_AGENT, 
"_SPARK_CONNECT_PYTHON")
+user_agent = self.params.get(
+ChannelBuilder.PARAM_USER_AGENT,
+os.getenv("SPARK_CONNECT_USER_AGENT", "_SPARK_CONNECT_PYTHON"),
+)
 ua_len = len(urllib.parse.quote(user_agent))
 if ua_len > 2048:
 raise SparkConnectException(





[spark] branch master updated (17fac569b4e -> e3c2bf56595)

2023-07-06 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 17fac569b4e [SPARK-43660][CONNECT][PS] Enable `resample` with Spark 
Connect
 add e3c2bf56595 [SPARK-44275][CONNECT] Add configurable retry mechanism to 
Scala Spark Connect

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/connect/client/ArtifactManager.scala |  12 +-
 .../client/CustomSparkConnectBlockingStub.scala|  33 +++-
 .../connect/client/CustomSparkConnectStub.scala|  31 ++--
 .../sql/connect/client/GrpcRetryHandler.scala  | 196 +
 .../sql/connect/client/SparkConnectClient.scala|  23 ++-
 .../spark/sql/connect/client/ArtifactSuite.scala   |   8 +-
 .../connect/client/SparkConnectClientSuite.scala   |  57 +-
 7 files changed, 320 insertions(+), 40 deletions(-)
 copy 
sql/core/src/main/java/org/apache/spark/api/java/function/MapGroupsWithStateFunction.java
 => 
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/CustomSparkConnectStub.scala
 (57%)
 create mode 100644 
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/GrpcRetryHandler.scala
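
The summary above only lists the files touched. As a rough illustration of what a configurable 
retry helper of this kind typically does, here is a hedged sketch of retry with exponential 
backoff driven by a policy and a retryable-error predicate (names, defaults, and structure 
are illustrative, not the actual `GrpcRetryHandler` API):

```scala
// Hedged sketch of retry-with-backoff; the real GrpcRetryHandler added by this
// commit may differ in naming, defaults, and error classification.
case class RetryPolicySketch(
    maxRetries: Int = 15,
    initialBackoffMs: Long = 50,
    backoffMultiplier: Double = 4.0)

object RetrySketch {
  def retry[T](policy: RetryPolicySketch)(canRetry: Throwable => Boolean)(fn: => T): T = {
    var attempt = 0
    var backoffMs = policy.initialBackoffMs
    while (true) {
      try {
        return fn // success: hand the RPC result back to the caller
      } catch {
        case e: Throwable if canRetry(e) && attempt < policy.maxRetries =>
          attempt += 1
          Thread.sleep(backoffMs) // back off before the next attempt
          backoffMs = (backoffMs * policy.backoffMultiplier).toLong
        // non-retryable errors, or exhausted attempts, propagate to the caller
      }
    }
    throw new IllegalStateException("unreachable")
  }
}
```

A caller would then wrap each RPC, e.g. `RetrySketch.retry(RetryPolicySketch())(isRetryable) { stub.analyzePlan(request) }` (again, hypothetical names).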





[spark] branch master updated: [SPARK-43660][CONNECT][PS] Enable `resample` with Spark Connect

2023-07-06 Thread ruifengz
This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 17fac569b4e [SPARK-43660][CONNECT][PS] Enable `resample` with Spark 
Connect
17fac569b4e is described below

commit 17fac569b4e4b569d41f761db07d7bf112801e0c
Author: itholic 
AuthorDate: Fri Jul 7 10:24:40 2023 +0800

[SPARK-43660][CONNECT][PS] Enable `resample` with Spark Connect

### What changes were proposed in this pull request?

This PR proposes to enable `resample` on Spark Connect.

### Why are the changes needed?

To increase pandas API coverage on Spark Connect

### Does this PR introduce _any_ user-facing change?

`resample` is available on Spark Connect.

### How was this patch tested?

Uncommented the skipped tests; the existing CI should pass.

Closes #41877 from itholic/SPARK-43660.

Authored-by: itholic 
Signed-off-by: Ruifeng Zheng 
---
 .../sql/connect/planner/SparkConnectPlanner.scala  |  5 +++
 python/pyspark/pandas/resample.py  | 46 --
 python/pyspark/pandas/spark/functions.py   | 16 
 .../pandas/tests/connect/test_parity_resample.py   |  8 +---
 4 files changed, 56 insertions(+), 19 deletions(-)

diff --git 
a/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala
 
b/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala
index d3090e8b09b..5fd5f7d4c77 100644
--- 
a/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala
+++ 
b/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/SparkConnectPlanner.scala
@@ -1798,6 +1798,11 @@ class SparkConnectPlanner(val sessionHolder: 
SessionHolder) extends Logging {
 val children = fun.getArgumentsList.asScala.map(transformExpression)
 Some(NullIndex(children(0)))
 
+  case "timestampdiff" if fun.getArgumentsCount == 3 =>
+val children = fun.getArgumentsList.asScala.map(transformExpression)
+val unit = extractString(children(0), "unit")
+Some(TimestampDiff(unit, children(1), children(2)))
+
   // ML-specific functions
   case "vector_to_array" if fun.getArgumentsCount == 2 =>
 val expr = transformExpression(fun.getArguments(0))
diff --git a/python/pyspark/pandas/resample.py 
b/python/pyspark/pandas/resample.py
index 1bd4c075342..c6c6019c07e 100644
--- a/python/pyspark/pandas/resample.py
+++ b/python/pyspark/pandas/resample.py
@@ -26,6 +26,7 @@ from typing import (
 Generic,
 List,
 Optional,
+Union,
 )
 
 import numpy as np
@@ -65,6 +66,8 @@ from pyspark.pandas.utils import (
 scol_for,
 verify_temp_column_name,
 )
+from pyspark.sql.utils import is_remote
+from pyspark.pandas.spark.functions import timestampdiff
 
 
 class Resampler(Generic[FrameLike], metaclass=ABCMeta):
@@ -131,8 +134,27 @@ class Resampler(Generic[FrameLike], metaclass=ABCMeta):
 def _agg_columns_scols(self) -> List[Column]:
 return [s.spark.column for s in self._agg_columns]
 
+def get_make_interval(  # type: ignore[return]
+self, unit: str, col: Union[Column, int, float]
+) -> Column:
+if is_remote():
+from pyspark.sql.connect.functions import lit, make_interval
+
+col = col if not isinstance(col, (int, float)) else lit(col)  # 
type: ignore[assignment]
+if unit == "MONTH":
+return make_interval(months=col)  # type: ignore
+if unit == "HOUR":
+return make_interval(hours=col)  # type: ignore
+if unit == "MINUTE":
+return make_interval(mins=col)  # type: ignore
+if unit == "SECOND":
+return make_interval(secs=col)  # type: ignore
+else:
+sql_utils = SparkContext._active_spark_context._jvm.PythonSQLUtils
+col = col._jc if isinstance(col, Column) else F.lit(col)._jc
+return sql_utils.makeInterval(unit, col)
+
 def _bin_time_stamp(self, origin: pd.Timestamp, ts_scol: Column) -> Column:
-sql_utils = SparkContext._active_spark_context._jvm.PythonSQLUtils
 origin_scol = F.lit(origin)
 (rule_code, n) = (self._offset.rule_code, getattr(self._offset, "n"))
 left_closed, right_closed = (self._closed == "left", self._closed == 
"right")
@@ -191,18 +213,18 @@ class Resampler(Generic[FrameLike], metaclass=ABCMeta):
 truncated_ts_scol = F.date_trunc("MONTH", ts_scol)
 edge_label = truncated_ts_scol
 if left_closed and right_labeled:
-edge_label += sql_utils.makeInterval("MONTH", F.lit(n)._jc)
+edge_label += self.get_make_interval("MONTH", n)

[spark] branch master updated: [SPARK-44315][SQL][CONNECT] Move DefinedByConstructorParams to sql/api

2023-07-06 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 259d3e3b589 [SPARK-44315][SQL][CONNECT] Move 
DefinedByConstructorParams to sql/api
259d3e3b589 is described below

commit 259d3e3b589d723dfe2922550a46175bd82fe97c
Author: Rui Wang 
AuthorDate: Thu Jul 6 17:35:41 2023 -0700

[SPARK-44315][SQL][CONNECT] Move DefinedByConstructorParams to sql/api

### What changes were proposed in this pull request?

Move DefinedByConstructorParams to sql/api.

### Why are the changes needed?

This is part of an effort to avoid the Scala client having to depend on Catalyst.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests.
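
For context, a hedged illustration (not from this PR) of what the trait is for: a plain, 
non-case class whose fields are fully determined by its constructor parameters can mix it 
in so that Catalyst's reflection-based encoder derivation can handle it.

```scala
// Illustrative only: opting a non-case class into constructor-parameter-based
// encoder derivation by mixing in DefinedByConstructorParams.
import org.apache.spark.sql.catalyst.DefinedByConstructorParams

class Point(val x: Double, val y: Double) extends DefinedByConstructorParams
```

Hosting the trait in sql/api lets code such as the Scala client reference it without pulling in the Catalyst module.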

Closes #41873 from amaliujia/scsc.

Authored-by: Rui Wang 
Signed-off-by: Wenchen Fan 
---
 .../sql/catalyst/DefinedByConstructorParams.scala  | 25 ++
 .../spark/sql/catalyst/ScalaReflection.scala   |  8 ---
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git 
a/sql/api/src/main/scala/org/apache/spark/sql/catalyst/DefinedByConstructorParams.scala
 
b/sql/api/src/main/scala/org/apache/spark/sql/catalyst/DefinedByConstructorParams.scala
new file mode 100644
index 000..fc6bc2095a8
--- /dev/null
+++ 
b/sql/api/src/main/scala/org/apache/spark/sql/catalyst/DefinedByConstructorParams.scala
@@ -0,0 +1,25 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst
+
+/**
+ * A helper trait to create 
[[org.apache.spark.sql.catalyst.encoders.ExpressionEncoder]]s
+ * for classes whose fields are entirely defined by constructor params but 
should not be
+ * case classes.
+ */
+trait DefinedByConstructorParams
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
index d67ba455438..b0588fc7044 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
@@ -45,14 +45,6 @@ import org.apache.spark.sql.types._
 import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String}
 
 
-/**
- * A helper trait to create 
[[org.apache.spark.sql.catalyst.encoders.ExpressionEncoder]]s
- * for classes whose fields are entirely defined by constructor params but 
should not be
- * case classes.
- */
-trait DefinedByConstructorParams
-
-
 private[catalyst] object ScalaSubtypeLock
 
 





[spark] branch master updated: [SPARK-43321][CONNECT] Dataset#Joinwith

2023-07-06 Thread hvanhovell
This is an automated email from the ASF dual-hosted git repository.

hvanhovell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 9c7978d80b8 [SPARK-43321][CONNECT] Dataset#Joinwith
9c7978d80b8 is described below

commit 9c7978d80b8a95bd7fcc26769eea581849000862
Author: Zhen Li 
AuthorDate: Thu Jul 6 17:42:53 2023 -0400

[SPARK-43321][CONNECT] Dataset#Joinwith

### What changes were proposed in this pull request?
Implement the missing `joinWith` method on top of the Join relation operation.

`joinWith` adds `left` and `right` struct type info to the Join relation proto.
### Why are the changes needed?
This Dataset API was missing.

### Does this PR introduce _any_ user-facing change?
Yes. Added the missing `Dataset#joinWith` method.

### How was this patch tested?
E2E tests.
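
As a usage sketch (assuming a `SparkSession` named `spark` and the two case classes below; 
none of this is taken from the PR's test code), `joinWith` keeps both sides as typed objects 
instead of flattening them into columns:

```scala
// Hypothetical joinWith usage; result rows are Tuple2s nested under columns `_1` and `_2`.
// Assumes a SparkSession named `spark` is already in scope.
import org.apache.spark.sql.Dataset
import spark.implicits._

case class Person(name: String, deptId: Int)
case class Dept(id: Int, name: String)

val people: Dataset[Person] = Seq(Person("Alice", 1), Person("Bob", 2)).toDS()
val depts: Dataset[Dept] = Seq(Dept(1, "Eng"), Dept(2, "Sales")).toDS()

val joined: Dataset[(Person, Dept)] =
  people.joinWith(depts, people.col("deptId") === depts.col("id"), "inner")
```

As the diff below shows, semi and anti join types are rejected for `joinWith`, since those joins only return rows from the left side and cannot produce pairs.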

Closes #40997 from zhenlineo/joinwith.

Authored-by: Zhen Li 
Signed-off-by: Herman van Hovell 
---
 .../main/scala/org/apache/spark/sql/Dataset.scala  |  85 +++-
 .../spark/sql/connect/client/SparkResult.scala |  34 +++-
 .../org/apache/spark/sql/ClientE2ETestSuite.scala  | 193 ++
 .../CheckConnectJvmClientCompatibility.scala   |   1 -
 .../main/protobuf/spark/connect/relations.proto|  10 +
 .../sql/connect/planner/SparkConnectPlanner.scala  |  24 ++-
 python/pyspark/sql/connect/proto/relations_pb2.py  | 221 +++--
 python/pyspark/sql/connect/proto/relations_pb2.pyi |  48 -
 .../sql/catalyst/encoders/AgnosticEncoder.scala|  44 ++--
 .../spark/sql/catalyst/plans/logical/object.scala  | 104 +-
 .../spark/sql/errors/QueryCompilationErrors.scala  |  11 +-
 .../main/scala/org/apache/spark/sql/Dataset.scala  | 104 +-
 12 files changed, 639 insertions(+), 240 deletions(-)

diff --git 
a/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala
 
b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala
index 2ea3169486b..4fa5c0b9641 100644
--- 
a/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala
+++ 
b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala
@@ -20,6 +20,7 @@ import java.util.{Collections, Locale}
 
 import scala.collection.JavaConverters._
 import scala.collection.mutable
+import scala.reflect.ClassTag
 import scala.util.control.NonFatal
 
 import org.apache.spark.SparkException
@@ -568,7 +569,7 @@ class Dataset[T] private[sql] (
 }
   }
 
-  private def toJoinType(name: String): proto.Join.JoinType = {
+  private def toJoinType(name: String, skipSemiAnti: Boolean = false): 
proto.Join.JoinType = {
 name.trim.toLowerCase(Locale.ROOT) match {
   case "inner" =>
 proto.Join.JoinType.JOIN_TYPE_INNER
@@ -580,12 +581,12 @@ class Dataset[T] private[sql] (
 proto.Join.JoinType.JOIN_TYPE_LEFT_OUTER
   case "right" | "rightouter" | "right_outer" =>
 proto.Join.JoinType.JOIN_TYPE_RIGHT_OUTER
-  case "semi" | "leftsemi" | "left_semi" =>
+  case "semi" | "leftsemi" | "left_semi" if !skipSemiAnti =>
 proto.Join.JoinType.JOIN_TYPE_LEFT_SEMI
-  case "anti" | "leftanti" | "left_anti" =>
+  case "anti" | "leftanti" | "left_anti" if !skipSemiAnti =>
 proto.Join.JoinType.JOIN_TYPE_LEFT_ANTI
-  case _ =>
-throw new IllegalArgumentException(s"Unsupported join type 
`joinType`.")
+  case e =>
+throw new IllegalArgumentException(s"Unsupported join type '$e'.")
 }
   }
 
@@ -835,6 +836,80 @@ class Dataset[T] private[sql] (
 }
   }
 
+  /**
+   * Joins this Dataset returning a `Tuple2` for each pair where `condition` 
evaluates to true.
+   *
+   * This is similar to the relation `join` function with one important 
difference in the result
+   * schema. Since `joinWith` preserves objects present on either side of the 
join, the result
+   * schema is similarly nested into a tuple under the column names `_1` and 
`_2`.
+   *
+   * This type of join can be useful both for preserving type-safety with the 
original object
+   * types as well as working with relational data where either side of the 
join has column names
+   * in common.
+   *
+   * @param other
+   *   Right side of the join.
+   * @param condition
+   *   Join expression.
+   * @param joinType
+   *   Type of join to perform. Default `inner`. Must be one of: `inner`, 
`cross`, `outer`,
+   *   `full`, `fullouter`,`full_outer`, `left`, `leftouter`, `left_outer`, 
`right`, `rightouter`,
+   *   `right_outer`.
+   *
+   * @group typedrel
+   * @since 3.5.0
+   */
+  def joinWith[U](other: Dataset[U], condition: Column, joinType: String): 
Dataset[(T, U)] = {
+val joinTypeValue = toJoinType(joinType, skipSemiAnti = true)
+val (leftNullable, rightNullable) = joinTypeValue match {
+  case proto.Join.JoinType.JOIN_TYPE_INNER | 
proto.Join.JoinType.J

[spark] branch master updated: [SPARK-44316][BUILD] Upgrade Jersey to 2.40

2023-07-06 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new f6e0b3906d5 [SPARK-44316][BUILD] Upgrade Jersey to 2.40
f6e0b3906d5 is described below

commit f6e0b3906d533ab719f2423bd136d79215bfa315
Author: panbingkun 
AuthorDate: Thu Jul 6 09:19:10 2023 -0700

[SPARK-44316][BUILD] Upgrade Jersey to 2.40

### What changes were proposed in this pull request?
The PR aims to upgrade Jersey from 2.36 to 2.40.

### Why are the changes needed?
1. This version adopts ASM 9.5, which Spark also currently uses:
[Adopt ASM 9.5](https://github.com/eclipse-ee4j/jersey/pull/5305)

2. It also fixes some bugs, e.g.:
[Fix possible NPE in netty client](https://github.com/eclipse-ee4j/jersey/pull/5330)
[Get media type fix](https://github.com/eclipse-ee4j/jersey/pull/5282)

3. Security vulnerability fix:
[CVE for dependency jackson-databind](https://github.com/eclipse-ee4j/jersey/issues/5225)

4. Full release notes:
https://github.com/eclipse-ee4j/jersey/releases/tag/2.40
https://github.com/eclipse-ee4j/jersey/releases/tag/2.39
https://github.com/eclipse-ee4j/jersey/releases/tag/2.38
https://github.com/eclipse-ee4j/jersey/releases/tag/2.37

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA.

Closes #41874 from panbingkun/SPARK-44316.

Authored-by: panbingkun 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 14 +++---
 pom.xml   |  6 +-
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 
b/dev/deps/spark-deps-hadoop-3-hive-2.3
index 1b91686ed4d..663d4441ed8 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -112,19 +112,19 @@ 
jakarta.validation-api/2.0.2//jakarta.validation-api-2.0.2.jar
 jakarta.ws.rs-api/2.1.6//jakarta.ws.rs-api-2.1.6.jar
 jakarta.xml.bind-api/2.3.2//jakarta.xml.bind-api-2.3.2.jar
 janino/3.1.9//janino-3.1.9.jar
-javassist/3.25.0-GA//javassist-3.25.0-GA.jar
+javassist/3.29.2-GA//javassist-3.29.2-GA.jar
 javax.jdo/3.2.0-m3//javax.jdo-3.2.0-m3.jar
 javolution/5.5.1//javolution-5.5.1.jar
 jaxb-runtime/2.3.2//jaxb-runtime-2.3.2.jar
 jcl-over-slf4j/2.0.7//jcl-over-slf4j-2.0.7.jar
 jdo-api/3.0.1//jdo-api-3.0.1.jar
 jdom2/2.0.6//jdom2-2.0.6.jar
-jersey-client/2.36//jersey-client-2.36.jar
-jersey-common/2.36//jersey-common-2.36.jar
-jersey-container-servlet-core/2.36//jersey-container-servlet-core-2.36.jar
-jersey-container-servlet/2.36//jersey-container-servlet-2.36.jar
-jersey-hk2/2.36//jersey-hk2-2.36.jar
-jersey-server/2.36//jersey-server-2.36.jar
+jersey-client/2.40//jersey-client-2.40.jar
+jersey-common/2.40//jersey-common-2.40.jar
+jersey-container-servlet-core/2.40//jersey-container-servlet-core-2.40.jar
+jersey-container-servlet/2.40//jersey-container-servlet-2.40.jar
+jersey-hk2/2.40//jersey-hk2-2.40.jar
+jersey-server/2.40//jersey-server-2.40.jar
 jettison/1.5.4//jettison-1.5.4.jar
 jetty-util-ajax/9.4.51.v20230217//jetty-util-ajax-9.4.51.v20230217.jar
 jetty-util/9.4.51.v20230217//jetty-util-9.4.51.v20230217.jar
diff --git a/pom.xml b/pom.xml
index bc14cdd584e..96375ea904d 100644
--- a/pom.xml
+++ b/pom.xml
@@ -196,7 +196,11 @@
 4.1.17
 14.0.1
 3.1.9
-2.36
+
+2.40
 2.12.5
 3.5.2
 3.0.0





[spark] branch master updated: [SPARK-44314][BUILD][CORE][TESTS] Add a new checkstyle rule to prohibit the use of `@Test(expected = ExpectedException.class)`

2023-07-06 Thread yangjie01
This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new d12bec39cb1 [SPARK-44314][BUILD][CORE][TESTS] Add a new checkstyle 
rule to prohibit the use of `@Test(expected = ExpectedException.class)`
d12bec39cb1 is described below

commit d12bec39cb141fbfab24d158022b327441d17945
Author: yangjie01 
AuthorDate: Thu Jul 6 22:17:17 2023 +0800

[SPARK-44314][BUILD][CORE][TESTS] Add a new checkstyle rule to prohibit the 
use of `@Test(expected = ExpectedException.class)`

### What changes were proposed in this pull request?
This PR adds a new Java checkstyle rule

```

  
  

```

to prohibit the use of `@Test(expected = ExpectedException.class)`.

### Why are the changes needed?
Refer to https://github.com/junit-team/junit4/wiki/Exception-testing

JUnit 4 does not recommend specifying the expected exception via the `@Test` annotation:

```
The expected parameter should be used with care. The above test will pass 
if any code in the method throws IndexOutOfBoundsException. Using the method 
you also cannot test the value of the message in the exception, or the state of 
a domain object after the exception has been thrown.

For these reasons, the previous approaches are recommended.
```

Instead, using the `assertThrows` method is recommended.
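
The diff below shows the Java-side migration. For reference, Spark's Scala suites express 
the same idea with ScalaTest's `intercept` (a hedged illustration, not part of this PR):

```scala
// Illustrative only: the ScalaTest counterpart of JUnit's assertThrows,
// capturing the exception so its message can be asserted on.
import org.scalatest.funsuite.AnyFunSuite

class ExceptionTestingExampleSuite extends AnyFunSuite {
  test("failure surfaces a descriptive message") {
    val e = intercept[RuntimeException] {
      throw new RuntimeException("merged block meta is empty") // stand-in for the real call
    }
    assert(e.getMessage.contains("is empty"))
  }
}
```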

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Passed GitHub Actions.
- Manually checked:

Ran `dev/lint-java` without changing `RemoteBlockPushResolverSuite.java`; the new rule reports:

```
Checkstyle checks failed at following occurrences:
[ERROR] 
src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java:[284]
 (regexp) RegexpSinglelineJava: Please use the `assertThrows` method to test 
for exceptions.
[ERROR] 
src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java:[301]
 (regexp) RegexpSinglelineJava: Please use the `assertThrows` method to test 
for exceptions.
```

Closes #41872 from LuciferYang/SPARK-44314.

Authored-by: yangjie01 
Signed-off-by: yangjie01 
---
 .../shuffle/RemoteBlockPushResolverSuite.java  | 26 +-
 dev/checkstyle.xml |  4 
 2 files changed, 14 insertions(+), 16 deletions(-)

diff --git 
a/common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java
 
b/common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java
index 0847121b0cc..e0b3315aad1 100644
--- 
a/common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java
+++ 
b/common/network-shuffle/src/test/java/org/apache/spark/network/shuffle/RemoteBlockPushResolverSuite.java
@@ -281,7 +281,7 @@ public class RemoteBlockPushResolverSuite {
 verifyMetrics(4, 0, 0, 0, 0, 0, 4);
   }
 
-  @Test(expected = RuntimeException.class)
+  @Test
   public void testFailureAfterData() throws IOException {
 StreamCallbackWithID stream =
   pushResolver.receiveBlockDataAsStream(
@@ -289,16 +289,13 @@ public class RemoteBlockPushResolverSuite {
 stream.onData(stream.getID(), ByteBuffer.wrap(new byte[4]));
 stream.onFailure(stream.getID(), new RuntimeException("Forced Failure"));
 pushResolver.finalizeShuffleMerge(new FinalizeShuffleMerge(TEST_APP, 
NO_ATTEMPT_ID, 0, 0));
-try {
-  pushResolver.getMergedBlockMeta(TEST_APP, 0, 0, 0);
-} catch (RuntimeException e) {
-  assertTrue(e.getMessage().contains("is empty"));
-  verifyMetrics(4, 0, 0, 0, 0, 0, 4);
-  throw e;
-}
+RuntimeException e = assertThrows(RuntimeException.class,
+  () -> pushResolver.getMergedBlockMeta(TEST_APP, 0, 0, 0));
+assertTrue(e.getMessage().contains("is empty"));
+verifyMetrics(4, 0, 0, 0, 0, 0, 4);
   }
 
-  @Test(expected = RuntimeException.class)
+  @Test
   public void testFailureAfterMultipleDataBlocks() throws IOException {
 StreamCallbackWithID stream =
   pushResolver.receiveBlockDataAsStream(
@@ -308,13 +305,10 @@ public class RemoteBlockPushResolverSuite {
 stream.onData(stream.getID(), ByteBuffer.wrap(new byte[4]));
 stream.onFailure(stream.getID(), new RuntimeException("Forced Failure"));
 pushResolver.finalizeShuffleMerge(new FinalizeShuffleMerge(TEST_APP, 
NO_ATTEMPT_ID, 0, 0));
-try {
-  pushResolver.getMergedBlockMeta(TEST_APP, 0, 0, 0);
-} catch (RuntimeException e) {
-  assertTrue(e.getMessage().contains("is empty"));
-  verifyMetrics(9, 0, 0, 0, 0, 0, 9);
-  throw e;
-}
+RuntimeException e = assertThrows(RuntimeException.class,
+  () -> pushResolver.getMergedBlockMeta(TEST_APP, 0, 0, 0));
+asse

[spark] branch master updated (1adf2866915 -> 7bd13530840)

2023-07-06 Thread hvanhovell
This is an automated email from the ASF dual-hosted git repository.

hvanhovell pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 1adf2866915 [SPARK-44303][SQL] Assign names to the error class 
_LEGACY_ERROR_TEMP_[2320-2324]
 add 7bd13530840 [SPARK-44283][CONNECT] Move Origin to SQL/API

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/trees/SQLQueryContext.scala |  0
 .../apache/spark/sql/catalyst/trees/origin.scala   | 72 ++
 .../apache/spark/sql/catalyst/trees/TreeNode.scala | 54 
 3 files changed, 72 insertions(+), 54 deletions(-)
 rename sql/{catalyst => 
api}/src/main/scala/org/apache/spark/sql/catalyst/trees/SQLQueryContext.scala 
(100%)
 create mode 100644 
sql/api/src/main/scala/org/apache/spark/sql/catalyst/trees/origin.scala





[spark] branch master updated (1fbb94b87c0 -> 1adf2866915)

2023-07-06 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 1fbb94b87c0 [SPARK-44284][CONNECT] Create simple conf system for 
sql/api
 add 1adf2866915 [SPARK-44303][SQL] Assign names to the error class 
_LEGACY_ERROR_TEMP_[2320-2324]

No new revisions were added by this update.

Summary of changes:
 .../src/main/resources/error/error-classes.json| 50 ++--
 .../connect/planner/SparkConnectProtoSuite.scala   |  8 +-
 .../org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala  | 36 +++--
 ...ditions-invalid-observed-metrics-error-class.md | 12 +++
 docs/sql-error-conditions.md   | 12 +++
 .../sql/catalyst/analysis/CheckAnalysis.scala  | 21 +++--
 .../sql/catalyst/analysis/AnalysisSuite.scala  | 27 +--
 .../spark/sql/connector/AlterTableTests.scala  | 92 +-
 .../connector/V2CommandsCaseSensitivitySuite.scala | 34 +++-
 .../v2/jdbc/JDBCTableCatalogSuite.scala| 36 +++--
 10 files changed, 243 insertions(+), 85 deletions(-)





[spark] branch master updated: [SPARK-44284][CONNECT] Create simple conf system for sql/api

2023-07-06 Thread hvanhovell
This is an automated email from the ASF dual-hosted git repository.

hvanhovell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1fbb94b87c0 [SPARK-44284][CONNECT] Create simple conf system for 
sql/api
1fbb94b87c0 is described below

commit 1fbb94b87c0b10b6f2ded93f0c7eb253d086887e
Author: Herman van Hovell 
AuthorDate: Thu Jul 6 08:22:18 2023 -0400

[SPARK-44284][CONNECT] Create simple conf system for sql/api

### What changes were proposed in this pull request?
This PR introduces a configuration system for classes that are in sql/api.

### Why are the changes needed?
We are moving a number of components into sql/api that rely on confs being set 
when used with sql/core. The conf system added here gives us the flexibility to 
do so.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing tests.
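
A rough usage sketch (illustrative; `SqlApiConf` is `private[sql]`, so callers live under 
the `org.apache.spark.sql` package, and the real getter binding happens inside `SQLConf`):

```scala
// Hypothetical caller in a package under org.apache.spark.sql (illustrative name).
package org.apache.spark.sql.example

import org.apache.spark.sql.SqlApiConf

object TruncationSketch {
  // Reads the currently bound conf: with sql/core on the classpath this resolves
  // to the active SQLConf, otherwise to DefaultSqlApiConf (ansiEnabled = false, etc.).
  def maxFields: Int = SqlApiConf.get.maxToStringFields
}
```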

Closes #41838 from hvanhovell/SPARK-44284.

Lead-authored-by: Herman van Hovell 
Co-authored-by: Herman van Hovell 
Signed-off-by: Herman van Hovell 
---
 .../scala/org/apache/spark/sql/SqlApiConf.scala| 63 ++
 .../spark/sql/catalyst/types/DataTypeUtils.scala   |  4 +-
 .../org/apache/spark/sql/internal/SQLConf.scala| 14 +++--
 .../org/apache/spark/sql/types/StructType.scala|  4 +-
 4 files changed, 77 insertions(+), 8 deletions(-)

diff --git a/sql/api/src/main/scala/org/apache/spark/sql/SqlApiConf.scala 
b/sql/api/src/main/scala/org/apache/spark/sql/SqlApiConf.scala
new file mode 100644
index 000..07943683ab6
--- /dev/null
+++ b/sql/api/src/main/scala/org/apache/spark/sql/SqlApiConf.scala
@@ -0,0 +1,63 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql
+
+import java.util.concurrent.atomic.AtomicReference
+
+import scala.util.Try
+
+import org.apache.spark.util.SparkClassUtils
+
+/**
+ * Configuration for all objects that are placed in the `sql/api` project. The 
normal way of
+ * accessing this class is through `SqlApiConf.get`. If this code is being 
used with sql/core
+ * then its values are bound to the currently set SQLConf. With Spark Connect, 
it will default to
+ * hardcoded values.
+ */
+private[sql] trait SqlApiConf {
+  def ansiEnabled: Boolean
+  def caseSensitiveAnalysis: Boolean
+  def maxToStringFields: Int
+}
+
+private[sql] object SqlApiConf {
+  /**
+   * Defines a getter that returns the [[SqlApiConf]] within scope.
+   */
+  private val confGetter = new AtomicReference[() => SqlApiConf](() => 
DefaultSqlApiConf)
+
+  /**
+   * Sets the active config getter.
+   */
+  private[sql] def setConfGetter(getter: () => SqlApiConf): Unit = {
+confGetter.set(getter)
+  }
+
+  def get: SqlApiConf = confGetter.get()()
+
+  // Force load SQLConf. This will trigger the installation of a confGetter 
that points to SQLConf.
+  Try(SparkClassUtils.classForName("org.apache.spark.sql.internal.SQLConf$"))
+}
+
+/**
+ * Defaults configurations used when no other [[SqlApiConf]] getter is set.
+ */
+private[sql] object DefaultSqlApiConf extends SqlApiConf {
+  override def ansiEnabled: Boolean = false
+  override def caseSensitiveAnalysis: Boolean = false
+  override def maxToStringFields: Int = 50
+}
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/DataTypeUtils.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/DataTypeUtils.scala
index 0d2d6c0262c..da0607e0920 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/DataTypeUtils.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/DataTypeUtils.scala
@@ -16,9 +16,9 @@
  */
 package org.apache.spark.sql.catalyst.types
 
+import org.apache.spark.sql.SqlApiConf
 import org.apache.spark.sql.catalyst.analysis.Resolver
 import org.apache.spark.sql.catalyst.expressions.{AttributeReference, Cast, 
Literal}
-import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.internal.SQLConf.StoreAssignmentPolicy
 import org.apache.spark.sql.internal.SQLConf.StoreAssignmentPolicy.{ANSI, 
STRICT}
 impor

[spark] branch master updated: Revert "[SPARK-43851][SQL] Support LCA in grouping expressions"

2023-07-06 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new a68e362dca1 Revert "[SPARK-43851][SQL] Support LCA in grouping 
expressions"
a68e362dca1 is described below

commit a68e362dca10f1c0173fbe51bf321428378e4602
Author: Jia Fan 
AuthorDate: Thu Jul 6 15:20:38 2023 +0300

Revert "[SPARK-43851][SQL] Support LCA in grouping expressions"

### What changes were proposed in this pull request?
This reverts commit 9353d67f9290bae1e7d7e16a2caf5256cc4e2f92.

After discussion in #41817, we should revert LCA support in grouping expressions 
because the current solution has problems.

### Why are the changes needed?
This is a revert PR.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing tests.
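
For context, a query roughly of the following shape is what the reinstated check rejects 
(hypothetical table and column names; assumes a `SparkSession` named `spark`):

```scala
// Hypothetical example of a lateral column alias referenced through a GROUP BY alias.
spark.sql("""
  SELECT salary * 2 AS double_salary, double_salary + bonus AS total_pay
  FROM employees
  GROUP BY double_salary, total_pay
""")
// `total_pay` is resolved via its GROUP BY alias, but its definition refers to the
// lateral column alias `double_salary`, so this is expected to raise
// UNSUPPORTED_FEATURE.LATERAL_COLUMN_ALIAS_IN_GROUP_BY again after the revert.
```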

Closes #41869 from Hisoka-X/SPARK-43851_revert.

Authored-by: Jia Fan 
Signed-off-by: Max Gekk 
---
 .../src/main/resources/error/error-classes.json|  5 +
 ...r-conditions-unsupported-feature-error-class.md |  4 
 .../analysis/ResolveReferencesInAggregate.scala| 22 ++
 .../column-resolution-aggregate.sql.out| 26 +-
 .../results/column-resolution-aggregate.sql.out| 16 +
 5 files changed, 44 insertions(+), 29 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-classes.json 
b/common/utils/src/main/resources/error/error-classes.json
index 44bec5e8ced..a3b12022b66 100644
--- a/common/utils/src/main/resources/error/error-classes.json
+++ b/common/utils/src/main/resources/error/error-classes.json
@@ -2613,6 +2613,11 @@
   "Referencing lateral column alias  in the aggregate query both 
with window expressions and with having clause. Please rewrite the aggregate 
query by removing the having clause or removing lateral alias reference in the 
SELECT list."
 ]
   },
+  "LATERAL_COLUMN_ALIAS_IN_GROUP_BY" : {
+"message" : [
+  "Referencing a lateral column alias via GROUP BY alias/ALL is not 
supported yet."
+]
+  },
   "LATERAL_COLUMN_ALIAS_IN_WINDOW" : {
 "message" : [
   "Referencing a lateral column alias  in window expression 
."
diff --git a/docs/sql-error-conditions-unsupported-feature-error-class.md 
b/docs/sql-error-conditions-unsupported-feature-error-class.md
index 25f09118f74..a41502b609a 100644
--- a/docs/sql-error-conditions-unsupported-feature-error-class.md
+++ b/docs/sql-error-conditions-unsupported-feature-error-class.md
@@ -85,6 +85,10 @@ Referencing a lateral column alias `` in the aggregate 
function ``
 
 Referencing lateral column alias `` in the aggregate query both with 
window expressions and with having clause. Please rewrite the aggregate query 
by removing the having clause or removing lateral alias reference in the SELECT 
list.
 
+## LATERAL_COLUMN_ALIAS_IN_GROUP_BY
+
+Referencing a lateral column alias via GROUP BY alias/ALL is not supported yet.
+
 ## LATERAL_COLUMN_ALIAS_IN_WINDOW
 
 Referencing a lateral column alias `` in window expression ``.
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveReferencesInAggregate.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveReferencesInAggregate.scala
index 41bcb337c67..09ae87b071f 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveReferencesInAggregate.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveReferencesInAggregate.scala
@@ -17,8 +17,9 @@
 
 package org.apache.spark.sql.catalyst.analysis
 
+import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.SQLConfHelper
-import org.apache.spark.sql.catalyst.expressions.{AliasHelper, Attribute, 
Expression, LateralColumnAliasReference, NamedExpression}
+import org.apache.spark.sql.catalyst.expressions.{AliasHelper, Attribute, 
Expression, NamedExpression}
 import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression
 import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, AppendColumns, 
LogicalPlan}
 import 
org.apache.spark.sql.catalyst.trees.TreePattern.{LATERAL_COLUMN_ALIAS_REFERENCE,
 UNRESOLVED_ATTRIBUTE}
@@ -73,6 +74,12 @@ object ResolveReferencesInAggregate extends SQLConfHelper
 resolvedAggExprsWithOuter,
 resolveGroupByAlias(resolvedAggExprsWithOuter, 
resolvedGroupExprsNoOuter)
   ).map(resolveOuterRef)
+  // TODO: currently we don't support LCA in `groupingExpressions` yet.
+  if (resolved.exists(_.containsPattern(LATERAL_COLUMN_ALIAS_REFERENCE))) {
+throw new AnalysisException(
+  errorClass = "UNSUPPORTED_FEATURE.LATERAL_COLUMN_ALIAS_IN_GROUP_BY",
+  messageParameters = Map.empty)
+  }
  

[spark] branch master updated: [SPARK-44299][SQL] Assign names to the error class _LEGACY_ERROR_TEMP_227[4-6,8]

2023-07-06 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 5d840eb4553 [SPARK-44299][SQL] Assign names to the error class 
_LEGACY_ERROR_TEMP_227[4-6,8]
5d840eb4553 is described below

commit 5d840eb455350ef3f6235a031a1689bf4a51007d
Author: panbingkun 
AuthorDate: Thu Jul 6 10:08:45 2023 +0300

[SPARK-44299][SQL] Assign names to the error class 
_LEGACY_ERROR_TEMP_227[4-6,8]

### What changes were proposed in this pull request?
The PR aims to assign names to the following error classes:
- _LEGACY_ERROR_TEMP_2274 => UNSUPPORTED_FEATURE.REPLACE_NESTED_COLUMN
- _LEGACY_ERROR_TEMP_2275 => CANNOT_INVOKE_IN_TRANSFORMATIONS
- _LEGACY_ERROR_TEMP_2276 => UNSUPPORTED_FEATURE.HIVE_WITH_ANSI_INTERVALS
- _LEGACY_ERROR_TEMP_2278 => INVALID_FORMAT.MISMATCH_INPUT

### Why are the changes needed?
The changes improve the error framework.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
- Updated and added new UTs.
- Manually tested.
- Passed GA.
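
As an illustration of one of the renames (hedged; the exact message and parameters come from 
`error-classes.json`), a `to_number` call whose input string does not match the number format 
is the kind of call expected to surface `INVALID_FORMAT.MISMATCH_INPUT` rather than 
`_LEGACY_ERROR_TEMP_2278`:

```scala
// Hypothetical repro, assuming a SparkSession named `spark`; the input contains
// letters, which the '999' digit format cannot parse.
spark.sql("SELECT to_number('abc', '999')").show()
// Expected: an INVALID_FORMAT.MISMATCH_INPUT error, formerly _LEGACY_ERROR_TEMP_2278.
```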

Closes #41858 from panbingkun/SPARK-44299.

Authored-by: panbingkun 
Signed-off-by: Max Gekk 
---
 .../src/main/resources/error/error-classes.json| 40 +++---
 ...-error-conditions-invalid-format-error-class.md |  4 +++
 ...r-conditions-unsupported-feature-error-class.md |  8 +
 docs/sql-error-conditions.md   |  6 
 .../spark/sql/catalyst/util/ToNumberParser.scala   |  4 +--
 .../spark/sql/errors/QueryExecutionErrors.scala| 20 +--
 .../expressions/StringExpressionsSuite.scala   |  9 +++--
 .../apache/spark/sql/execution/command/ddl.scala   |  2 +-
 .../sql-tests/results/postgreSQL/numeric.sql.out   | 10 +++---
 .../results/postgreSQL/numeric.sql.out.java21  | 10 +++---
 .../apache/spark/sql/DataFrameFunctionsSuite.scala | 13 +++
 .../spark/sql/DataFrameNaFunctionsSuite.scala  | 12 ---
 .../spark/sql/hive/execution/HiveDDLSuite.scala|  2 +-
 .../command/AlterTableAddColumnsSuite.scala| 13 ---
 14 files changed, 101 insertions(+), 52 deletions(-)

diff --git a/common/utils/src/main/resources/error/error-classes.json 
b/common/utils/src/main/resources/error/error-classes.json
index 8bdb02470ef..44bec5e8ced 100644
--- a/common/utils/src/main/resources/error/error-classes.json
+++ b/common/utils/src/main/resources/error/error-classes.json
@@ -128,6 +128,11 @@
 ],
 "sqlState" : "22546"
   },
+  "CANNOT_INVOKE_IN_TRANSFORMATIONS" : {
+"message" : [
+  "Dataset transformations and actions can only be invoked by the driver, 
not inside of other Dataset transformations; for example, dataset1.map(x => 
dataset2.values.count() * x) is invalid because the values transformation and 
count action cannot be performed inside of the dataset1.map transformation. For 
more information, see SPARK-28702."
+]
+  },
   "CANNOT_LOAD_FUNCTION_CLASS" : {
 "message" : [
   "Cannot load class  when registering the function 
, please make sure it is on the classpath."
@@ -1192,6 +1197,11 @@
   "The escape character is not allowed to precede ."
 ]
   },
+  "MISMATCH_INPUT" : {
+"message" : [
+  "The input  '' does not match the format."
+]
+  },
   "THOUSANDS_SEPS_MUST_BEFORE_DEC" : {
 "message" : [
   "Thousands separators (, or G) may not appear after the decimal 
point in the number format."
@@ -2583,6 +2593,11 @@
   "Drop the namespace ."
 ]
   },
+  "HIVE_WITH_ANSI_INTERVALS" : {
+"message" : [
+  "Hive table  with ANSI intervals."
+]
+  },
   "INSERT_PARTITION_SPEC_IF_NOT_EXISTS" : {
 "message" : [
   "INSERT INTO  with IF NOT EXISTS in the PARTITION spec."
@@ -2663,6 +2678,11 @@
   "Remove a comment from the namespace ."
 ]
   },
+  "REPLACE_NESTED_COLUMN" : {
+"message" : [
+  "The replace function does not support nested column ."
+]
+  },
   "SET_NAMESPACE_PROPERTY" : {
 "message" : [
   " is a reserved namespace property, ."
@@ -5627,31 +5647,11 @@
   ""
 ]
   },
-  "_LEGACY_ERROR_TEMP_2274" : {
-"message" : [
-  "Nested field  is not supported."
-]
-  },
-  "_LEGACY_ERROR_TEMP_2275" : {
-"message" : [
-  "Dataset transformations and actions can only be invoked by the driver, 
not inside of other Dataset transformations; for example, dataset1.map(x => 
dataset2.values.count() * x) is invalid because the values transformation and 
count action cannot be performed inside of the dataset1.map transformation. For 
more information, see SPARK-28702."
-]
-  },
-  "_LEGACY_ERROR_TEMP_2276" : {
-"message" : [
-  "Hive table  wi