[spark] branch master updated (4e8980e6ae9 -> 5b5083484cd)

2022-12-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 4e8980e6ae9 [SPARK-41409][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_1043` 
to `WRONG_NUM_ARGS.WITHOUT_SUGGESTION`
 add 5b5083484cd [SPARK-41248][SQL] Add 
"spark.sql.json.enablePartialResults" to enable/disable JSON partial results

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/json/JacksonParser.scala|  10 +-
 .../org/apache/spark/sql/internal/SQLConf.scala|  11 ++
 sql/core/benchmarks/JsonBenchmark-results.txt  | 155 ++---
 .../org/apache/spark/sql/JsonFunctionsSuite.scala  |  67 +++--
 .../sql/execution/datasources/json/JsonSuite.scala |  25 +++-
 5 files changed, 158 insertions(+), 110 deletions(-)
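For context, a rough sketch of how the flag named in the commit title would presumably be toggled; the config value, schema, and input path below are illustrative assumptions rather than details taken from the commit.

```scala
// Hypothetical usage sketch (assumes an existing SparkSession named `spark`).
spark.conf.set("spark.sql.json.enablePartialResults", "true")

val df = spark.read
  .schema("a INT, b STRING")   // placeholder schema
  .json("/tmp/example.json")   // placeholder path
df.show()
```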





[spark] branch master updated: [SPARK-41409][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_1043` to `WRONG_NUM_ARGS.WITHOUT_SUGGESTION`

2022-12-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 4e8980e6ae9 [SPARK-41409][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_1043` 
to `WRONG_NUM_ARGS.WITHOUT_SUGGESTION`
4e8980e6ae9 is described below

commit 4e8980e6ae9a513bb4c990944841a9db073013ea
Author: yangjie01 
AuthorDate: Wed Dec 14 08:22:33 2022 +0300

[SPARK-41409][CORE][SQL] Rename `_LEGACY_ERROR_TEMP_1043` to 
`WRONG_NUM_ARGS.WITHOUT_SUGGESTION`

### What changes were proposed in this pull request?
This PR introduces sub-classes of `WRONG_NUM_ARGS`:

- WITHOUT_SUGGESTION
- WITH_SUGGESTION

It then replaces the existing `WRONG_NUM_ARGS` with `WRONG_NUM_ARGS.WITH_SUGGESTION` 
and renames the error class `_LEGACY_ERROR_TEMP_1043` to 
`WRONG_NUM_ARGS.WITHOUT_SUGGESTION`.
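
As a rough illustration (not one of the PR's own tests), calling a SQL function with the wrong number of arguments should now surface the `WRONG_NUM_ARGS.WITH_SUGGESTION` error class; `substring` below is just an arbitrary built-in chosen for the sketch, and a SparkSession named `spark` is assumed.

```scala
import org.apache.spark.sql.AnalysisException

// Illustrative only: `substring` expects 2 or 3 arguments, so analysis should fail.
try {
  spark.sql("SELECT substring('Spark')").collect()
} catch {
  case e: AnalysisException =>
    // Expected to report the renamed error class after this change.
    println(e.getErrorClass) // e.g. WRONG_NUM_ARGS.WITH_SUGGESTION
}
```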

### Why are the changes needed?
Proper names of error classes to improve user experience with Spark SQL.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Add new test case

Closes #38940 from LuciferYang/legacy-1043.

Lead-authored-by: yangjie01 
Co-authored-by: YangJie 
Signed-off-by: Max Gekk 
---
 core/src/main/resources/error/error-classes.json| 21 ++---
 .../spark/sql/errors/QueryCompilationErrors.scala   |  8 
 .../resources/sql-tests/results/ansi/date.sql.out   |  2 +-
 .../sql-tests/results/ansi/string-functions.sql.out |  4 ++--
 .../results/ceil-floor-with-scale-param.sql.out |  4 ++--
 .../sql-tests/results/csv-functions.sql.out |  2 +-
 .../test/resources/sql-tests/results/date.sql.out   |  2 +-
 .../sql-tests/results/datetime-legacy.sql.out   |  2 +-
 .../sql-tests/results/json-functions.sql.out|  8 
 .../results/sql-compatibility-functions.sql.out |  2 +-
 .../sql-tests/results/string-functions.sql.out  |  4 ++--
 .../results/table-valued-functions.sql.out  |  2 +-
 .../sql-tests/results/timestamp-ntz.sql.out |  2 +-
 .../resources/sql-tests/results/udaf/udaf.sql.out   |  2 +-
 .../sql-tests/results/udf/udf-udaf.sql.out  |  2 +-
 .../apache/spark/sql/DataFrameFunctionsSuite.scala  |  2 +-
 .../org/apache/spark/sql/DateFunctionsSuite.scala   |  2 +-
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala  |  2 +-
 .../org/apache/spark/sql/StringFunctionsSuite.scala |  2 +-
 .../test/scala/org/apache/spark/sql/UDFSuite.scala  | 11 ++-
 .../sql/errors/QueryCompilationErrorsSuite.scala| 13 +
 .../spark/sql/hive/execution/HiveUDAFSuite.scala|  2 +-
 22 files changed, 57 insertions(+), 44 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json 
b/core/src/main/resources/error/error-classes.json
index e1df3db4291..f66d6998e26 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -1548,8 +1548,20 @@
   },
   "WRONG_NUM_ARGS" : {
 "message" : [
-  "The  requires  parameters but the actual 
number is ."
-]
+  "Invalid number of arguments for the function ."
+],
+"subClass" : {
+  "WITHOUT_SUGGESTION" : {
+"message" : [
+  "Please, refer to 
'https://spark.apache.org/docs/latest/sql-ref-functions.html' for a fix."
+]
+  },
+  "WITH_SUGGESTION" : {
+"message" : [
+  "Consider to change the number of arguments because the function 
requires  parameters but the actual number is ."
+]
+  }
+}
   },
   "_LEGACY_ERROR_TEMP_0001" : {
 "message" : [
@@ -2018,11 +2030,6 @@
   "Undefined function ."
 ]
   },
-  "_LEGACY_ERROR_TEMP_1043" : {
-"message" : [
-  "Invalid arguments for function ."
-]
-  },
   "_LEGACY_ERROR_TEMP_1045" : {
 "message" : [
   "ALTER TABLE SET LOCATION does not support partition for v2 tables."
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
index b329f6689d4..a5ff2084ca8 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala
@@ -640,7 +640,7 @@ private[sql] object QueryCompilationErrors extends 
QueryErrorsBase {
   def invalidFunctionArgumentsError(
   name: String, expectedNum: String, actualNum: Int): Throwable = {
 new AnalysisException(
-  errorClass = "WRONG_NUM_ARGS",
+  errorClass = "WRONG_NUM_ARGS.WITH_SUGGESTION",
   messageParameters = Map(
 "functionName" -> toSQLId(name),
 "expectedNum" -> expectedNum,
@@ -649,10 +649,10 @@ private[sql] object QueryCompilationErrors extends 
QueryErrorsBase {
 
   def 

[spark] branch master updated (ea53dc82f28 -> 1b3a4444b8c)

2022-12-13 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from ea53dc82f28 [SPARK-41506][CONNECT][TESTS][FOLLOW-UP] Import BinaryType 
in pyspark.sql.tests.connect.test_connect_column
 add 1b3a4444b8c [SPARK-27561][SQL][FOLLOWUP] Move the two rules for Lateral 
column alias into one file

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala | 113 +--
 ...rence.scala => ResolveLateralColumnAlias.scala} | 125 -
 .../sql/catalyst/rules/RuleIdCollection.scala  |   2 +-
 3 files changed, 122 insertions(+), 118 deletions(-)
 rename 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/{ResolveLateralColumnAliasReference.scala
 => ResolveLateralColumnAlias.scala} (50%)





[spark] branch master updated: [SPARK-41506][CONNECT][TESTS][FOLLOW-UP] Import BinaryType in pyspark.sql.tests.connect.test_connect_column

2022-12-13 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ea53dc82f28 [SPARK-41506][CONNECT][TESTS][FOLLOW-UP] Import BinaryType 
in pyspark.sql.tests.connect.test_connect_column
ea53dc82f28 is described below

commit ea53dc82f28d2297deee4349f5f86e83f552e624
Author: Hyukjin Kwon 
AuthorDate: Wed Dec 14 10:36:52 2022 +0900

[SPARK-41506][CONNECT][TESTS][FOLLOW-UP] Import BinaryType in 
pyspark.sql.tests.connect.test_connect_column

### What changes were proposed in this pull request?

This PR is a follow-up of https://github.com/apache/spark/pull/39047; it 
imports `BinaryType`, which was removed in 
https://github.com/apache/spark/pull/39050. This was a logical conflict.

### Why are the changes needed?

To recover the build.

### Does this PR introduce _any_ user-facing change?

No, test-only.

### How was this patch tested?

Manually verified by running `./dev/lint-python`.

Closes #39055 from HyukjinKwon/SPARK-41506-followup.

Authored-by: Hyukjin Kwon 
Signed-off-by: Hyukjin Kwon 
---
 python/pyspark/sql/tests/connect/test_connect_column.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/python/pyspark/sql/tests/connect/test_connect_column.py 
b/python/pyspark/sql/tests/connect/test_connect_column.py
index b7645bc4b71..c997f94a1ea 100644
--- a/python/pyspark/sql/tests/connect/test_connect_column.py
+++ b/python/pyspark/sql/tests/connect/test_connect_column.py
@@ -41,6 +41,7 @@ from pyspark.sql.types import (
 TimestampType,
 TimestampNTZType,
 ByteType,
+BinaryType,
 ShortType,
 IntegerType,
 FloatType,





[spark] branch master updated: [SPARK-41506][CONNECT][PYTHON] Refactor LiteralExpression to support DataType

2022-12-13 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new cdc73ad36e5 [SPARK-41506][CONNECT][PYTHON] Refactor LiteralExpression 
to support DataType
cdc73ad36e5 is described below

commit cdc73ad36e53544add2cfb7ea66941202014303e
Author: Ruifeng Zheng 
AuthorDate: Wed Dec 14 09:50:05 2022 +0900

[SPARK-41506][CONNECT][PYTHON] Refactor LiteralExpression to support 
DataType

### What changes were proposed in this pull request?
1. The existing `LiteralExpression` is a mixture of `Literal`, `CreateArray`, 
`CreateStruct` and `CreateMap`; since we have added the collection functions 
`array`, `struct` and `create_map`, the `CreateXXX` expressions can be replaced 
with `UnresolvedFunction`.
2. Add a `dataType` field to `LiteralExpression` so the DataType can be 
specified when needed; a special case is the typed null (see the sketch after this list).
3. It is up to the `lit` function to infer the DataType, not 
`LiteralExpression` itself.
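
For the "typed null" special case mentioned above, here is a conceptual Catalyst-side sketch (the server-side `Literal` expression, not the Connect client API) of a literal that carries an explicit DataType while its value is null:

```scala
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.types.StringType

// A typed null: the value is null, but the literal still carries an explicit DataType.
val typedNull = Literal(null, StringType)

// By contrast, Literal(1) infers IntegerType from the value itself.
val inferred = Literal(1)
```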

### Why are the changes needed?
Refactor LiteralExpression to support DataType

### Does this PR introduce _any_ user-facing change?
No. `LiteralExpression` is an internal class and should not be exposed to end users.

### How was this patch tested?
added UT

Closes #39047 from zhengruifeng/connect_lit_datatype.

Authored-by: Ruifeng Zheng 
Signed-off-by: Hyukjin Kwon 
---
 .../main/protobuf/spark/connect/expressions.proto  |  23 +--
 .../planner/LiteralValueProtoConverter.scala   |  31 +--
 python/pyspark/sql/connect/column.py   | 213 +++--
 python/pyspark/sql/connect/functions.py|  13 +-
 .../pyspark/sql/connect/proto/expressions_pb2.py   |  86 ++---
 .../pyspark/sql/connect/proto/expressions_pb2.pyi  | 115 +--
 .../sql/tests/connect/test_connect_column.py   | 122 +++-
 .../connect/test_connect_column_expressions.py |  74 +--
 8 files changed, 365 insertions(+), 312 deletions(-)

diff --git 
a/connector/connect/common/src/main/protobuf/spark/connect/expressions.proto 
b/connector/connect/common/src/main/protobuf/spark/connect/expressions.proto
index 6c0facbfeee..c906f15e0a6 100644
--- a/connector/connect/common/src/main/protobuf/spark/connect/expressions.proto
+++ b/connector/connect/common/src/main/protobuf/spark/connect/expressions.proto
@@ -77,9 +77,7 @@ message Expression {
   int32 year_month_interval = 20;
   int64 day_time_interval = 21;
 
-  Array array = 22;
-  Struct struct = 23;
-  Map map = 24;
+  DataType typed_null = 22;
 }
 
 // whether the literal type should be treated as a nullable type. Applies 
to
@@ -107,25 +105,6 @@ message Expression {
   int32 days = 2;
   int64 microseconds = 3;
 }
-
-message Struct {
-  // A possibly heterogeneously typed list of literals
-  repeated Literal fields = 1;
-}
-
-message Array {
-  // A homogeneously typed list of literals
-  repeated Literal values = 1;
-}
-
-message Map {
-  repeated Pair pairs = 1;
-
-  message Pair {
-Literal key = 1;
-Literal value = 2;
-  }
-}
   }
 
   // An unresolved attribute that is not explicitly bound to a specific 
column, but the column
diff --git 
a/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralValueProtoConverter.scala
 
b/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralValueProtoConverter.scala
index 5a54ad9ac64..46f6db64b8c 100644
--- 
a/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralValueProtoConverter.scala
+++ 
b/connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralValueProtoConverter.scala
@@ -17,11 +17,8 @@
 
 package org.apache.spark.sql.connect.planner
 
-import scala.collection.JavaConverters._
-
 import org.apache.spark.connect.proto
-import org.apache.spark.sql.catalyst.{expressions, InternalRow}
-import org.apache.spark.sql.catalyst.expressions.{CreateArray, CreateMap, 
CreateStruct}
+import org.apache.spark.sql.catalyst.expressions
 import org.apache.spark.sql.types._
 import org.apache.spark.unsafe.types.{CalendarInterval, UTF8String}
 
@@ -99,20 +96,6 @@ object LiteralValueProtoConverter {
   case proto.Expression.Literal.LiteralTypeCase.DAY_TIME_INTERVAL =>
 expressions.Literal(lit.getDayTimeInterval, DayTimeIntervalType())
 
-  case proto.Expression.Literal.LiteralTypeCase.ARRAY =>
-val literals = 
lit.getArray.getValuesList.asScala.toArray.map(toCatalystExpression)
-CreateArray(literals)
-
-  case proto.Expression.Literal.LiteralTypeCase.STRUCT =>
-val literals = 
lit.getStruct.getFieldsList.asScala.toArray.map(toCatalystExpression)
-CreateStruct(literals)
-
-  case 

[spark] branch master updated: [SPARK-41412][CONNECT][TESTS][FOLLOW-UP] Exclude binary casting to make the tests to pass with/without ANSI mode

2022-12-13 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new a75bc841db2 [SPARK-41412][CONNECT][TESTS][FOLLOW-UP] Exclude binary 
casting to make the tests to pass with/without ANSI mode
a75bc841db2 is described below

commit a75bc841db200a8d79ae8aabf3bff0308bcaadbb
Author: Hyukjin Kwon 
AuthorDate: Wed Dec 14 08:35:04 2022 +0900

[SPARK-41412][CONNECT][TESTS][FOLLOW-UP] Exclude binary casting to make the 
tests to pass with/without ANSI mode

### What changes were proposed in this pull request?

This PR is another follow-up of https://github.com/apache/spark/pull/39034 
that, instead, makes the tests pass with/without ANSI mode.

### Why are the changes needed?

Spark Connect uses an isolated Spark session, so setting the configuration on 
the PySpark side does not take effect. Therefore, the test still fails; see 
https://github.com/apache/spark/actions/runs/3681383627/jobs/6228030132.

We should make the tests pass with/without ANSI mode for now.

### Does this PR introduce _any_ user-facing change?
No, test-only

### How was this patch tested?

Manually tested via:

```bash
SPARK_ANSI_SQL_MODE=true ./python/run-tests --testnames 
'pyspark.sql.tests.connect.test_connect_column'
```

Closes #39050 from HyukjinKwon/SPARK-41412.

Authored-by: Hyukjin Kwon 
Signed-off-by: Hyukjin Kwon 
---
 .../sql/tests/connect/test_connect_column.py   | 35 ++
 1 file changed, 15 insertions(+), 20 deletions(-)

diff --git a/python/pyspark/sql/tests/connect/test_connect_column.py 
b/python/pyspark/sql/tests/connect/test_connect_column.py
index e6701231990..8b70b4d9a44 100644
--- a/python/pyspark/sql/tests/connect/test_connect_column.py
+++ b/python/pyspark/sql/tests/connect/test_connect_column.py
@@ -26,7 +26,6 @@ from pyspark.sql.types import (
 DoubleType,
 LongType,
 DecimalType,
-BinaryType,
 BooleanType,
 )
 from pyspark.testing.connectutils import should_test_connect
@@ -153,25 +152,21 @@ class SparkConnectTests(SparkConnectSQLTestCase):
 df.select(df.id.cast("string")).toPandas(), 
df2.select(df2.id.cast("string")).toPandas()
 )
 
-# Test if the arguments can be passed properly.
-# Do not need to check individual behaviour for the ANSI mode 
thoroughly.
-with self.sql_conf({"spark.sql.ansi.enabled": False}):
-for x in [
-StringType(),
-BinaryType(),
-ShortType(),
-IntegerType(),
-LongType(),
-FloatType(),
-DoubleType(),
-ByteType(),
-DecimalType(10, 2),
-BooleanType(),
-DayTimeIntervalType(),
-]:
-self.assert_eq(
-df.select(df.id.cast(x)).toPandas(), 
df2.select(df2.id.cast(x)).toPandas()
-)
+for x in [
+StringType(),
+ShortType(),
+IntegerType(),
+LongType(),
+FloatType(),
+DoubleType(),
+ByteType(),
+DecimalType(10, 2),
+BooleanType(),
+DayTimeIntervalType(),
+]:
+self.assert_eq(
+df.select(df.id.cast(x)).toPandas(), 
df2.select(df2.id.cast(x)).toPandas()
+)
 
 def test_unsupported_functions(self):
 # SPARK-41225: Disable unsupported functions.





[spark] branch master updated: [SPARK-41062][SQL] Rename `UNSUPPORTED_CORRELATED_REFERENCE` to `CORRELATED_REFERENCE`

2022-12-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e29ada0c13e [SPARK-41062][SQL] Rename 
`UNSUPPORTED_CORRELATED_REFERENCE` to `CORRELATED_REFERENCE`
e29ada0c13e is described below

commit e29ada0c13e71aaad0566ef67591a33d4c58fe2a
Author: itholic 
AuthorDate: Tue Dec 13 21:48:11 2022 +0300

[SPARK-41062][SQL] Rename `UNSUPPORTED_CORRELATED_REFERENCE` to 
`CORRELATED_REFERENCE`

### What changes were proposed in this pull request?

This PR proposes to rename `UNSUPPORTED_CORRELATED_REFERENCE` to 
`CORRELATED_REFERENCE`.

Also, show `sqlExprs` rather than `treeNode`, which is more useful 
information for users.

### Why are the changes needed?

The sub-error class name duplicates its main class name, 
`UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY`.

We should make all error class names clear and brief.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

```
./build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite*"
```

Closes #38576 from itholic/SPARK-41062.

Lead-authored-by: itholic 
Co-authored-by: Haejoon Lee <44108233+itho...@users.noreply.github.com>
Signed-off-by: Max Gekk 
---
 core/src/main/resources/error/error-classes.json| 10 +-
 .../apache/spark/sql/catalyst/analysis/CheckAnalysis.scala  |  7 ---
 .../spark/sql/catalyst/analysis/ResolveSubquerySuite.scala  | 13 -
 .../subquery/negative-cases/invalid-correlation.sql.out |  4 ++--
 .../src/test/scala/org/apache/spark/sql/SubquerySuite.scala | 12 +---
 5 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json 
b/core/src/main/resources/error/error-classes.json
index 25362d5893f..e1df3db4291 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -1471,6 +1471,11 @@
   "A correlated outer name reference within a subquery expression body 
was not found in the enclosing query: "
 ]
   },
+  "CORRELATED_REFERENCE" : {
+"message" : [
+  "Expressions referencing the outer query are not supported outside 
of WHERE/HAVING clauses: "
+]
+  },
   "LATERAL_JOIN_CONDITION_NON_DETERMINISTIC" : {
 "message" : [
   "Lateral join condition cannot be non-deterministic: "
@@ -1496,11 +1501,6 @@
   "Non-deterministic lateral subqueries are not supported when joining 
with outer relations that produce more than one row"
 ]
   },
-  "UNSUPPORTED_CORRELATED_REFERENCE" : {
-"message" : [
-  "Expressions referencing the outer query are not supported outside 
of WHERE/HAVING clauses"
-]
-  },
   "UNSUPPORTED_CORRELATED_REFERENCE_DATA_TYPE" : {
 "message" : [
   "Correlated column reference '' cannot be  type"
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index e7e153a319d..5303364710c 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
@@ -1089,11 +1089,12 @@ trait CheckAnalysis extends PredicateHelper with 
LookupCatalog with QueryErrorsB
 // 2. Expressions containing outer references on plan nodes other than 
allowed operators.
 def failOnInvalidOuterReference(p: LogicalPlan): Unit = {
   p.expressions.foreach(checkMixedReferencesInsideAggregateExpr)
-  if (!canHostOuter(p) && p.expressions.exists(containsOuter)) {
+  val exprs = stripOuterReferences(p.expressions.filter(expr => 
containsOuter(expr)))
+  if (!canHostOuter(p) && !exprs.isEmpty) {
 p.failAnalysis(
   errorClass =
-
"UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.UNSUPPORTED_CORRELATED_REFERENCE",
-  messageParameters = Map("treeNode" -> planToString(p)))
+"UNSUPPORTED_SUBQUERY_EXPRESSION_CATEGORY.CORRELATED_REFERENCE",
+  messageParameters = Map("sqlExprs" -> 
exprs.map(toSQLExpr).mkString(",")))
   }
 }
 
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala
index 577f663d8b1..7b99153acf9 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala
@@ -51,11 +51,14 @@ class 

[spark] branch master updated: [SPARK-41482][BUILD] Upgrade dropwizard metrics 4.2.13

2022-12-13 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e2474f63f1b [SPARK-41482][BUILD] Upgrade dropwizard metrics 4.2.13
e2474f63f1b is described below

commit e2474f63f1b588f22279ed4e51b50924ecae9e86
Author: yangjie01 
AuthorDate: Tue Dec 13 10:38:17 2022 -0800

[SPARK-41482][BUILD] Upgrade dropwizard metrics 4.2.13

### What changes were proposed in this pull request?
This PR aims to upgrade dropwizard metrics to 4.2.13.

### Why are the changes needed?
The release notes are as follows:

- https://github.com/dropwizard/metrics/releases/tag/v4.2.13

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass Github Actions

Closes #39026 from LuciferYang/metrics-4213.

Lead-authored-by: yangjie01 
Co-authored-by: YangJie 
Signed-off-by: Dongjoon Hyun 
---
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 10 +-
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 10 +-
 pom.xml   |  2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2-hive-2.3 
b/dev/deps/spark-deps-hadoop-2-hive-2.3
index ae7cc9d592c..40741e1d75b 100644
--- a/dev/deps/spark-deps-hadoop-2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2-hive-2.3
@@ -194,11 +194,11 @@ log4j-slf4j2-impl/2.19.0//log4j-slf4j2-impl-2.19.0.jar
 logging-interceptor/3.12.12//logging-interceptor-3.12.12.jar
 lz4-java/1.8.0//lz4-java-1.8.0.jar
 mesos/1.4.3/shaded-protobuf/mesos-1.4.3-shaded-protobuf.jar
-metrics-core/4.2.12//metrics-core-4.2.12.jar
-metrics-graphite/4.2.12//metrics-graphite-4.2.12.jar
-metrics-jmx/4.2.12//metrics-jmx-4.2.12.jar
-metrics-json/4.2.12//metrics-json-4.2.12.jar
-metrics-jvm/4.2.12//metrics-jvm-4.2.12.jar
+metrics-core/4.2.13//metrics-core-4.2.13.jar
+metrics-graphite/4.2.13//metrics-graphite-4.2.13.jar
+metrics-jmx/4.2.13//metrics-jmx-4.2.13.jar
+metrics-json/4.2.13//metrics-json-4.2.13.jar
+metrics-jvm/4.2.13//metrics-jvm-4.2.13.jar
 minlog/1.3.0//minlog-1.3.0.jar
 netty-all/4.1.84.Final//netty-all-4.1.84.Final.jar
 netty-buffer/4.1.84.Final//netty-buffer-4.1.84.Final.jar
diff --git a/dev/deps/spark-deps-hadoop-3-hive-2.3 
b/dev/deps/spark-deps-hadoop-3-hive-2.3
index f70abedd34b..162816bdc39 100644
--- a/dev/deps/spark-deps-hadoop-3-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3-hive-2.3
@@ -178,11 +178,11 @@ log4j-slf4j2-impl/2.19.0//log4j-slf4j2-impl-2.19.0.jar
 logging-interceptor/3.12.12//logging-interceptor-3.12.12.jar
 lz4-java/1.8.0//lz4-java-1.8.0.jar
 mesos/1.4.3/shaded-protobuf/mesos-1.4.3-shaded-protobuf.jar
-metrics-core/4.2.12//metrics-core-4.2.12.jar
-metrics-graphite/4.2.12//metrics-graphite-4.2.12.jar
-metrics-jmx/4.2.12//metrics-jmx-4.2.12.jar
-metrics-json/4.2.12//metrics-json-4.2.12.jar
-metrics-jvm/4.2.12//metrics-jvm-4.2.12.jar
+metrics-core/4.2.13//metrics-core-4.2.13.jar
+metrics-graphite/4.2.13//metrics-graphite-4.2.13.jar
+metrics-jmx/4.2.13//metrics-jmx-4.2.13.jar
+metrics-json/4.2.13//metrics-json-4.2.13.jar
+metrics-jvm/4.2.13//metrics-jvm-4.2.13.jar
 minlog/1.3.0//minlog-1.3.0.jar
 netty-all/4.1.84.Final//netty-all-4.1.84.Final.jar
 netty-buffer/4.1.84.Final//netty-buffer-4.1.84.Final.jar
diff --git a/pom.xml b/pom.xml
index da7c8eccfce..e5d8d2d06ba 100644
--- a/pom.xml
+++ b/pom.xml
@@ -151,7 +151,7 @@
 If you changes codahale.metrics.version, you also need to change
 the link to metrics.dropwizard.io in docs/monitoring.md.
 -->
-4.2.12
+4.2.13
 
 1.11.1
 1.12.0





[spark] branch master updated (7e9b88bfceb -> d00771f5ee2)

2022-12-13 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 7e9b88bfceb [SPARK-27561][SQL] Support implicit lateral column alias 
resolution on Project
 add d00771f5ee2 [SPARK-39601][YARN][FOLLOWUP] YarnClusterSchedulerBackend 
should call super.stop()

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala | 1 +
 1 file changed, 1 insertion(+)





[spark] branch master updated: [SPARK-27561][SQL] Support implicit lateral column alias resolution on Project

2022-12-13 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 7e9b88bfceb [SPARK-27561][SQL] Support implicit lateral column alias 
resolution on Project
7e9b88bfceb is described below

commit 7e9b88bfceb86d3b32e82a86b672aab3c74def8c
Author: Xinyi Yu 
AuthorDate: Wed Dec 14 00:14:06 2022 +0800

[SPARK-27561][SQL] Support implicit lateral column alias resolution on 
Project

### What changes were proposed in this pull request?
This PR implements a new feature: implicit lateral column alias resolution on 
the `Project` case, controlled temporarily by 
`spark.sql.lateralColumnAlias.enableImplicitResolution` (default false for now; 
the conf will be turned on once the feature is completely merged).
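
A minimal sketch of trying the feature, assuming a SparkSession named `spark` and an `employee` table like the one in the SQL examples further below; the conf name and the query come from this description.

```scala
// Sketch only: turn on the (default-off) conf and run the Project-case example.
spark.conf.set("spark.sql.lateralColumnAlias.enableImplicitResolution", "true")

spark.sql(
  """SELECT salary AS base_salary, base_salary + bonus AS total_salary
    |FROM employee""".stripMargin).show()
```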

 Lateral column alias
View https://issues.apache.org/jira/browse/SPARK-27561 for more details on 
lateral column alias.
There are two main cases to support: LCA in Project, and LCA in Aggregate.
```sql
-- LCA in Project. The base_salary references an attribute defined by a 
previous alias
SELECT salary AS base_salary, base_salary + bonus AS total_salary
FROM employee

-- LCA in Aggregate. The avg_salary references an attribute defined by a 
previous alias
SELECT dept, average(salary) AS avg_salary, avg_salary + average(bonus)
FROM employee
GROUP BY dept
```
This **implicit** lateral column alias (no explicit keyword, e.g. 
`lateral.base_salary`) should be supported.

 High level design
This PR defines a new resolution rule, `ResolveLateralColumnAlias`, to 
resolve the implicit lateral column alias, covering the `Project` case.
It introduces a new leaf NamedExpression, 
`LateralColumnAliasReference`, as a placeholder used to hold a reference that 
has been temporarily resolved as a reference to a lateral column alias.

The whole process is generally divided into two phases:
1) recognize **resolved** lateral aliases and wrap the attributes referencing 
them with `LateralColumnAliasReference`.
2) when the whole operator is resolved, unwrap 
`LateralColumnAliasReference`. For Project, it further resolves the attributes 
and pushes down the referenced lateral aliases to the new Project.

For example:
```
// Before
Project [age AS a, 'a + 1]
+- Child

// After phase 1
Project [age AS a, lateralalias(a) + 1]
+- Child

// After phase 2
Project [a, a + 1]
+- Project [child output, age AS a]
   +- Child
```

 Resolution order
Given this new rule, the name resolution order will be (higher -> lower):
```
local table column > local metadata attribute > local lateral column alias 
> all others (outer reference of subquery, parameters of SQL UDF, ..)
```

There is a recent refactor that moves the creation of `OuterReference` into 
the Resolution batch: https://github.com/apache/spark/pull/38851.
Because a lateral column alias has higher resolution priority than an outer 
reference, the rule will try to resolve an `OuterReference` using a lateral column 
alias, similar to an `UnresolvedAttribute`. If successful, it strips the 
`OuterReference` and wraps it with `LateralColumnAliasReference`.

### Why are the changes needed?
Lateral column aliases are a popular, long-requested feature. They are 
supported by many other database vendors (Redshift, Snowflake, etc.) and 
provide a better user experience.

### Does this PR introduce _any_ user-facing change?
Yes. As shown in the example above, Spark will be able to resolve lateral 
column aliases. I will write the migration guide or release note once most PRs of 
this feature are merged.

### How was this patch tested?
Existing tests and newly added tests.

Closes #38776 from anchovYu/SPARK-27561-refactor.

Authored-by: Xinyi Yu 
Signed-off-by: Wenchen Fan 
---
 core/src/main/resources/error/error-classes.json   |   6 +
 .../sql/catalyst/expressions/AttributeMap.scala|   3 +-
 .../sql/catalyst/expressions/AttributeMap.scala|   3 +
 .../spark/sql/catalyst/analysis/Analyzer.scala | 119 +++-
 .../sql/catalyst/analysis/CheckAnalysis.scala  |  25 +-
 .../ResolveLateralColumnAliasReference.scala   | 135 +
 .../catalyst/expressions/namedExpressions.scala|  33 +++
 .../spark/sql/catalyst/expressions/subquery.scala  |   9 +-
 .../sql/catalyst/rules/RuleIdCollection.scala  |   2 +
 .../spark/sql/catalyst/trees/TreePatterns.scala|   1 +
 .../spark/sql/errors/QueryCompilationErrors.scala  |  19 ++
 .../org/apache/spark/sql/internal/SQLConf.scala|  11 +
 .../apache/spark/sql/LateralColumnAliasSuite.scala | 327 +
 13 files changed, 686 insertions(+), 7 deletions(-)

diff --git 

[spark] branch branch-3.2 updated: [SPARK-41360][CORE][BUILD][FOLLOW-UP] Exclude BlockManagerMessages.RegisterBlockManager in MiMa

2022-12-13 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new c6bd8207673 [SPARK-41360][CORE][BUILD][FOLLOW-UP] Exclude 
BlockManagerMessages.RegisterBlockManager in MiMa
c6bd8207673 is described below

commit c6bd82076739abc694ab5327f000d6fb1eb9211c
Author: Hyukjin Kwon 
AuthorDate: Tue Dec 13 23:34:09 2022 +0900

[SPARK-41360][CORE][BUILD][FOLLOW-UP] Exclude 
BlockManagerMessages.RegisterBlockManager in MiMa

This PR is a followup of https://github.com/apache/spark/pull/38876 that 
excludes BlockManagerMessages.RegisterBlockManager in MiMa compatibility check.

It fails the MiMa check, presumably with Scala 2.13, in other branches. It should 
be safer to exclude them all in the affected branches.

No, dev-only.

Filters copied from error messages. Will monitor the build in other 
branches.

Closes #39052 from HyukjinKwon/SPARK-41360-followup.

Authored-by: Hyukjin Kwon 
Signed-off-by: Hyukjin Kwon 
(cherry picked from commit a2ceff29f9d1c0133fa0c8274fa84c43106e90f0)
Signed-off-by: Hyukjin Kwon 
---
 project/MimaExcludes.scala | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index 7957062f332..07add4ce469 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -72,7 +72,13 @@ object MimaExcludes {
 
ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.shuffle.api.ShuffleMapOutputWriter.commitAllPartitions"),
 
 // [SPARK-37391][SQL] JdbcConnectionProvider tells if it modifies security 
context
-
ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.sql.jdbc.JdbcConnectionProvider.modifiesSecurityContext")
+
ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.sql.jdbc.JdbcConnectionProvider.modifiesSecurityContext"),
+
+// [SPARK-41360][CORE] Avoid BlockManager re-registration if the executor 
has been lost
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.storage.BlockManagerMessages#RegisterBlockManager.copy"),
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.storage.BlockManagerMessages#RegisterBlockManager.this"),
+
ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.storage.BlockManagerMessages$RegisterBlockManager$"),
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.storage.BlockManagerMessages#RegisterBlockManager.apply")
   )
 
   // Exclude rules for 3.1.x





[spark] branch branch-3.3 updated: [SPARK-41360][CORE][BUILD][FOLLOW-UP] Exclude BlockManagerMessages.RegisterBlockManager in MiMa

2022-12-13 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 9f7baaec5e2 [SPARK-41360][CORE][BUILD][FOLLOW-UP] Exclude 
BlockManagerMessages.RegisterBlockManager in MiMa
9f7baaec5e2 is described below

commit 9f7baaec5e2850fbd51a146b52309189bf83379c
Author: Hyukjin Kwon 
AuthorDate: Tue Dec 13 23:34:09 2022 +0900

[SPARK-41360][CORE][BUILD][FOLLOW-UP] Exclude 
BlockManagerMessages.RegisterBlockManager in MiMa

This PR is a followup of https://github.com/apache/spark/pull/38876 that 
excludes BlockManagerMessages.RegisterBlockManager in MiMa compatibility check.

It fails the MiMa check, presumably with Scala 2.13, in other branches. It should 
be safer to exclude them all in the affected branches.

No, dev-only.

Filters copied from error messages. Will monitor the build in other 
branches.

Closes #39052 from HyukjinKwon/SPARK-41360-followup.

Authored-by: Hyukjin Kwon 
Signed-off-by: Hyukjin Kwon 
(cherry picked from commit a2ceff29f9d1c0133fa0c8274fa84c43106e90f0)
Signed-off-by: Hyukjin Kwon 
---
 project/MimaExcludes.scala | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index 8f3bd43ec65..ae8aa7e5cb3 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -68,7 +68,13 @@ object MimaExcludes {
 
 // [SPARK-38908][SQL] Provide query context in runtime error of Casting 
from String to
 // Number/Date/Timestamp/Boolean
-
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.types.Decimal.fromStringANSI")
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.types.Decimal.fromStringANSI"),
+
+// [SPARK-41360][CORE] Avoid BlockManager re-registration if the executor 
has been lost
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.storage.BlockManagerMessages#RegisterBlockManager.copy"),
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.storage.BlockManagerMessages#RegisterBlockManager.this"),
+
ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.storage.BlockManagerMessages$RegisterBlockManager$"),
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.storage.BlockManagerMessages#RegisterBlockManager.apply")
   )
 
   // Exclude rules for 3.2.x from 3.1.1





[spark] branch master updated: [SPARK-41360][CORE][BUILD][FOLLOW-UP] Exclude BlockManagerMessages.RegisterBlockManager in MiMa

2022-12-13 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new a2ceff29f9d [SPARK-41360][CORE][BUILD][FOLLOW-UP] Exclude 
BlockManagerMessages.RegisterBlockManager in MiMa
a2ceff29f9d is described below

commit a2ceff29f9d1c0133fa0c8274fa84c43106e90f0
Author: Hyukjin Kwon 
AuthorDate: Tue Dec 13 23:34:09 2022 +0900

[SPARK-41360][CORE][BUILD][FOLLOW-UP] Exclude 
BlockManagerMessages.RegisterBlockManager in MiMa

### What changes were proposed in this pull request?

This PR is a followup of https://github.com/apache/spark/pull/38876 that 
excludes BlockManagerMessages.RegisterBlockManager in MiMa compatibility check.

### Why are the changes needed?

It fails the MiMa check, presumably with Scala 2.13, in other branches. It should 
be safer to exclude them all in the affected branches.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Filters copied from error messages. Will monitor the build in other 
branches.

Closes #39052 from HyukjinKwon/SPARK-41360-followup.

Authored-by: Hyukjin Kwon 
Signed-off-by: Hyukjin Kwon 
---
 project/MimaExcludes.scala | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index eed79d1f204..7ec4ef37a0d 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -123,7 +123,13 @@ object MimaExcludes {
 
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.StreamingQueryException.this"),
 
 // [SPARK-41180][SQL] Reuse INVALID_SCHEMA instead of 
_LEGACY_ERROR_TEMP_1227
-
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.types.DataType.parseTypeWithFallback")
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.types.DataType.parseTypeWithFallback"),
+
+// [SPARK-41360][CORE] Avoid BlockManager re-registration if the executor 
has been lost
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.storage.BlockManagerMessages#RegisterBlockManager.copy"),
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.storage.BlockManagerMessages#RegisterBlockManager.this"),
+
ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.storage.BlockManagerMessages$RegisterBlockManager$"),
+
ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.storage.BlockManagerMessages#RegisterBlockManager.apply")
   )
 
   // Defulat exclude rules





[spark] branch master updated: [SPARK-39601][YARN] AllocationFailure should not be treated as exitCausedByApp when driver is shutting down

2022-12-13 Thread tgraves
This is an automated email from the ASF dual-hosted git repository.

tgraves pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new e857b7ad1c7 [SPARK-39601][YARN] AllocationFailure should not be 
treated as exitCausedByApp when driver is shutting down
e857b7ad1c7 is described below

commit e857b7ad1c78c57d06436e387473d83e61293c7c
Author: Cheng Pan 
AuthorDate: Tue Dec 13 08:18:08 2022 -0600

[SPARK-39601][YARN] AllocationFailure should not be treated as 
exitCausedByApp when driver is shutting down

### What changes were proposed in this pull request?

Treating container `AllocationFailure` as not "exitCausedByApp" when driver 
is shutting down.

The approach is suggested at 
https://github.com/apache/spark/pull/36991#discussion_r915948343

### Why are the changes needed?

I observed some Spark applications that successfully completed all jobs but 
failed during the shutdown phase with the reason "Max number of executor 
failures (16) reached". The timeline is:

Driver - Jobs succeed and Spark starts the shutdown procedure.
```
2022-06-23 19:50:55 CST AbstractConnector INFO - Stopped 
Spark74e9431b{HTTP/1.1, (http/1.1)}{0.0.0.0:0}
2022-06-23 19:50:55 CST SparkUI INFO - Stopped Spark web UI at 
http://hadoop2627.xxx.org:28446
2022-06-23 19:50:55 CST YarnClusterSchedulerBackend INFO - Shutting down 
all executors
```

Driver - A container allocation succeeds during the shutdown phase.
```
2022-06-23 19:52:21 CST YarnAllocator INFO - Launching container 
container_e94_1649986670278_7743380_02_25 on host hadoop4388.xxx.org for 
executor with ID 24 for ResourceProfile Id 0
```

Executor - The executor cannot connect to the driver endpoint because the driver 
has already stopped the endpoint.
```
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
  at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1911)
  at 
org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
  at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:393)
  at 
org.apache.spark.executor.YarnCoarseGrainedExecutorBackend$.main(YarnCoarseGrainedExecutorBackend.scala:81)
  at 
org.apache.spark.executor.YarnCoarseGrainedExecutorBackend.main(YarnCoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
  at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
  at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
  at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
  at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$.$anonfun$run$9(CoarseGrainedExecutorBackend.scala:413)
  at 
scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23)
  at 
scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)
  at scala.collection.immutable.Range.foreach(Range.scala:158)
  at 
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)
  at 
org.apache.spark.executor.CoarseGrainedExecutorBackend$.$anonfun$run$7(CoarseGrainedExecutorBackend.scala:411)
  at 
org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:62)
  at 
org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:61)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
  ... 4 more
Caused by: org.apache.spark.rpc.RpcEndpointNotFoundException: Cannot find 
endpoint: spark://CoarseGrainedSchedulerhadoop2627.xxx.org:21956
  at 
org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$asyncSetupEndpointRefByURI$1(NettyRpcEnv.scala:148)
  at 
org.apache.spark.rpc.netty.NettyRpcEnv.$anonfun$asyncSetupEndpointRefByURI$1$adapted(NettyRpcEnv.scala:144)
  at scala.concurrent.Future.$anonfun$flatMap$1(Future.scala:307)
  at 
scala.concurrent.impl.Promise.$anonfun$transformWith$1(Promise.scala:41)
  at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
  at org.apache.spark.util.ThreadUtils$$anon$1.execute(ThreadUtils.scala:99)
  at 
scala.concurrent.impl.ExecutionContextImpl$$anon$4.execute(ExecutionContextImpl.scala:138)
  at 
scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:72)
  at 
scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:288)
  at 
scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:288)
  at 
scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:288)
```
 

[spark] branch master updated (0e2d604fd33 -> 3809ccdca6e)

2022-12-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 0e2d604fd33 [SPARK-41406][SQL] Refactor error message for 
`NUM_COLUMNS_MISMATCH` to make it more generic
 add 3809ccdca6e [SPARK-41478][SQL] Assign a name to the error class 
_LEGACY_ERROR_TEMP_1234

No new revisions were added by this update.

Summary of changes:
 core/src/main/resources/error/error-classes.json   | 10 +-
 .../spark/sql/errors/QueryCompilationErrors.scala  |  4 ++--
 .../spark/sql/StatisticsCollectionSuite.scala  | 23 +-
 .../apache/spark/sql/execution/SQLViewSuite.scala  | 11 +++
 4 files changed, 28 insertions(+), 20 deletions(-)





[spark] branch master updated: [SPARK-41406][SQL] Refactor error message for `NUM_COLUMNS_MISMATCH` to make it more generic

2022-12-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 0e2d604fd33 [SPARK-41406][SQL] Refactor error message for 
`NUM_COLUMNS_MISMATCH` to make it more generic
0e2d604fd33 is described below

commit 0e2d604fd33c8236cfa8ae243eeaec42d3176a06
Author: panbingkun 
AuthorDate: Tue Dec 13 14:02:36 2022 +0300

[SPARK-41406][SQL] Refactor error message for `NUM_COLUMNS_MISMATCH` to 
make it more generic

### What changes were proposed in this pull request?
This PR aims to refactor the error message for `NUM_COLUMNS_MISMATCH` to 
make it more generic.

### Why are the changes needed?
The changes improve the error framework.

### Does this PR introduce _any_ user-facing change?
Yes.

### How was this patch tested?
Updated existing UTs.
Pass GA.

Closes #38937 from panbingkun/SPARK-41406.

Authored-by: panbingkun 
Signed-off-by: Max Gekk 
---
 core/src/main/resources/error/error-classes.json   |   2 +-
 .../sql/catalyst/analysis/CheckAnalysis.scala  |   4 +-
 .../plans/logical/basicLogicalOperators.scala  |   4 +-
 .../resources/sql-tests/results/except-all.sql.out |   6 +-
 .../sql-tests/results/intersect-all.sql.out|   6 +-
 .../native/widenSetOperationTypes.sql.out  | 140 ++---
 .../sql-tests/results/udf/udf-except-all.sql.out   |   6 +-
 .../results/udf/udf-intersect-all.sql.out  |   6 +-
 .../spark/sql/DataFrameSetOperationsSuite.scala|   9 +-
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala |  22 +++-
 10 files changed, 110 insertions(+), 95 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json 
b/core/src/main/resources/error/error-classes.json
index e76328e970d..6faaf0af35f 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -943,7 +943,7 @@
   },
   "NUM_COLUMNS_MISMATCH" : {
 "message" : [
-  " can only be performed on tables with the same number of 
columns, but the first table has  columns and the 
 table has  columns."
+  " can only be performed on inputs with the same number of 
columns, but the first input has  columns and the 
 input has  columns."
 ]
   },
   "ORDER_BY_POS_OUT_OF_RANGE" : {
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index 12dac5c632a..be812adaaa1 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
@@ -552,7 +552,7 @@ trait CheckAnalysis extends PredicateHelper with 
LookupCatalog with QueryErrorsB
   errorClass = "NUM_COLUMNS_MISMATCH",
   messageParameters = Map(
 "operator" -> toSQLStmt(operator.nodeName),
-"refNumColumns" -> ref.length.toString,
+"firstNumColumns" -> ref.length.toString,
 "invalidOrdinalNum" -> ordinalNumber(ti + 1),
 "invalidNumColumns" -> child.output.length.toString))
   }
@@ -565,7 +565,7 @@ trait CheckAnalysis extends PredicateHelper with 
LookupCatalog with QueryErrorsB
   e.failAnalysis(
 errorClass = "_LEGACY_ERROR_TEMP_2430",
 messageParameters = Map(
-  "operator" -> operator.nodeName,
+  "operator" -> toSQLStmt(operator.nodeName),
   "ci" -> ordinalNumber(ci),
   "ti" -> ordinalNumber(ti + 1),
   "dt1" -> dt1.catalogString,
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
index 60586e4166c..878ad91c088 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
@@ -342,7 +342,7 @@ case class Intersect(
 right: LogicalPlan,
 isAll: Boolean) extends SetOperation(left, right) {
 
-  override def nodeName: String = getClass.getSimpleName + ( if ( isAll ) 
"All" else "" )
+  override def nodeName: String = getClass.getSimpleName + ( if ( isAll ) " 
All" else "" )
 
   final override val nodePatterns: Seq[TreePattern] = Seq(INTERSECT)
 
@@ -372,7 +372,7 @@ case class Except(
 left: LogicalPlan,
 right: LogicalPlan,
 isAll: Boolean) extends SetOperation(left, right) {
-  override def nodeName: String = 

[spark] branch master updated: [SPARK-41468][SQL][FOLLOWUP] Handle NamedLambdaVariables in EquivalentExpressions

2022-12-13 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 27f4d1ef848 [SPARK-41468][SQL][FOLLOWUP] Handle NamedLambdaVariables 
in EquivalentExpressions
27f4d1ef848 is described below

commit 27f4d1ef848caf357faaf90d7ee4f625e0a3b5d3
Author: Peter Toth 
AuthorDate: Tue Dec 13 17:05:08 2022 +0800

[SPARK-41468][SQL][FOLLOWUP] Handle NamedLambdaVariables in 
EquivalentExpressions

### What changes were proposed in this pull request?
This is a follow-up PR to https://github.com/apache/spark/pull/39010 to 
handle `NamedLambdaVariable`s too.

### Why are the changes needed?
To avoid possible issues with higher-order functions.
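
As a rough illustration of the kind of expression involved (not a test from the PR), here is a higher-order function whose lambda binds a `NamedLambdaVariable`; subexpression elimination must not try to evaluate such sub-expressions ahead of the lambda's execution. A SparkSession named `spark` is assumed.

```scala
// Illustrative only: `transform` binds `x` as a NamedLambdaVariable, so the repeated
// `x + 1` sub-expression cannot be hoisted out and evaluated up front.
val df = spark.range(1).selectExpr(
  "transform(array(1, 2, 3), x -> (x + 1) * (x + 1)) AS squared_shifted")
df.show()
```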

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing UTs.

Closes #39046 from 
peter-toth/SPARK-41468-fix-planexpressions-in-equivalentexpressions-follow-up.

Authored-by: Peter Toth 
Signed-off-by: Wenchen Fan 
---
 .../spark/sql/catalyst/expressions/EquivalentExpressions.scala   | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
index 3ffd9f9d887..330d66a21be 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
@@ -144,9 +144,10 @@ class EquivalentExpressions {
 
   private def supportedExpression(e: Expression) = {
 !e.exists {
-  // `LambdaVariable` is usually used as a loop variable, which can't be 
evaluated ahead of the
-  // loop. So we can't evaluate sub-expressions containing 
`LambdaVariable` at the beginning.
+  // `LambdaVariable` is usually used as a loop variable and 
`NamedLambdaVariable` is used in
+  // higher-order functions, which can't be evaluated ahead of the 
execution.
   case _: LambdaVariable => true
+  case _: NamedLambdaVariable => true
 
   // `PlanExpression` wraps query plan. To compare query plans of 
`PlanExpression` on executor,
   // can cause error like NPE.

