[spark] branch master updated: [SPARK-38739][SQL][TESTS] Test the error class: INVALID_SYNTAX_FOR_CAST

2022-05-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 7221ea31b6b [SPARK-38739][SQL][TESTS] Test the error class: 
INVALID_SYNTAX_FOR_CAST
7221ea31b6b is described below

commit 7221ea31b6bbad0d87b22e5413b8979bee56321c
Author: panbingkun 
AuthorDate: Fri May 13 23:20:42 2022 +0300

[SPARK-38739][SQL][TESTS] Test the error class: INVALID_SYNTAX_FOR_CAST

### What changes were proposed in this pull request?
This PR aims to add a test for the error class INVALID_SYNTAX_FOR_CAST to 
`QueryExecutionAnsiErrorsSuite`. Also, the unused overload of 
`invalidInputSyntaxForNumericError` in `QueryExecutionErrors` is removed.

### Why are the changes needed?
The changes improve test coverage, and document expected error messages in 
tests.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
By running the new test:
```
$ build/sbt "test:testOnly *QueryExecutionAnsiErrorsSuite"
```

Closes #36493 from panbingkun/SPARK-38739.

Authored-by: panbingkun 
Signed-off-by: Max Gekk 
---
 .../apache/spark/sql/errors/QueryExecutionErrors.scala  |  9 +
 .../sql/errors/QueryExecutionAnsiErrorsSuite.scala  | 17 -
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
index 447a820a128..e687417d7cc 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
@@ -115,17 +115,10 @@ object QueryExecutionErrors extends QueryErrorsBase {
 context))
   }
 
-  def invalidInputSyntaxForNumericError(
-  e: NumberFormatException,
-  errorContext: String): NumberFormatException = {
-new NumberFormatException(s"${e.getMessage}. To return NULL instead, use 
'try_cast'. " +
-  s"If necessary set ${SQLConf.ANSI_ENABLED.key} to false to bypass this 
error." + errorContext)
-  }
-
   def invalidInputSyntaxForNumericError(
   to: DataType,
   s: UTF8String,
-  errorContext: String): NumberFormatException = {
+  errorContext: String): SparkNumberFormatException = {
 new SparkNumberFormatException(errorClass = "INVALID_SYNTAX_FOR_CAST",
   messageParameters = Array(toSQLType(to), toSQLValue(s, StringType),
 SQLConf.ANSI_ENABLED.key, errorContext))
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionAnsiErrorsSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionAnsiErrorsSuite.scala
index 78b78f99ab0..8aef4c6f345 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionAnsiErrorsSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionAnsiErrorsSuite.scala
@@ -16,7 +16,7 @@
  */
 package org.apache.spark.sql.errors
 
-import org.apache.spark.{SparkArithmeticException, 
SparkArrayIndexOutOfBoundsException, SparkConf, SparkDateTimeException, 
SparkNoSuchElementException}
+import org.apache.spark.{SparkArithmeticException, 
SparkArrayIndexOutOfBoundsException, SparkConf, SparkDateTimeException, 
SparkNoSuchElementException, SparkNumberFormatException}
 import org.apache.spark.sql.QueryTest
 import org.apache.spark.sql.internal.SQLConf
 
@@ -124,4 +124,19 @@ class QueryExecutionAnsiErrorsSuite extends QueryTest with 
QueryErrorsSuiteBase
   |""".stripMargin
 )
   }
+
+  test("INVALID_SYNTAX_FOR_CAST: cast string to double") {
+checkErrorClass(
+  exception = intercept[SparkNumberFormatException] {
+sql("select CAST('xe23' AS DOUBLE)").collect()
+  },
+  errorClass = "INVALID_SYNTAX_FOR_CAST",
+  msg = """Invalid input syntax for type "DOUBLE": 'xe23'. """ 
+
+"""To return NULL instead, use 'try_cast'. If necessary set """ +
+"""spark.sql.ansi.enabled to false to bypass this error.
+  |== SQL(line 1, position 7) ==
+  |select CAST('xe23' AS DOUBLE)
+  |   ^^
+  |""".stripMargin)
+  }
 }





[spark] branch master updated (3d9049533d8 -> 23d0879a585)

2022-05-13 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from 3d9049533d8 [SPARK-39181][SQL] `SessionCatalog.reset` should not drop 
temp functions twice
 add 23d0879a585 [SPARK-39182][BUILD] Upgrade to Arrow 8.0.0

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2-hive-2.3 | 8 
 dev/deps/spark-deps-hadoop-3-hive-2.3 | 8 
 pom.xml   | 2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)





[spark] branch master updated: [SPARK-39181][SQL] `SessionCatalog.reset` should not drop temp functions twice

2022-05-13 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3d9049533d8 [SPARK-39181][SQL] `SessionCatalog.reset` should not drop 
temp functions twice
3d9049533d8 is described below

commit 3d9049533d8bc75cc2c4832dc2fd6f8ac53efe21
Author: Wenchen Fan 
AuthorDate: Fri May 13 10:21:15 2022 -0700

[SPARK-39181][SQL] `SessionCatalog.reset` should not drop temp functions 
twice

### What changes were proposed in this pull request?

`SessionCatalog.reset` is a test-only API and it currently drops the temp 
functions twice:
1. once with `listFunctions(DEFAULT_DATABASE)...`, which lists temp functions 
as well
2. once with `functionRegistry.clear()`

This PR changes `listFunctions` to `externalCatalog.listFunctions`, which 
only lists permanent functions.
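
For illustration (not part of the patch), the difference between the two 
listings can be seen from a Spark shell; the sketch below assumes a running 
SparkSession named `spark`:
```scala
// Illustration only, not from the patch; assumes a SparkSession named `spark`.
// Register a temporary (session-scoped) function.
spark.udf.register("my_temp_fn", (x: Int) => x + 1)

// Session-level listing returns permanent functions of `default` AND temp functions,
// so dropping everything it returns already touches temp functions once...
spark.sessionState.catalog.listFunctions("default").map(_._1).foreach(println)

// ...while the external catalog only knows about permanent functions in `default`;
// temp functions are left to `functionRegistry.clear()`.
spark.sharedState.externalCatalog.listFunctions("default", "*").foreach(println)
```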

### Why are the changes needed?

Code simplification; it probably also makes tests run a bit faster.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

N/A

Closes #36542 from cloud-fan/minor.

Authored-by: Wenchen Fan 
Signed-off-by: Dongjoon Hyun 
---
 .../apache/spark/sql/catalyst/catalog/SessionCatalog.scala| 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
index 6b7f8a207d6..d6c80f98bf7 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
@@ -1818,13 +1818,10 @@ class SessionCatalog(
 listTables(DEFAULT_DATABASE).foreach { table =>
   dropTable(table, ignoreIfNotExists = false, purge = false)
 }
-listFunctions(DEFAULT_DATABASE).map(_._1).foreach { func =>
-  if (func.database.isDefined) {
-dropFunction(func, ignoreIfNotExists = false)
-  } else {
-dropTempFunction(func.funcName, ignoreIfNotExists = false)
-  }
-}
+// Temp functions are dropped below, we only need to drop permanent 
functions here.
+externalCatalog.listFunctions(DEFAULT_DATABASE, "*").map { f =>
+  FunctionIdentifier(f, Some(DEFAULT_DATABASE))
+}.foreach(dropFunction(_, ignoreIfNotExists = false))
 clearTempTables()
 globalTempViewManager.clear()
 functionRegistry.clear()





[spark] branch master updated: [SPARK-38751][SQL][TESTS] Test the error class: UNRECOGNIZED_SQL_TYPE

2022-05-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new bbf3a2eafa0 [SPARK-38751][SQL][TESTS] Test the error class: 
UNRECOGNIZED_SQL_TYPE
bbf3a2eafa0 is described below

commit bbf3a2eafa004f712799261ef883dcc457a072fd
Author: panbingkun 
AuthorDate: Fri May 13 19:29:02 2022 +0300

[SPARK-38751][SQL][TESTS] Test the error class: UNRECOGNIZED_SQL_TYPE

### What changes were proposed in this pull request?
This PR aims to add a test for the error class UNRECOGNIZED_SQL_TYPE to 
`QueryExecutionErrorsSuite`.
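
The new test simulates a JDBC column that reports the unrecognized SQL type 
code -100. A minimal sketch of that idea (illustration only; it assumes 
Mockito on the test classpath and is not the exact test body):
```scala
// Illustration only; assumes Mockito on the test classpath.
import java.sql.ResultSetMetaData

import org.mockito.Mockito.{mock, when}

// Make JDBC metadata report the made-up type code -100 for the only column.
val metadata = mock(classOf[ResultSetMetaData])
when(metadata.getColumnCount).thenReturn(1)
when(metadata.getColumnType(1)).thenReturn(-100) // not a java.sql.Types constant

// When Spark maps such metadata to a Catalyst schema and no JDBC dialect handles the
// type code, a SparkSQLException with the UNRECOGNIZED_SQL_TYPE error class is expected,
// which is what the new test asserts.
```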

### Why are the changes needed?
The changes improve test coverage, and document expected error messages in 
tests.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
By running the new test:
```
$ build/sbt "sql/testOnly *QueryExecutionErrorsSuite*"
```

Closes #36463 from panbingkun/SPARK-38751.

Authored-by: panbingkun 
Signed-off-by: Max Gekk 
---
 .../sql/errors/QueryExecutionErrorsSuite.scala | 89 +-
 1 file changed, 86 insertions(+), 3 deletions(-)

diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala
index 7a5592c148a..cf1551298a8 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryExecutionErrorsSuite.scala
@@ -19,23 +19,27 @@ package org.apache.spark.sql.errors
 
 import java.io.IOException
 import java.net.URL
-import java.util.{Locale, ServiceConfigurationError}
+import java.sql.{Connection, DriverManager, PreparedStatement, ResultSet, 
ResultSetMetaData}
+import java.util.{Locale, Properties, ServiceConfigurationError}
 
 import org.apache.hadoop.fs.{LocalFileSystem, Path}
 import org.apache.hadoop.fs.permission.FsPermission
+import org.mockito.Mockito.{mock, when}
 import test.org.apache.spark.sql.connector.JavaSimpleWritableDataSource
 
-import org.apache.spark.{SparkArithmeticException, 
SparkClassNotFoundException, SparkException, SparkIllegalArgumentException, 
SparkIllegalStateException, SparkRuntimeException, SparkSecurityException, 
SparkUnsupportedOperationException, SparkUpgradeException}
+import org.apache.spark.{SparkArithmeticException, 
SparkClassNotFoundException, SparkException, SparkIllegalArgumentException, 
SparkIllegalStateException, SparkRuntimeException, SparkSecurityException, 
SparkSQLException, SparkUnsupportedOperationException, SparkUpgradeException}
 import org.apache.spark.sql.{AnalysisException, DataFrame, QueryTest, SaveMode}
 import org.apache.spark.sql.catalyst.util.BadRecordException
 import org.apache.spark.sql.connector.SimpleWritableDataSource
 import org.apache.spark.sql.execution.QueryExecutionException
+import org.apache.spark.sql.execution.datasources.jdbc.{DriverRegistry, 
JDBCOptions}
 import org.apache.spark.sql.execution.datasources.orc.OrcTest
 import org.apache.spark.sql.execution.datasources.parquet.ParquetTest
 import org.apache.spark.sql.functions.{lit, lower, struct, sum, udf}
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.internal.SQLConf.LegacyBehaviorPolicy.EXCEPTION
-import org.apache.spark.sql.types.{DecimalType, StructType, TimestampType}
+import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}
+import org.apache.spark.sql.types.{DataType, DecimalType, MetadataBuilder, 
StructType, TimestampType}
 import org.apache.spark.sql.util.ArrowUtils
 import org.apache.spark.util.Utils
 
@@ -514,6 +518,85 @@ class QueryExecutionErrorsSuite
   "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister")
 }
   }
+
+  test("UNRECOGNIZED_SQL_TYPE: unrecognized SQL type -100") {
+Utils.classForName("org.h2.Driver")
+
+val properties = new Properties()
+properties.setProperty("user", "testUser")
+properties.setProperty("password", "testPass")
+
+val url = "jdbc:h2:mem:testdb0"
+val urlWithUserAndPass = 
"jdbc:h2:mem:testdb0;user=testUser;password=testPass"
+val tableName = "test.table1"
+val unrecognizedColumnType = -100
+
+var conn: java.sql.Connection = null
+try {
+  conn = DriverManager.getConnection(url, properties)
+  conn.prepareStatement("create schema test").executeUpdate()
+  conn.commit()
+
+  conn.prepareStatement(s"create table $tableName (a INT)").executeUpdate()
+  conn.prepareStatement(
+s"insert into $tableName values (1)").executeUpdate()
+  conn.commit()
+} finally {
+  if (null != conn) {
+conn.close()
+  }
+}
+
+val testH2DialectUnrecognizedSQLType = new JdbcDialect {
+  override def canHandle(url: String): 

[spark] branch master updated: [SPARK-39138][SQL] Add ANSI general value specification and function - user

2022-05-13 Thread yao
This is an automated email from the ASF dual-hosted git repository.

yao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new aabe3a75454 [SPARK-39138][SQL] Add ANSI general value specification 
and function - user
aabe3a75454 is described below

commit aabe3a75454462c9ce665a3d947a7edb91f37d13
Author: Kent Yao 
AuthorDate: Sat May 14 00:03:23 2022 +0800

[SPARK-39138][SQL] Add ANSI general value specification and function - user

### What changes were proposed in this pull request?

Add ANSI general value specification and function - user

### Why are the changes needed?

According to ANSI SQL,
```
CURRENT_USER and USER are semantically the same
```

USER is also supported by other systems like MySQL, PostgreSQL, Hive, etc.

### Does this PR introduce _any_ user-facing change?

A new function is added.

In ANSI mode, if `enforceReservedKeywords` is set, USER is always a reserved 
keyword; otherwise it is resolved as a literal function (CURRENT_USER) when no 
attribute with that name matches.
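
A usage sketch (illustration only; assumes a SparkSession named `spark` built 
with this change):
```scala
// Illustration only; assumes a SparkSession named `spark` that includes this change.
// `user` is registered as an alias of `current_user`, so both are expected to return
// the same value: the session user name.
spark.sql("SELECT current_user(), user()").show(truncate = false)

// With spark.sql.ansi.enabled=true and spark.sql.ansi.enforceReservedKeywords=true,
// USER becomes a reserved keyword and can no longer be used as an identifier.
```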

### How was this patch tested?

new tests

Closes #36497 from yaooqinn/SPARK-39138.

Authored-by: Kent Yao 
Signed-off-by: Kent Yao 
---
 .../spark/sql/catalyst/parser/SqlBaseParser.g4 |  2 +-
 .../spark/sql/catalyst/analysis/Analyzer.scala |  1 +
 .../sql/catalyst/analysis/FunctionRegistry.scala   |  1 +
 .../spark/sql/catalyst/parser/AstBuilder.scala |  2 +-
 .../sql-functions/sql-expression-schema.md |  1 +
 .../org/apache/spark/sql/MiscFunctionsSuite.scala  | 15 +++--
 .../ThriftServerWithSparkContextSuite.scala| 25 ++
 7 files changed, 30 insertions(+), 17 deletions(-)

diff --git 
a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
 
b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
index 7eace05b6b2..ed57e9062c1 100644
--- 
a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
+++ 
b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
@@ -816,7 +816,7 @@ datetimeUnit
 ;
 
 primaryExpression
-: name=(CURRENT_DATE | CURRENT_TIMESTAMP | CURRENT_USER)   
#currentLike
+: name=(CURRENT_DATE | CURRENT_TIMESTAMP | CURRENT_USER | USER)
   #currentLike
 | name=(TIMESTAMPADD | DATEADD) LEFT_PAREN unit=datetimeUnit COMMA 
unitsAmount=valueExpression COMMA timestamp=valueExpression RIGHT_PAREN 
#timestampadd
 | name=(TIMESTAMPDIFF | DATEDIFF) LEFT_PAREN unit=datetimeUnit COMMA 
startTimestamp=valueExpression COMMA endTimestamp=valueExpression RIGHT_PAREN   
 #timestampdiff
 | CASE whenClause+ (ELSE elseExpression=expression)? END   
#searchedCase
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 817a62fd1d8..20c1756ef4e 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -1682,6 +1682,7 @@ class Analyzer(override val catalogManager: 
CatalogManager)
 (CurrentDate().prettyName, () => CurrentDate(), toPrettySQL(_)),
 (CurrentTimestamp().prettyName, () => CurrentTimestamp(), toPrettySQL(_)),
 (CurrentUser().prettyName, () => CurrentUser(), toPrettySQL),
+("user", () => CurrentUser(), toPrettySQL),
 (VirtualColumn.hiveGroupingIdName, () => GroupingID(Nil), _ => 
VirtualColumn.hiveGroupingIdName)
   )
 
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
index 50f376c0ce6..5084753d2d4 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
@@ -714,6 +714,7 @@ object FunctionRegistry {
 expression[CurrentDatabase]("current_database"),
 expression[CurrentCatalog]("current_catalog"),
 expression[CurrentUser]("current_user"),
+expression[CurrentUser]("user", setAlias = true),
 expression[CallMethodViaReflection]("reflect"),
 expression[CallMethodViaReflection]("java_method", true),
 expression[SparkVersion]("version"),
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
index e6f7dba863b..ff3b99fb815 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
+++ 

[spark] branch branch-3.3 updated: [SPARK-39157][SQL] H2Dialect should override getJDBCType so as make the data type is correct

2022-05-13 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 30834b847e7 [SPARK-39157][SQL] H2Dialect should override getJDBCType 
so as make the data type is correct
30834b847e7 is described below

commit 30834b847e7577cf694558d43fb618fc0b1eb09e
Author: Jiaan Geng 
AuthorDate: Fri May 13 22:01:04 2022 +0800

[SPARK-39157][SQL] H2Dialect should override getJDBCType so as make the 
data type is correct

### What changes were proposed in this pull request?
Currently, `H2Dialect` does not implement `getJDBCType` of `JdbcDialect`, so 
DS V2 push-down throws the exception shown below:
```
Job aborted due to stage failure: Task 0 in stage 13.0 failed 1 times, most 
recent failure: Lost task 0.0 in stage 13.0 (TID 13) (jiaan-gengdembp executor 
driver):
 org.h2.jdbc.JdbcSQLNonTransientException: Unknown data type: "STRING"; SQL 
statement:
SELECT "DEPT","NAME","SALARY","BONUS","IS_MANAGER" FROM "test"."employee"  
WHERE ("BONUS" IS NOT NULL) AND ("DEPT" IS NOT NULL) AND (CAST("BONUS" AS 
string) LIKE '%30%') AND (CAST("DEPT" AS byte) > 1) AND (CAST("DEPT" AS short) 
> 1) AND (CAST("BONUS" AS decimal(20,2)) > 1200.00)[50004-210]
```
H2Dialect should implement `getJDBCType` of `JdbcDialect`.

### Why are the changes needed?
Make the H2 data type mapping correct.

### Does this PR introduce _any_ user-facing change?
'Yes'.
It fixes a bug in `H2Dialect`.

### How was this patch tested?
New tests.

Closes #36516 from beliefer/SPARK-39157.

Authored-by: Jiaan Geng 
Signed-off-by: Wenchen Fan 
(cherry picked from commit fa3f096e02d408fbeab5f69af451ef8bc8f5b3db)
Signed-off-by: Wenchen Fan 
---
 .../scala/org/apache/spark/sql/jdbc/H2Dialect.scala | 13 -
 .../scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala   | 17 +
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala
index 0aa971c0d3a..56cadbe8e2c 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala
@@ -17,7 +17,7 @@
 
 package org.apache.spark.sql.jdbc
 
-import java.sql.SQLException
+import java.sql.{SQLException, Types}
 import java.util.Locale
 
 import scala.util.control.NonFatal
@@ -27,6 +27,8 @@ import 
org.apache.spark.sql.catalyst.analysis.{NoSuchNamespaceException, NoSuchT
 import org.apache.spark.sql.connector.expressions.Expression
 import org.apache.spark.sql.connector.expressions.aggregate.{AggregateFunc, 
GeneralAggregateFunc}
 import org.apache.spark.sql.errors.QueryCompilationErrors
+import org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils
+import org.apache.spark.sql.types.{BooleanType, ByteType, DataType, 
DecimalType, ShortType, StringType}
 
 private object H2Dialect extends JdbcDialect {
   override def canHandle(url: String): Boolean =
@@ -90,6 +92,15 @@ private object H2Dialect extends JdbcDialect {
 )
   }
 
+  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
+case StringType => Option(JdbcType("CLOB", Types.CLOB))
+case BooleanType => Some(JdbcType("BOOLEAN", Types.BOOLEAN))
+case ShortType | ByteType => Some(JdbcType("SMALLINT", Types.SMALLINT))
+case t: DecimalType => Some(
+  JdbcType(s"NUMERIC(${t.precision},${t.scale})", Types.NUMERIC))
+case _ => JdbcUtils.getCommonJDBCType(dt)
+  }
+
   override def classifyException(message: String, e: Throwable): 
AnalysisException = {
 e match {
   case exception: SQLException =>
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala
index b6f36b912f8..91526cef507 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala
@@ -466,6 +466,23 @@ class JDBCV2Suite extends QueryTest with 
SharedSparkSession with ExplainSuiteHel
 checkFiltersRemoved(df7, false)
 checkPushedInfo(df7, "PushedFilters: [DEPT IS NOT NULL]")
 checkAnswer(df7, Seq(Row(6, "jen", 12000, 1200, true)))
+
+val df8 = sql(
+  """
+|SELECT * FROM h2.test.employee
+|WHERE cast(bonus as string) like '%30%'
+|AND cast(dept as byte) > 1
+|AND cast(dept as short) > 1
+|AND cast(bonus as decimal(20, 2)) > 1200""".stripMargin)
+checkFiltersRemoved(df8, ansiMode)
+val expectedPlanFragment8 = if (ansiMode) {
+  "PushedFilters: [BONUS IS NOT NULL, DEPT IS NOT NULL, " +
+"CAST(BONUS AS string) 

[spark] branch master updated (d7317b03e97 -> fa3f096e02d)

2022-05-13 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from d7317b03e97 [SPARK-39178][CORE] SparkFatalException should show root 
cause when print error stack
 add fa3f096e02d [SPARK-39157][SQL] H2Dialect should override getJDBCType 
so as make the data type is correct

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/jdbc/H2Dialect.scala | 13 -
 .../scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala   | 17 +
 2 files changed, 29 insertions(+), 1 deletion(-)





[spark] branch branch-3.3 updated: [SPARK-39178][CORE] SparkFatalException should show root cause when print error stack

2022-05-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new e743e68ce62 [SPARK-39178][CORE] SparkFatalException should show root 
cause when print error stack
e743e68ce62 is described below

commit e743e68ce62e18ced6c49a22f5d101c72b7bfbe2
Author: Angerszh 
AuthorDate: Fri May 13 16:47:11 2022 +0300

[SPARK-39178][CORE] SparkFatalException should show root cause when print 
error stack

### What changes were proposed in this pull request?
A user hit a case when running a broadcast: a `SparkFatalException` was 
thrown, but the error stack did not show the root cause.

### Why are the changes needed?
Make the exception clearer.

### Does this PR introduce _any_ user-facing change?
Users can see the root cause when the application throws `SparkFatalException`.

### How was this patch tested?
With a unit test:
```
  test("") {
throw new SparkFatalException(
  new OutOfMemoryError("Not enough memory to build and broadcast the 
table to all " +
  "worker nodes. As a workaround, you can either disable broadcast by 
setting " +
  s"driver memory by setting ${SparkLauncher.DRIVER_MEMORY} to a higher 
value.")
  .initCause(null))
  }
```

Before this pr:
```
[info]   org.apache.spark.util.SparkFatalException:
[info]   at 
org.apache.spark.SparkContextSuite.$anonfun$new$1(SparkContextSuite.scala:59)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:190)
[info]   at 
org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:203)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:188)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:200)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:200)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:182)
[info]   at 
org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:64)
[info]   at 
org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
[info]   at 
org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
[info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:64)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:233)
[info]   at 
org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
[info]   at scala.collection.immutable.List.foreach(List.scala:431)
[info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
[info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:233)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:232)
[info]   at 
org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1563)
[info]   at org.scalatest.Suite.run(Suite.scala:1112)
```

After this pr:
```
[info]   org.apache.spark.util.SparkFatalException: 
java.lang.OutOfMemoryError: Not enough memory to build and broadcast the table 
to all worker nodes. As a workaround, you can either disable broadcast by 
setting driver memory by setting spark.driver.memory to a higher value.
[info]   at 
org.apache.spark.SparkContextSuite.$anonfun$new$1(SparkContextSuite.scala:59)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:190)
[info]   at 
org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:203)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:188)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:200)
[info]   at 

[spark] branch master updated: [SPARK-39178][CORE] SparkFatalException should show root cause when print error stack

2022-05-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new d7317b03e97 [SPARK-39178][CORE] SparkFatalException should show root 
cause when print error stack
d7317b03e97 is described below

commit d7317b03e975f8dc1a8c276dd0a931e00c478717
Author: Angerszh 
AuthorDate: Fri May 13 16:47:11 2022 +0300

[SPARK-39178][CORE] SparkFatalException should show root cause when print 
error stack

### What changes were proposed in this pull request?
A user hit a case when running a broadcast: a `SparkFatalException` was 
thrown, but the error stack did not show the root cause.

### Why are the changes needed?
Make the exception clearer.

### Does this PR introduce _any_ user-facing change?
Users can see the root cause when the application throws `SparkFatalException`.

### How was this patch tested?
With a unit test:
```
  test("") {
throw new SparkFatalException(
  new OutOfMemoryError("Not enough memory to build and broadcast the 
table to all " +
  "worker nodes. As a workaround, you can either disable broadcast by 
setting " +
  s"driver memory by setting ${SparkLauncher.DRIVER_MEMORY} to a higher 
value.")
  .initCause(null))
  }
```

Before this pr:
```
[info]   org.apache.spark.util.SparkFatalException:
[info]   at 
org.apache.spark.SparkContextSuite.$anonfun$new$1(SparkContextSuite.scala:59)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:190)
[info]   at 
org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:203)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:188)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:200)
[info]   at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTest(AnyFunSuiteLike.scala:200)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTest$(AnyFunSuiteLike.scala:182)
[info]   at 
org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:64)
[info]   at 
org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:234)
[info]   at 
org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:227)
[info]   at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:64)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTests$1(AnyFunSuiteLike.scala:233)
[info]   at 
org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:413)
[info]   at scala.collection.immutable.List.foreach(List.scala:431)
[info]   at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401)
[info]   at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:396)
[info]   at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:475)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTests(AnyFunSuiteLike.scala:233)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.runTests$(AnyFunSuiteLike.scala:232)
[info]   at 
org.scalatest.funsuite.AnyFunSuite.runTests(AnyFunSuite.scala:1563)
[info]   at org.scalatest.Suite.run(Suite.scala:1112)
```

After this pr:
```
[info]   org.apache.spark.util.SparkFatalException: 
java.lang.OutOfMemoryError: Not enough memory to build and broadcast the table 
to all worker nodes. As a workaround, you can either disable broadcast by 
setting driver memory by setting spark.driver.memory to a higher value.
[info]   at 
org.apache.spark.SparkContextSuite.$anonfun$new$1(SparkContextSuite.scala:59)
[info]   at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike$$anon$1.apply(AnyFunSuiteLike.scala:190)
[info]   at 
org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:203)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.invokeWithFixture$1(AnyFunSuiteLike.scala:188)
[info]   at 
org.scalatest.funsuite.AnyFunSuiteLike.$anonfun$runTest$1(AnyFunSuiteLike.scala:200)
[info]   at 

[spark] branch master updated: [SPARK-39177][SQL] Provide query context on map key not exists error when WSCG is off

2022-05-13 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 1afddf40743 [SPARK-39177][SQL] Provide query context on map key not 
exists error when WSCG is off
1afddf40743 is described below

commit 1afddf407436c3b315ec601fab5a4a1b2028e672
Author: Gengliang Wang 
AuthorDate: Fri May 13 21:45:06 2022 +0800

[SPARK-39177][SQL] Provide query context on map key not exists error when 
WSCG is off

### What changes were proposed in this pull request?

Similar to https://github.com/apache/spark/pull/36525, this PR provides 
query context for "map key not exists" runtime error when WSCG is off.

### Why are the changes needed?

Enhance the query context for the "map key not exists" runtime error. After 
the changes, it is also available when whole-stage codegen is off.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT
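
A reproduction sketch (illustration only; assumes a SparkSession named `spark`):
```scala
// Illustration only; assumes a SparkSession named `spark`.
// Turn off whole-stage codegen and enable ANSI mode, so a missing map key fails at runtime.
spark.conf.set("spark.sql.codegen.wholeStage", "false")
spark.conf.set("spark.sql.ansi.enabled", "true")

// Before this change the runtime error carried no query context when codegen was off;
// after it, the message is also expected to point at the failing fragment of the SQL text.
spark.sql("SELECT map(1, 'a')[5]").collect()
```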

Closes #36538 from gengliangwang/fixMapKeyContext.

Authored-by: Gengliang Wang 
Signed-off-by: Gengliang Wang 
---
 .../catalyst/expressions/collectionOperations.scala   |  6 ++
 .../catalyst/expressions/complexTypeExtractors.scala  | 13 ++---
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala| 19 +++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
index 1b42ea5eb87..1bd934214f5 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
@@ -2261,6 +2261,12 @@ case class ElementAt(
 
   override protected def withNewChildrenInternal(
 newLeft: Expression, newRight: Expression): ElementAt = copy(left = 
newLeft, right = newRight)
+
+  override def initQueryContext(): String = if (failOnError) {
+origin.context
+  } else {
+""
+  }
 }
 
 /**
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
index 45661c00c51..b84050c1837 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
@@ -339,7 +339,8 @@ trait GetArrayItemUtil {
 /**
  * Common trait for [[GetMapValue]] and [[ElementAt]].
  */
-trait GetMapValueUtil extends BinaryExpression with ImplicitCastInputTypes {
+trait GetMapValueUtil
+  extends BinaryExpression with ImplicitCastInputTypes with 
SupportQueryContext {
 
   // todo: current search is O(n), improve it.
   def getValueEval(
@@ -365,7 +366,7 @@ trait GetMapValueUtil extends BinaryExpression with 
ImplicitCastInputTypes {
 
 if (!found) {
   if (failOnError) {
-throw QueryExecutionErrors.mapKeyNotExistError(ordinal, keyType, 
origin.context)
+throw QueryExecutionErrors.mapKeyNotExistError(ordinal, keyType, 
queryContext)
   } else {
 null
   }
@@ -398,7 +399,7 @@ trait GetMapValueUtil extends BinaryExpression with 
ImplicitCastInputTypes {
 }
 
 val keyJavaType = CodeGenerator.javaType(keyType)
-lazy val errorContext = ctx.addReferenceObj("errCtx", origin.context)
+lazy val errorContext = ctx.addReferenceObj("errCtx", queryContext)
 val keyDt = ctx.addReferenceObj("keyType", keyType, 
keyType.getClass.getName)
 nullSafeCodeGen(ctx, ev, (eval1, eval2) => {
   val keyNotFoundBranch = if (failOnError) {
@@ -488,4 +489,10 @@ case class GetMapValue(
   override protected def withNewChildrenInternal(
   newLeft: Expression, newRight: Expression): GetMapValue =
 copy(child = newLeft, key = newRight)
+
+  override def initQueryContext(): String = if (failOnError) {
+origin.context
+  } else {
+""
+  }
 }
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
index c0f609ad817..0355bd90d04 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
@@ -4357,6 +4357,25 @@ class SQLQuerySuite extends QueryTest with 
SharedSparkSession with AdaptiveSpark
 }
   }
 
+  test("SPARK-39177: Query context of getting map value should be serialized 
to executors" +
+" when WSCG is off") {
+withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "false",
+  SQLConf.ANSI_ENABLED.key -> 

[spark] branch branch-3.3 updated: [SPARK-39177][SQL] Provide query context on map key not exists error when WSCG is off

2022-05-13 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 1a49de67e3f [SPARK-39177][SQL] Provide query context on map key not 
exists error when WSCG is off
1a49de67e3f is described below

commit 1a49de67e3fa0d25e84540313688cde82d6001df
Author: Gengliang Wang 
AuthorDate: Fri May 13 21:45:06 2022 +0800

[SPARK-39177][SQL] Provide query context on map key not exists error when 
WSCG is off

### What changes were proposed in this pull request?

Similar to https://github.com/apache/spark/pull/36525, this PR provides 
query context for "map key not exists" runtime error when WSCG is off.

### Why are the changes needed?

Enhance the query context for the "map key not exists" runtime error. After 
the changes, it is also available when whole-stage codegen is off.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT

Closes #36538 from gengliangwang/fixMapKeyContext.

Authored-by: Gengliang Wang 
Signed-off-by: Gengliang Wang 
(cherry picked from commit 1afddf407436c3b315ec601fab5a4a1b2028e672)
Signed-off-by: Gengliang Wang 
---
 .../catalyst/expressions/collectionOperations.scala   |  6 ++
 .../catalyst/expressions/complexTypeExtractors.scala  | 13 ++---
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala| 19 +++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
index 1b42ea5eb87..1bd934214f5 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
@@ -2261,6 +2261,12 @@ case class ElementAt(
 
   override protected def withNewChildrenInternal(
 newLeft: Expression, newRight: Expression): ElementAt = copy(left = 
newLeft, right = newRight)
+
+  override def initQueryContext(): String = if (failOnError) {
+origin.context
+  } else {
+""
+  }
 }
 
 /**
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
index 45661c00c51..b84050c1837 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
@@ -339,7 +339,8 @@ trait GetArrayItemUtil {
 /**
  * Common trait for [[GetMapValue]] and [[ElementAt]].
  */
-trait GetMapValueUtil extends BinaryExpression with ImplicitCastInputTypes {
+trait GetMapValueUtil
+  extends BinaryExpression with ImplicitCastInputTypes with 
SupportQueryContext {
 
   // todo: current search is O(n), improve it.
   def getValueEval(
@@ -365,7 +366,7 @@ trait GetMapValueUtil extends BinaryExpression with 
ImplicitCastInputTypes {
 
 if (!found) {
   if (failOnError) {
-throw QueryExecutionErrors.mapKeyNotExistError(ordinal, keyType, 
origin.context)
+throw QueryExecutionErrors.mapKeyNotExistError(ordinal, keyType, 
queryContext)
   } else {
 null
   }
@@ -398,7 +399,7 @@ trait GetMapValueUtil extends BinaryExpression with 
ImplicitCastInputTypes {
 }
 
 val keyJavaType = CodeGenerator.javaType(keyType)
-lazy val errorContext = ctx.addReferenceObj("errCtx", origin.context)
+lazy val errorContext = ctx.addReferenceObj("errCtx", queryContext)
 val keyDt = ctx.addReferenceObj("keyType", keyType, 
keyType.getClass.getName)
 nullSafeCodeGen(ctx, ev, (eval1, eval2) => {
   val keyNotFoundBranch = if (failOnError) {
@@ -488,4 +489,10 @@ case class GetMapValue(
   override protected def withNewChildrenInternal(
   newLeft: Expression, newRight: Expression): GetMapValue =
 copy(child = newLeft, key = newRight)
+
+  override def initQueryContext(): String = if (failOnError) {
+origin.context
+  } else {
+""
+  }
 }
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
index 68db57ea364..21ce009a907 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
@@ -4404,6 +4404,25 @@ class SQLQuerySuite extends QueryTest with 
SharedSparkSession with AdaptiveSpark
 }
   }
 
+  test("SPARK-39177: Query context of getting map value should be serialized 
to executors" +
+" when 

[spark] branch branch-3.3 updated: [SPARK-39164][SQL][3.3] Wrap asserts/illegal state exceptions by the INTERNAL_ERROR exception in actions

2022-05-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 1372f312052 [SPARK-39164][SQL][3.3] Wrap asserts/illegal state 
exceptions by the INTERNAL_ERROR exception in actions
1372f312052 is described below

commit 1372f312052dd0361e371e2ed63436f3e299c617
Author: Max Gekk 
AuthorDate: Fri May 13 16:43:53 2022 +0300

[SPARK-39164][SQL][3.3] Wrap asserts/illegal state exceptions by the 
INTERNAL_ERROR exception in actions

### What changes were proposed in this pull request?
In the PR, I propose to catch `java.lang.IllegalStateException` and 
`java.lang.AssertionError` (raised by asserts), and wrap them in Spark's 
exception w/ the `INTERNAL_ERROR` error class. The modification affects only 
actions so far.

This PR affects the case of a missing bucket file. After the changes, Spark 
throws `SparkException` w/ `INTERNAL_ERROR` instead of `IllegalStateException`. 
Since this is not Spark's illegal state, the exception should be replaced by 
another runtime exception. Created the ticket SPARK-39163 to fix this.

This is a backport of https://github.com/apache/spark/pull/36500.

### Why are the changes needed?
To improve the user experience with Spark SQL and to unify the representation 
of internal errors by using error classes, as for other errors. Usually, users 
shouldn't observe asserts and illegal states, but even if such a situation 
happens, they should see the error in the same way as other errors (w/ the 
error class `INTERNAL_ERROR`).

### Does this PR introduce _any_ user-facing change?
Yes. At least, in one particular case, see the modified test suites and 
SPARK-39163.

### How was this patch tested?
By running the affected test suites:
```
$ build/sbt "test:testOnly *.BucketedReadWithoutHiveSupportSuite"
$ build/sbt "test:testOnly *.AdaptiveQueryExecSuite"
$ build/sbt "test:testOnly *.WholeStageCodegenSuite"
```

Authored-by: Max Gekk 
Signed-off-by: Max Gekk 
(cherry picked from commit f5c3f0c228fef7808d1f927e134595ddd4d31723)
Signed-off-by: Max Gekk 

Closes #36533 from MaxGekk/class-internal-error-3.3.

Authored-by: Max Gekk 
Signed-off-by: Max Gekk 
---
 .../main/scala/org/apache/spark/sql/Dataset.scala   | 21 -
 .../spark/sql/execution/DataSourceScanExec.scala|  1 +
 .../org/apache/spark/sql/execution/subquery.scala   |  1 +
 .../scala/org/apache/spark/sql/SubquerySuite.scala  | 10 ++
 .../sql/execution/WholeStageCodegenSuite.scala  | 14 --
 .../execution/adaptive/AdaptiveQueryExecSuite.scala |  9 ++---
 .../spark/sql/sources/BucketedReadSuite.scala   |  8 +---
 7 files changed, 43 insertions(+), 21 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
index 7d16a2f5eee..56f0e8978ec 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala
@@ -27,7 +27,7 @@ import scala.util.control.NonFatal
 
 import org.apache.commons.lang3.StringUtils
 
-import org.apache.spark.TaskContext
+import org.apache.spark.{SparkException, SparkThrowable, TaskContext}
 import org.apache.spark.annotation.{DeveloperApi, Stable, Unstable}
 import org.apache.spark.api.java.JavaRDD
 import org.apache.spark.api.java.function._
@@ -3848,12 +3848,23 @@ class Dataset[T] private[sql](
 
   /**
* Wrap a Dataset action to track the QueryExecution and time cost, then 
report to the
-   * user-registered callback functions.
+   * user-registered callback functions, and also to convert asserts/illegal 
states to
+   * the internal error exception.
*/
   private def withAction[U](name: String, qe: QueryExecution)(action: 
SparkPlan => U) = {
-SQLExecution.withNewExecutionId(qe, Some(name)) {
-  qe.executedPlan.resetMetrics()
-  action(qe.executedPlan)
+try {
+  SQLExecution.withNewExecutionId(qe, Some(name)) {
+qe.executedPlan.resetMetrics()
+action(qe.executedPlan)
+  }
+} catch {
+  case e: SparkThrowable => throw e
+  case e @ (_: java.lang.IllegalStateException | _: 
java.lang.AssertionError) =>
+throw new SparkException(
+  errorClass = "INTERNAL_ERROR",
+  messageParameters = Array(s"""The "$name" action failed."""),
+  cause = e)
+  case e: Throwable => throw e
 }
   }
 
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
index ac0f3af5725..1ec93a614b7 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala
+++ 

[spark] branch branch-3.3 updated: [SPARK-39165][SQL][3.3] Replace `sys.error` by `IllegalStateException`

2022-05-13 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new c2bd7bac76a [SPARK-39165][SQL][3.3] Replace `sys.error` by 
`IllegalStateException`
c2bd7bac76a is described below

commit c2bd7bac76a5cf7ffc5ef61a1df2b8bb5a72f131
Author: Max Gekk 
AuthorDate: Fri May 13 12:47:53 2022 +0300

[SPARK-39165][SQL][3.3] Replace `sys.error` by `IllegalStateException`

### What changes were proposed in this pull request?
Replace all invocations of `sys.error()` with throwing `IllegalStateException` 
in the `sql` namespace.

This is a backport of https://github.com/apache/spark/pull/36524.

### Why are the changes needed?
In the context of wrapping all internal errors like asserts/illegal-state 
exceptions (see https://github.com/apache/spark/pull/36500), it is impossible 
to distinguish the `RuntimeException` thrown by `sys.error()` from Spark's 
exceptions such as `SparkRuntimeException`. The latter can be propagated to 
user space, but `sys.error` exceptions shouldn't be visible to users in 
regular cases.

### Does this PR introduce _any_ user-facing change?
No, it shouldn't: `sys.error` exceptions should not reach user space in 
regular cases.

### How was this patch tested?
By running the existing test suites.

Authored-by: Max Gekk 
Signed-off-by: Max Gekk 
(cherry picked from commit 95c7efd7571464d8adfb76fb22e47a5816cf73fb)
Signed-off-by: Max Gekk 

Closes #36532 from MaxGekk/sys_error-internal-3.3.

Authored-by: Max Gekk 
Signed-off-by: Max Gekk 
---
 .../scala/org/apache/spark/sql/execution/SparkStrategies.scala| 4 ++--
 .../org/apache/spark/sql/execution/datasources/DataSource.scala   | 8 
 .../sql/execution/datasources/parquet/ParquetWriteSupport.scala   | 3 +--
 .../apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala | 4 ++--
 .../org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala | 5 +++--
 .../scala/org/apache/spark/sql/execution/streaming/memory.scala   | 3 ++-
 .../execution/streaming/sources/TextSocketMicroBatchStream.scala  | 3 ++-
 .../src/main/scala/org/apache/spark/sql/execution/subquery.scala  | 3 ++-
 .../apache/spark/sql/execution/window/AggregateProcessor.scala| 2 +-
 .../org/apache/spark/sql/execution/window/WindowExecBase.scala| 8 
 .../src/main/scala/org/apache/spark/sql/hive/HiveInspectors.scala | 3 ++-
 .../scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala   | 2 +-
 12 files changed, 26 insertions(+), 22 deletions(-)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
index 3b8a70ffe94..17f3cfbda89 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
@@ -503,8 +503,8 @@ abstract class SparkStrategies extends 
QueryPlanner[SparkPlan] {
   
_.aggregateFunction.children.filterNot(_.foldable).toSet).distinct.length > 1) {
   // This is a sanity check. We should not reach here when we have 
multiple distinct
   // column sets. Our `RewriteDistinctAggregates` should take care 
this case.
-  sys.error("You hit a query analyzer bug. Please report your query to 
" +
-  "Spark user mailing list.")
+  throw new IllegalStateException(
+"You hit a query analyzer bug. Please report your query to Spark 
user mailing list.")
 }
 
 // Ideally this should be done in `NormalizeFloatingNumbers`, but we 
do it here because
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
index 2bb3d48c145..143fb4cf960 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
@@ -539,8 +539,8 @@ case class DataSource(
 DataWritingCommand.propogateMetrics(sparkSession.sparkContext, 
resolved, metrics)
 // Replace the schema with that of the DataFrame we just wrote out to 
avoid re-inferring
 copy(userSpecifiedSchema = 
Some(outputColumns.toStructType.asNullable)).resolveRelation()
-  case _ =>
-sys.error(s"${providingClass.getCanonicalName} does not allow create 
table as select.")
+  case _ => throw new IllegalStateException(
+s"${providingClass.getCanonicalName} does not allow create table as 
select.")
 }
   }
 
@@ -556,8 +556,8 @@ case class DataSource(
 disallowWritingIntervals(data.schema.map(_.dataType), 
forbidAnsiIntervals = false)
 

[spark] branch master updated (ab1bceff228 -> cdd33e83c39)

2022-05-13 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from ab1bceff228 [SPARK-39159][SQL] Add new Dataset API for Offset
 add cdd33e83c39 [SPARK-39175][SQL] Provide runtime error query context for 
Cast when WSCG is off

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/expressions/Cast.scala  | 67 --
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 27 -
 2 files changed, 64 insertions(+), 30 deletions(-)





[spark] branch branch-3.3 updated: [SPARK-39175][SQL] Provide runtime error query context for Cast when WSCG is off

2022-05-13 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 27c03e5741a [SPARK-39175][SQL] Provide runtime error query context for 
Cast when WSCG is off
27c03e5741a is described below

commit 27c03e5741af25b7afacac727865e23f60ce61fa
Author: Gengliang Wang 
AuthorDate: Fri May 13 17:46:33 2022 +0800

[SPARK-39175][SQL] Provide runtime error query context for Cast when WSCG 
is off

### What changes were proposed in this pull request?

Similar to https://github.com/apache/spark/pull/36525, this PR provides 
runtime error query context for the Cast expression when WSCG is off.

### Why are the changes needed?

Enhance the runtime error query context of the Cast expression. After the 
changes, it is also available when whole-stage codegen is off.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

UT
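
A reproduction sketch (illustration only; assumes a SparkSession named `spark`):
```scala
// Illustration only; assumes a SparkSession named `spark`.
// With whole-stage codegen off and ANSI mode on, a failing CAST should now also report
// the query context (the fragment of the SQL text containing the CAST).
spark.conf.set("spark.sql.codegen.wholeStage", "false")
spark.conf.set("spark.sql.ansi.enabled", "true")
spark.sql("SELECT CAST('xe23' AS DOUBLE)").collect() // expected to fail with query context
```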

Closes #36535 from gengliangwang/fixCastContext.

Authored-by: Gengliang Wang 
Signed-off-by: Gengliang Wang 
(cherry picked from commit cdd33e83c3919c4475e2e1ef387acb604bea81b9)
Signed-off-by: Gengliang Wang 
---
 .../spark/sql/catalyst/expressions/Cast.scala  | 67 --
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala | 27 -
 2 files changed, 64 insertions(+), 30 deletions(-)

diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
index 335a34514c2..17d571a70f2 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
@@ -277,7 +277,10 @@ object Cast {
   }
 }
 
-abstract class CastBase extends UnaryExpression with TimeZoneAwareExpression 
with NullIntolerant {
+abstract class CastBase extends UnaryExpression
+with TimeZoneAwareExpression
+with NullIntolerant
+with SupportQueryContext {
 
   def child: Expression
 
@@ -307,6 +310,12 @@ abstract class CastBase extends UnaryExpression with 
TimeZoneAwareExpression wit
 
   protected def ansiEnabled: Boolean
 
+  override def initQueryContext(): String = if (ansiEnabled) {
+origin.context
+  } else {
+""
+  }
+
   // When this cast involves TimeZone, it's only resolved if the timeZoneId is 
set;
   // Otherwise behave like Expression.resolved.
   override lazy val resolved: Boolean =
@@ -467,7 +476,7 @@ abstract class CastBase extends UnaryExpression with 
TimeZoneAwareExpression wit
   false
 } else {
   if (ansiEnabled) {
-throw QueryExecutionErrors.invalidInputSyntaxForBooleanError(s, 
origin.context)
+throw QueryExecutionErrors.invalidInputSyntaxForBooleanError(s, 
queryContext)
   } else {
 null
   }
@@ -499,7 +508,7 @@ abstract class CastBase extends UnaryExpression with 
TimeZoneAwareExpression wit
 case StringType =>
   buildCast[UTF8String](_, utfs => {
 if (ansiEnabled) {
-  DateTimeUtils.stringToTimestampAnsi(utfs, zoneId, origin.context)
+  DateTimeUtils.stringToTimestampAnsi(utfs, zoneId, queryContext)
 } else {
   DateTimeUtils.stringToTimestamp(utfs, zoneId).orNull
 }
@@ -524,14 +533,14 @@ abstract class CastBase extends UnaryExpression with 
TimeZoneAwareExpression wit
 // TimestampWritable.doubleToTimestamp
 case DoubleType =>
   if (ansiEnabled) {
-buildCast[Double](_, d => doubleToTimestampAnsi(d, origin.context))
+buildCast[Double](_, d => doubleToTimestampAnsi(d, queryContext))
   } else {
 buildCast[Double](_, d => doubleToTimestamp(d))
   }
 // TimestampWritable.floatToTimestamp
 case FloatType =>
   if (ansiEnabled) {
-buildCast[Float](_, f => doubleToTimestampAnsi(f.toDouble, 
origin.context))
+buildCast[Float](_, f => doubleToTimestampAnsi(f.toDouble, 
queryContext))
   } else {
 buildCast[Float](_, f => doubleToTimestamp(f.toDouble))
   }
@@ -541,7 +550,7 @@ abstract class CastBase extends UnaryExpression with 
TimeZoneAwareExpression wit
 case StringType =>
   buildCast[UTF8String](_, utfs => {
 if (ansiEnabled) {
-  DateTimeUtils.stringToTimestampWithoutTimeZoneAnsi(utfs, 
origin.context)
+  DateTimeUtils.stringToTimestampWithoutTimeZoneAnsi(utfs, 
queryContext)
 } else {
   DateTimeUtils.stringToTimestampWithoutTimeZone(utfs).orNull
 }
@@ -574,7 +583,7 @@ abstract class CastBase extends UnaryExpression with 
TimeZoneAwareExpression wit
   private[this] def castToDate(from: DataType): Any => Any = from match {
 case StringType =>
   if 

[spark] branch master updated (e336567c8a9 -> ab1bceff228)

2022-05-13 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


from e336567c8a9 [SPARK-39166][SQL] Provide runtime error query context for 
binary arithmetic when WSCG is off
 add ab1bceff228 [SPARK-39159][SQL] Add new Dataset API for Offset

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/analysis/CheckAnalysis.scala  |  7 ---
 .../sql/catalyst/analysis/AnalysisErrorSuite.scala |  7 ---
 .../main/scala/org/apache/spark/sql/Dataset.scala  | 10 +
 .../org/apache/spark/sql/DataFrameSuite.scala  | 24 ++
 4 files changed, 34 insertions(+), 14 deletions(-)

