[spark] branch branch-3.2 updated (68d89898a81 -> 0eb0abe804c)
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a change to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git from 68d89898a81 [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity add 0eb0abe804c [SPARK-39237][DOCS][3.2] Update the ANSI SQL mode documentation No new revisions were added by this update. Summary of changes: docs/sql-ref-ansi-compliance.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-39167][SQL] Throw an exception w/ an error class for multiple rows from a subquery used as an expression
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 49562f41678 [SPARK-39167][SQL] Throw an exception w/ an error class for multiple rows from a subquery used as an expression 49562f41678 is described below commit 49562f416788cab05b3f82a2471a1f2f6561a1d8 Author: panbingkun AuthorDate: Sat May 21 07:50:59 2022 +0300 [SPARK-39167][SQL] Throw an exception w/ an error class for multiple rows from a subquery used as an expression ### What changes were proposed in this pull request? In this PR, I propose to use the MULTI_VALUE_SUBQUERY_ERROR error class for multiple rows returned from a subquery used as an expression. ### Why are the changes needed? Porting the execution errors for multiple rows from a subquery used as an expression to the new error framework should improve the user experience with Spark SQL. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added a new test suite Closes #36580 from panbingkun/SPARK-39167. 
Authored-by: panbingkun Signed-off-by: Max Gekk --- core/src/main/resources/error/error-classes.json | 3 +++ .../spark/sql/errors/QueryExecutionErrors.scala| 5 .../org/apache/spark/sql/execution/subquery.scala | 5 ++-- .../scala/org/apache/spark/sql/SubquerySuite.scala | 11 - .../sql/errors/QueryExecutionErrorsSuite.scala | 27 ++ 5 files changed, 37 insertions(+), 14 deletions(-) diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json index 1a139c018e8..f6fba105872 100644 --- a/core/src/main/resources/error/error-classes.json +++ b/core/src/main/resources/error/error-classes.json @@ -160,6 +160,9 @@ "MULTI_UDF_INTERFACE_ERROR" : { "message" : [ "Not allowed to implement multiple UDF interfaces, UDF class " ] }, + "MULTI_VALUE_SUBQUERY_ERROR" : { +"message" : [ "more than one row returned by a subquery used as an expression: " ] + }, "NON_LITERAL_PIVOT_VALUES" : { "message" : [ "Literal expressions required for pivot values, found ''" ], "sqlState" : "42000" diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala index 1e664100545..f79b30f0d0f 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala @@ -2005,4 +2005,9 @@ object QueryExecutionErrors extends QueryErrorsBase { new SparkException(errorClass = "INVALID_BUCKET_FILE", messageParameters = Array(path), cause = null) } + + def multipleRowSubqueryError(plan: String): Throwable = { +new SparkException( + errorClass = "MULTI_VALUE_SUBQUERY_ERROR", messageParameters = Array(plan), cause = null) + } } diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala index 209b0f79243..c6f5983f243 100644 --- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala @@ -25,6 +25,7 @@ import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, ExprCo import org.apache.spark.sql.catalyst.rules.Rule import org.apache.spark.sql.catalyst.trees.{LeafLike, UnaryLike} import org.apache.spark.sql.catalyst.trees.TreePattern._ +import org.apache.spark.sql.errors.QueryExecutionErrors import org.apache.spark.sql.internal.SQLConf import org.apache.spark.sql.types.{BooleanType, DataType} @@ -79,9 +80,7 @@ case class ScalarSubquery( def updateResult(): Unit = { val rows = plan.executeCollect() if (rows.length > 1) { - // TODO(SPARK-39167): Throw an exception w/ an error class for multiple rows from a subquery - throw new IllegalStateException( -s"more than one row returned by a subquery used as an expression:\n$plan") + throw QueryExecutionErrors.multipleRowSubqueryError(plan.toString) } if (rows.length == 1) { assert(rows(0).numFields == 1, diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala index 396fca47634..500913fb289 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala @@ -19,7 +19,6 @@ package org.apache.spark.sql import scala.collection.mutable.ArrayBuffer -import org.apache.spark.SparkException import org.apache.spark.sql.catalyst.expressions.SubqueryExpression import
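For context, here is a minimal Spark SQL snippet that takes this new code path (the table name is hypothetical; against a build with this commit, the second statement would now fail with a `SparkException` carrying the `MULTI_VALUE_SUBQUERY_ERROR` error class rather than the old `IllegalStateException`):

```sql
CREATE TEMPORARY VIEW t AS SELECT * FROM VALUES (1), (2) AS t(a);

-- The scalar subquery returns two rows, so ScalarSubquery.updateResult()
-- throws the new error-class-based exception:
-- "more than one row returned by a subquery used as an expression"
SELECT (SELECT a FROM t) AS scalar;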
[spark] branch master updated: [SPARK-39213][SQL] Create ANY_VALUE aggregate function
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new efc1e8ac8bc [SPARK-39213][SQL] Create ANY_VALUE aggregate function efc1e8ac8bc is described below commit efc1e8ac8bc61872601ac2244629a9d54f8889fb Author: Vitalii Li AuthorDate: Fri May 20 22:28:18 2022 +0300 [SPARK-39213][SQL] Create ANY_VALUE aggregate function ### What changes were proposed in this pull request? Adding implementation for ANY_VALUE aggregate function. During optimization stage it is rewritten to `First` aggregate function. ### Why are the changes needed? This feature provides feature parity with popular DBs and DWHs ### Does this PR introduce _any_ user-facing change? Yes - introducing new aggregate function `ANY_VALUE`. Respective documentation is updated. ### How was this patch tested? Unit tests Closes #36584 from vli-databricks/SPARK-39213. 
Authored-by: Vitalii Li Signed-off-by: Max Gekk --- docs/sql-ref-ansi-compliance.md| 1 + .../spark/sql/catalyst/parser/SqlBaseLexer.g4 | 1 + .../spark/sql/catalyst/parser/SqlBaseParser.g4 | 3 + .../spark/sql/catalyst/analysis/Analyzer.scala | 1 + .../sql/catalyst/analysis/FunctionRegistry.scala | 1 + .../catalyst/expressions/aggregate/AnyValue.scala | 64 +++ .../spark/sql/catalyst/parser/AstBuilder.scala | 10 +- .../spark/sql/catalyst/SQLKeywordSuite.scala | 2 +- .../expressions/aggregate/FirstLastTestSuite.scala | 4 + .../sql-functions/sql-expression-schema.md | 1 + .../resources/sql-tests/inputs/udf/udf-window.sql | 8 +- .../src/test/resources/sql-tests/inputs/window.sql | 29 +- .../sql-tests/results/udf/udf-window.sql.out | 46 +- .../resources/sql-tests/results/window.sql.out | 574 +++-- 14 files changed, 446 insertions(+), 299 deletions(-) diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md index 257f53caef1..bb55cec52f5 100644 --- a/docs/sql-ref-ansi-compliance.md +++ b/docs/sql-ref-ansi-compliance.md @@ -346,6 +346,7 @@ Below is a list of all the keywords in Spark SQL. 
|AND|reserved|non-reserved|reserved| |ANTI|non-reserved|strict-non-reserved|non-reserved| |ANY|reserved|non-reserved|reserved| +|ANY_VALUE|non-reserved|non-reserved|non-reserved| |ARCHIVE|non-reserved|non-reserved|non-reserved| |ARRAY|non-reserved|non-reserved|reserved| |AS|reserved|non-reserved|reserved| diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 index fac87c62de0..1cbd6d24dea 100644 --- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 +++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 @@ -95,6 +95,7 @@ ANALYZE: 'ANALYZE'; AND: 'AND'; ANTI: 'ANTI'; ANY: 'ANY'; +ANY_VALUE: 'ANY_VALUE'; ARCHIVE: 'ARCHIVE'; ARRAY: 'ARRAY'; AS: 'AS'; diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 index ed57e9062c1..ce37a09d5ba 100644 --- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 +++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 @@ -824,6 +824,7 @@ primaryExpression | name=(CAST | TRY_CAST) LEFT_PAREN expression AS dataType RIGHT_PAREN #cast | STRUCT LEFT_PAREN (argument+=namedExpression (COMMA argument+=namedExpression)*)? RIGHT_PAREN #struct | FIRST LEFT_PAREN expression (IGNORE NULLS)? RIGHT_PAREN #first +| ANY_VALUE LEFT_PAREN expression (IGNORE NULLS)? RIGHT_PAREN #any_value | LAST LEFT_PAREN expression (IGNORE NULLS)? 
RIGHT_PAREN #last | POSITION LEFT_PAREN substr=valueExpression IN str=valueExpression RIGHT_PAREN#position | constant #constantDefault @@ -1072,6 +1073,7 @@ ansiNonReserved | ALTER | ANALYZE | ANTI +| ANY_VALUE | ARCHIVE | ARRAY | ASC @@ -1314,6 +1316,7 @@ nonReserved | ANALYZE | AND | ANY +| ANY_VALUE | ARCHIVE | ARRAY | AS diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index 4dd2081c67f..c5bee6f55fe 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++
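As a usage sketch of the new function (table and column names are illustrative only), `ANY_VALUE` returns an arbitrary value from each group and is rewritten to the existing `First` aggregate during optimization, so `IGNORE NULLS` behaves as it does for `FIRST`:

```sql
CREATE TEMPORARY VIEW sales AS
  SELECT * FROM VALUES ('east', 10), ('east', 20), ('west', 5) AS sales(region, amount);

-- Returns some amount per region (non-deterministic which one);
-- with IGNORE NULLS, null inputs are skipped, matching FIRST(expr) IGNORE NULLS.
SELECT region, ANY_VALUE(amount) FROM sales GROUP BY region;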
[spark] branch branch-3.2 updated: [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 68d89898a81 [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity 68d89898a81 is described below commit 68d89898a81c39802b5f5036d8d3690acbb36ef4 Author: Kent Yao AuthorDate: Fri May 20 10:54:53 2022 -0500 [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity ### What changes were proposed in this pull request? unify the hash generator for release files. ### Why are the changes needed? Currently, we use `shasum` for source but `gpg` for binary, since https://github.com/apache/spark/pull/30123 this confuses me when validating the integrities of spark 3.3.0 RC https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc2-bin/ ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? test script manually Closes #36619 from yaooqinn/SPARK-39240. 
Authored-by: Kent Yao Signed-off-by: Sean Owen (cherry picked from commit 3e783375097d14f1c28eb9b0e08075f1f8daa4a2) Signed-off-by: Sean Owen --- dev/create-release/release-build.sh | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh index 96b8d4e3d9c..f49d77be350 100755 --- a/dev/create-release/release-build.sh +++ b/dev/create-release/release-build.sh @@ -283,9 +283,7 @@ if [[ "$1" == "package" ]]; then echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \ --output $R_DIST_NAME.asc \ --detach-sig $R_DIST_NAME - echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \ -SHA512 $R_DIST_NAME > \ -$R_DIST_NAME.sha512 + shasum -a 512 $R_DIST_NAME > $R_DIST_NAME.sha512 fi if [[ -n $PIP_FLAG ]]; then @@ -296,9 +294,7 @@ if [[ "$1" == "package" ]]; then echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \ --output $PYTHON_DIST_NAME.asc \ --detach-sig $PYTHON_DIST_NAME - echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \ -SHA512 $PYTHON_DIST_NAME > \ -$PYTHON_DIST_NAME.sha512 + shasum -a 512 $PYTHON_DIST_NAME > $PYTHON_DIST_NAME.sha512 fi echo "Copying and signing regular binary distribution" @@ -306,9 +302,7 @@ if [[ "$1" == "package" ]]; then echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \ --output spark-$SPARK_VERSION-bin-$NAME.tgz.asc \ --detach-sig spark-$SPARK_VERSION-bin-$NAME.tgz -echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \ - SHA512 spark-$SPARK_VERSION-bin-$NAME.tgz > \ - spark-$SPARK_VERSION-bin-$NAME.tgz.sha512 +shasum -a 512 spark-$SPARK_VERSION-bin-$NAME.tgz > spark-$SPARK_VERSION-bin-$NAME.tgz.sha512 } # List of binary packages built. Populates two associative arrays, where the key is the "name" of - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.3 updated: [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new 3f77be288ff [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity 3f77be288ff is described below commit 3f77be288ffee792ef6bb49c65132f48b269e142 Author: Kent Yao AuthorDate: Fri May 20 10:54:53 2022 -0500 [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity ### What changes were proposed in this pull request? unify the hash generator for release files. ### Why are the changes needed? Currently, we use `shasum` for source but `gpg` for binary, since https://github.com/apache/spark/pull/30123 this confuses me when validating the integrities of spark 3.3.0 RC https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc2-bin/ ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? test script manually Closes #36619 from yaooqinn/SPARK-39240. 
Authored-by: Kent Yao Signed-off-by: Sean Owen (cherry picked from commit 3e783375097d14f1c28eb9b0e08075f1f8daa4a2) Signed-off-by: Sean Owen --- dev/create-release/release-build.sh | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh index a65d02289c0..78fd06ba2be 100755 --- a/dev/create-release/release-build.sh +++ b/dev/create-release/release-build.sh @@ -283,9 +283,7 @@ if [[ "$1" == "package" ]]; then echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \ --output $R_DIST_NAME.asc \ --detach-sig $R_DIST_NAME - echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \ -SHA512 $R_DIST_NAME > \ -$R_DIST_NAME.sha512 + shasum -a 512 $R_DIST_NAME > $R_DIST_NAME.sha512 fi if [[ -n $PIP_FLAG ]]; then @@ -296,9 +294,7 @@ if [[ "$1" == "package" ]]; then echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \ --output $PYTHON_DIST_NAME.asc \ --detach-sig $PYTHON_DIST_NAME - echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \ -SHA512 $PYTHON_DIST_NAME > \ -$PYTHON_DIST_NAME.sha512 + shasum -a 512 $PYTHON_DIST_NAME > $PYTHON_DIST_NAME.sha512 fi echo "Copying and signing regular binary distribution" @@ -306,9 +302,7 @@ if [[ "$1" == "package" ]]; then echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \ --output spark-$SPARK_VERSION-bin-$NAME.tgz.asc \ --detach-sig spark-$SPARK_VERSION-bin-$NAME.tgz -echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \ - SHA512 spark-$SPARK_VERSION-bin-$NAME.tgz > \ - spark-$SPARK_VERSION-bin-$NAME.tgz.sha512 +shasum -a 512 spark-$SPARK_VERSION-bin-$NAME.tgz > spark-$SPARK_VERSION-bin-$NAME.tgz.sha512 } # List of binary packages built. Populates two associative arrays, where the key is the "name" of - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 3e783375097 [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity 3e783375097 is described below commit 3e783375097d14f1c28eb9b0e08075f1f8daa4a2 Author: Kent Yao AuthorDate: Fri May 20 10:54:53 2022 -0500 [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity ### What changes were proposed in this pull request? unify the hash generator for release files. ### Why are the changes needed? Currently, we use `shasum` for source but `gpg` for binary, since https://github.com/apache/spark/pull/30123 this confuses me when validating the integrities of spark 3.3.0 RC https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc2-bin/ ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? test script manually Closes #36619 from yaooqinn/SPARK-39240. 
Authored-by: Kent Yao Signed-off-by: Sean Owen --- dev/create-release/release-build.sh | 12 +++- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh index a65d02289c0..78fd06ba2be 100755 --- a/dev/create-release/release-build.sh +++ b/dev/create-release/release-build.sh @@ -283,9 +283,7 @@ if [[ "$1" == "package" ]]; then echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \ --output $R_DIST_NAME.asc \ --detach-sig $R_DIST_NAME - echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \ -SHA512 $R_DIST_NAME > \ -$R_DIST_NAME.sha512 + shasum -a 512 $R_DIST_NAME > $R_DIST_NAME.sha512 fi if [[ -n $PIP_FLAG ]]; then @@ -296,9 +294,7 @@ if [[ "$1" == "package" ]]; then echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \ --output $PYTHON_DIST_NAME.asc \ --detach-sig $PYTHON_DIST_NAME - echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \ -SHA512 $PYTHON_DIST_NAME > \ -$PYTHON_DIST_NAME.sha512 + shasum -a 512 $PYTHON_DIST_NAME > $PYTHON_DIST_NAME.sha512 fi echo "Copying and signing regular binary distribution" @@ -306,9 +302,7 @@ if [[ "$1" == "package" ]]; then echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \ --output spark-$SPARK_VERSION-bin-$NAME.tgz.asc \ --detach-sig spark-$SPARK_VERSION-bin-$NAME.tgz -echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \ - SHA512 spark-$SPARK_VERSION-bin-$NAME.tgz > \ - spark-$SPARK_VERSION-bin-$NAME.tgz.sha512 +shasum -a 512 spark-$SPARK_VERSION-bin-$NAME.tgz > spark-$SPARK_VERSION-bin-$NAME.tgz.sha512 } # List of binary packages built. Populates two associative arrays, where the key is the "name" of - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark-website] branch asf-site updated: Add info for verification of releases
This is an automated email from the ASF dual-hosted git repository. srowen pushed a commit to branch asf-site in repository https://gitbox.apache.org/repos/asf/spark-website.git The following commit(s) were added to refs/heads/asf-site by this push: new d717e4d3b Add info for verification of releases d717e4d3b is described below commit d717e4d3b5fdb3bc101fff8de68280faa04d3456 Author: Kent Yao AuthorDate: Fri May 20 10:35:49 2022 -0500 Add info for verification of releases Author: Kent Yao Closes #390 from yaooqinn/verify. --- downloads.md| 2 +- site/downloads.html | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/downloads.md b/downloads.md index bf5a98ef0..fb3347fa8 100644 --- a/downloads.md +++ b/downloads.md @@ -26,7 +26,7 @@ window.onload = function () { 3. Download Spark: -4. Verify this release using the and [project release KEYS](https://downloads.apache.org/spark/KEYS). +4. Verify this release using the and [project release KEYS](https://downloads.apache.org/spark/KEYS) by following these [procedures](https://www.apache.org/info/verification.html). Note that Spark 3 is pre-built with Scala 2.12 in general and Spark 3.2+ provides additional pre-built distribution with Scala 2.13. diff --git a/site/downloads.html b/site/downloads.html index 097736b8e..76b02d590 100644 --- a/site/downloads.html +++ b/site/downloads.html @@ -146,7 +146,7 @@ window.onload = function () { Download Spark: -Verify this release using the and <a href="https://downloads.apache.org/spark/KEYS">project release KEYS</a>. +Verify this release using the and <a href="https://downloads.apache.org/spark/KEYS">project release KEYS</a> by following these <a href="https://www.apache.org/info/verification.html">procedures</a>. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] srowen closed pull request #390: Add info for verification of releases
srowen closed pull request #390: Add info for verification of releases URL: https://github.com/apache/spark-website/pull/390
[GitHub] [spark-website] yaooqinn opened a new pull request, #390: Add info for verification of releases
yaooqinn opened a new pull request, #390: URL: https://github.com/apache/spark-website/pull/390
[spark] branch branch-3.3 updated: [SPARK-39237][DOCS] Update the ANSI SQL mode documentation
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch branch-3.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.3 by this push: new ab057c72509 [SPARK-39237][DOCS] Update the ANSI SQL mode documentation ab057c72509 is described below commit ab057c72509006cbd8b501b6be4eb26793dc1e71 Author: Gengliang Wang AuthorDate: Fri May 20 16:58:22 2022 +0800 [SPARK-39237][DOCS] Update the ANSI SQL mode documentation ### What changes were proposed in this pull request? 1. Remove the Experimental notation in the ANSI SQL compliance doc 2. Update the description of `spark.sql.ansi.enabled`, since the ANSI reserved keyword enforcement is disabled by default now ### Why are the changes needed? 1. The ANSI SQL dialect was GAed in the Spark 3.2 release: https://spark.apache.org/releases/spark-release-3-2-0.html We should not mark it as "Experimental" in the doc. 2. The ANSI reserved keyword enforcement is disabled by default now ### Does this PR introduce _any_ user-facing change? No, just a doc change ### How was this patch tested? Doc preview: https://user-images.githubusercontent.com/1097932/169444094-de9c33c2-1b01-4fc3-b583-b752c71e16d8.png https://user-images.githubusercontent.com/1097932/169472239-1edf218f-1f7b-48ec-bf2a-5d043600f1bc.png Closes #36614 from gengliangwang/updateAnsiDoc. Authored-by: Gengliang Wang Signed-off-by: Gengliang Wang (cherry picked from commit 86a351c13d62644d596cc5249fc1c45d318a0bbf) Signed-off-by: Gengliang Wang --- docs/sql-ref-ansi-compliance.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md index 94ef94a5e7b..c4572c71f4a 100644 --- a/docs/sql-ref-ansi-compliance.md +++ b/docs/sql-ref-ansi-compliance.md @@ -19,7 +19,7 @@ license: | limitations under the License. 
--- -Since Spark 3.0, Spark SQL introduces two experimental options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (See a table below for details). +In Spark SQL, there are two options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (See a table below for details). When `spark.sql.ansi.enabled` is set to `true`, Spark SQL uses an ANSI compliant dialect instead of being Hive compliant. For example, Spark will throw an exception at runtime instead of returning null results if the inputs to a SQL operator/function are invalid. Some ANSI dialect features may be not from the ANSI SQL standard directly, but their behaviors align with ANSI SQL's style. @@ -28,10 +28,10 @@ The casting behaviours are defined as store assignment rules in the standard. When `spark.sql.storeAssignmentPolicy` is set to `ANSI`, Spark SQL complies with the ANSI store assignment rules. This is a separate configuration because its default value is `ANSI`, while the configuration `spark.sql.ansi.enabled` is disabled by default. -|Property Name|Default|Meaning|Since Version| -|-|---|---|-| -|`spark.sql.ansi.enabled`|false|(Experimental) When true, Spark tries to conform to the ANSI SQL specification: 1. Spark will throw a runtime exception if an overflow occurs in any operation on integral/decimal field. 2. Spark will forbid using the reserved keywords of ANSI SQL as identifiers in the SQL parser.|3.0.0| -|`spark.sql.storeAssignmentPolicy`|ANSI|(Experimental) When inserting a value into a column with different data type, Spark will perform type conversion. Currently, we support 3 policies for the type coercion rules: ANSI, legacy and strict. With ANSI policy, Spark performs the type coercion as per ANSI SQL. In practice, the behavior is mostly the same as PostgreSQL. It disallows certain unreasonable type conversions such as converting string to int or double to boolean. With legacy po [...] 
+|Property Name|Default| Meaning [...]
[spark] branch master updated: [SPARK-39237][DOCS] Update the ANSI SQL mode documentation
This is an automated email from the ASF dual-hosted git repository. gengliang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 86a351c13d6 [SPARK-39237][DOCS] Update the ANSI SQL mode documentation 86a351c13d6 is described below commit 86a351c13d62644d596cc5249fc1c45d318a0bbf Author: Gengliang Wang AuthorDate: Fri May 20 16:58:22 2022 +0800 [SPARK-39237][DOCS] Update the ANSI SQL mode documentation ### What changes were proposed in this pull request? 1. Remove the Experimental notation in the ANSI SQL compliance doc 2. Update the description of `spark.sql.ansi.enabled`, since the ANSI reserved keyword enforcement is disabled by default now ### Why are the changes needed? 1. The ANSI SQL dialect was GAed in the Spark 3.2 release: https://spark.apache.org/releases/spark-release-3-2-0.html We should not mark it as "Experimental" in the doc. 2. The ANSI reserved keyword enforcement is disabled by default now ### Does this PR introduce _any_ user-facing change? No, just a doc change ### How was this patch tested? Doc preview: https://user-images.githubusercontent.com/1097932/169444094-de9c33c2-1b01-4fc3-b583-b752c71e16d8.png https://user-images.githubusercontent.com/1097932/169472239-1edf218f-1f7b-48ec-bf2a-5d043600f1bc.png Closes #36614 from gengliangwang/updateAnsiDoc. Authored-by: Gengliang Wang Signed-off-by: Gengliang Wang --- docs/sql-ref-ansi-compliance.md | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md index f06a0415d2a..257f53caef1 100644 --- a/docs/sql-ref-ansi-compliance.md +++ b/docs/sql-ref-ansi-compliance.md @@ -19,7 +19,7 @@ license: | limitations under the License. --- -Since Spark 3.0, Spark SQL introduces two experimental options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (See a table below for details). 
+In Spark SQL, there are two options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (See a table below for details). When `spark.sql.ansi.enabled` is set to `true`, Spark SQL uses an ANSI compliant dialect instead of being Hive compliant. For example, Spark will throw an exception at runtime instead of returning null results if the inputs to a SQL operator/function are invalid. Some ANSI dialect features may be not from the ANSI SQL standard directly, but their behaviors align with ANSI SQL's style. @@ -28,10 +28,10 @@ The casting behaviours are defined as store assignment rules in the standard. When `spark.sql.storeAssignmentPolicy` is set to `ANSI`, Spark SQL complies with the ANSI store assignment rules. This is a separate configuration because its default value is `ANSI`, while the configuration `spark.sql.ansi.enabled` is disabled by default. -|Property Name|Default|Meaning|Since Version| -|-|---|---|-| -|`spark.sql.ansi.enabled`|false|(Experimental) When true, Spark tries to conform to the ANSI SQL specification: 1. Spark will throw a runtime exception if an overflow occurs in any operation on integral/decimal field. 2. Spark will forbid using the reserved keywords of ANSI SQL as identifiers in the SQL parser.|3.0.0| -|`spark.sql.storeAssignmentPolicy`|ANSI|(Experimental) When inserting a value into a column with different data type, Spark will perform type conversion. Currently, we support 3 policies for the type coercion rules: ANSI, legacy and strict. With ANSI policy, Spark performs the type coercion as per ANSI SQL. In practice, the behavior is mostly the same as PostgreSQL. It disallows certain unreasonable type conversions such as converting string to int or double to boolean. With legacy po [...] +|Property Name|Default| Meaning [...] +|-|---|- [...]
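The runtime-exception behavior the updated documentation describes can be seen with, for example (illustrative only):

```sql
SET spark.sql.ansi.enabled = true;

-- Under ANSI mode this integer overflow raises a runtime exception;
-- with the flag off (the default), the result silently wraps around.
SELECT CAST(2147483647 AS INT) + 1;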