[spark] branch branch-3.2 updated (68d89898a81 -> 0eb0abe804c)

2022-05-20 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a change to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


from 68d89898a81 [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity
 add 0eb0abe804c [SPARK-39237][DOCS][3.2] Update the ANSI SQL mode documentation

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-ansi-compliance.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-39167][SQL] Throw an exception w/ an error class for multiple rows from a subquery used as an expression

2022-05-20 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 49562f41678 [SPARK-39167][SQL] Throw an exception w/ an error class for multiple rows from a subquery used as an expression
49562f41678 is described below

commit 49562f416788cab05b3f82a2471a1f2f6561a1d8
Author: panbingkun 
AuthorDate: Sat May 21 07:50:59 2022 +0300

[SPARK-39167][SQL] Throw an exception w/ an error class for multiple rows from a subquery used as an expression

### What changes were proposed in this pull request?
In this PR, I propose to use the MULTI_VALUE_SUBQUERY_ERROR error class when multiple rows are returned by a subquery used as an expression.

### Why are the changes needed?
Porting the execution errors for multiple rows returned by a subquery used as an expression to the new error framework should improve the user experience with Spark SQL.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added a new test suite.
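
To illustrate the user-visible change, a hypothetical spark-shell sketch (assumes a SparkSession named `spark`; the query itself is illustrative only):

    // A scalar subquery used as an expression must return at most one row; here
    // range(2) returns two rows, so execution fails. After this change the failure
    // is a SparkException tagged with the MULTI_VALUE_SUBQUERY_ERROR error class
    // instead of a plain IllegalStateException.
    spark.sql("SELECT (SELECT id FROM range(2)) AS v").collect()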

Closes #36580 from panbingkun/SPARK-39167.

Authored-by: panbingkun 
Signed-off-by: Max Gekk 
---
 core/src/main/resources/error/error-classes.json   |  3 +++
 .../spark/sql/errors/QueryExecutionErrors.scala|  5 
 .../org/apache/spark/sql/execution/subquery.scala  |  5 ++--
 .../scala/org/apache/spark/sql/SubquerySuite.scala | 11 -
 .../sql/errors/QueryExecutionErrorsSuite.scala | 27 ++
 5 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/core/src/main/resources/error/error-classes.json b/core/src/main/resources/error/error-classes.json
index 1a139c018e8..f6fba105872 100644
--- a/core/src/main/resources/error/error-classes.json
+++ b/core/src/main/resources/error/error-classes.json
@@ -160,6 +160,9 @@
   "MULTI_UDF_INTERFACE_ERROR" : {
     "message" : [ "Not allowed to implement multiple UDF interfaces, UDF class " ]
   },
+  "MULTI_VALUE_SUBQUERY_ERROR" : {
+    "message" : [ "more than one row returned by a subquery used as an expression: " ]
+  },
   "NON_LITERAL_PIVOT_VALUES" : {
     "message" : [ "Literal expressions required for pivot values, found ''" ],
     "sqlState" : "42000"
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
index 1e664100545..f79b30f0d0f 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala
@@ -2005,4 +2005,9 @@ object QueryExecutionErrors extends QueryErrorsBase {
     new SparkException(errorClass = "INVALID_BUCKET_FILE", messageParameters = Array(path),
       cause = null)
   }
+
+  def multipleRowSubqueryError(plan: String): Throwable = {
+    new SparkException(
+      errorClass = "MULTI_VALUE_SUBQUERY_ERROR", messageParameters = Array(plan), cause = null)
+  }
 }
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala
index 209b0f79243..c6f5983f243 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/subquery.scala
@@ -25,6 +25,7 @@ import org.apache.spark.sql.catalyst.expressions.codegen.{CodegenContext, ExprCo
 import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.catalyst.trees.{LeafLike, UnaryLike}
 import org.apache.spark.sql.catalyst.trees.TreePattern._
+import org.apache.spark.sql.errors.QueryExecutionErrors
 import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types.{BooleanType, DataType}
 
@@ -79,9 +80,7 @@ case class ScalarSubquery(
   def updateResult(): Unit = {
     val rows = plan.executeCollect()
     if (rows.length > 1) {
-      // TODO(SPARK-39167): Throw an exception w/ an error class for multiple rows from a subquery
-      throw new IllegalStateException(
-        s"more than one row returned by a subquery used as an expression:\n$plan")
+      throw QueryExecutionErrors.multipleRowSubqueryError(plan.toString)
     }
     if (rows.length == 1) {
       assert(rows(0).numFields == 1,
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
index 396fca47634..500913fb289 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala
@@ -19,7 +19,6 @@ package org.apache.spark.sql
 
 import scala.collection.mutable.ArrayBuffer
 
-import org.apache.spark.SparkException
 import org.apache.spark.sql.catalyst.expressions.SubqueryExpression
 import 

[spark] branch master updated: [SPARK-39213][SQL] Create ANY_VALUE aggregate function

2022-05-20 Thread maxgekk
This is an automated email from the ASF dual-hosted git repository.

maxgekk pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new efc1e8ac8bc [SPARK-39213][SQL] Create ANY_VALUE aggregate function
efc1e8ac8bc is described below

commit efc1e8ac8bc61872601ac2244629a9d54f8889fb
Author: Vitalii Li 
AuthorDate: Fri May 20 22:28:18 2022 +0300

[SPARK-39213][SQL] Create ANY_VALUE aggregate function

### What changes were proposed in this pull request?

Adds an implementation of the ANY_VALUE aggregate function. During the optimization stage it is rewritten to the `First` aggregate function.

### Why are the changes needed?

This provides feature parity with popular databases and data warehouses.

### Does this PR introduce _any_ user-facing change?

Yes, this introduces the new aggregate function `ANY_VALUE`. The documentation is updated accordingly.

### How was this patch tested?

Unit tests
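
A hedged usage sketch in spark-shell (assumes a SparkSession named `spark`; the `IGNORE NULLS` form follows the new parser rule added below, and the value picked per group is intentionally unspecified):

    // any_value(v) returns an arbitrary value of v for each group; the
    // IGNORE NULLS variant skips nulls. Internally the expression is rewritten
    // to the existing First aggregate during the optimization stage.
    spark.sql(
      """SELECT k, any_value(v), any_value(v IGNORE NULLS)
        |FROM VALUES ('a', NULL), ('a', 5), ('b', 20) AS t(k, v)
        |GROUP BY k""".stripMargin
    ).show()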

Closes #36584 from vli-databricks/SPARK-39213.

Authored-by: Vitalii Li 
Signed-off-by: Max Gekk 
---
 docs/sql-ref-ansi-compliance.md|   1 +
 .../spark/sql/catalyst/parser/SqlBaseLexer.g4  |   1 +
 .../spark/sql/catalyst/parser/SqlBaseParser.g4 |   3 +
 .../spark/sql/catalyst/analysis/Analyzer.scala |   1 +
 .../sql/catalyst/analysis/FunctionRegistry.scala   |   1 +
 .../catalyst/expressions/aggregate/AnyValue.scala  |  64 +++
 .../spark/sql/catalyst/parser/AstBuilder.scala |  10 +-
 .../spark/sql/catalyst/SQLKeywordSuite.scala   |   2 +-
 .../expressions/aggregate/FirstLastTestSuite.scala |   4 +
 .../sql-functions/sql-expression-schema.md |   1 +
 .../resources/sql-tests/inputs/udf/udf-window.sql  |   8 +-
 .../src/test/resources/sql-tests/inputs/window.sql |  29 +-
 .../sql-tests/results/udf/udf-window.sql.out   |  46 +-
 .../resources/sql-tests/results/window.sql.out | 574 +++--
 14 files changed, 446 insertions(+), 299 deletions(-)

diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index 257f53caef1..bb55cec52f5 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -346,6 +346,7 @@ Below is a list of all the keywords in Spark SQL.
 |AND|reserved|non-reserved|reserved|
 |ANTI|non-reserved|strict-non-reserved|non-reserved|
 |ANY|reserved|non-reserved|reserved|
+|ANY_VALUE|non-reserved|non-reserved|non-reserved|
 |ARCHIVE|non-reserved|non-reserved|non-reserved|
 |ARRAY|non-reserved|non-reserved|reserved|
 |AS|reserved|non-reserved|reserved|
diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4
index fac87c62de0..1cbd6d24dea 100644
--- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4
+++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseLexer.g4
@@ -95,6 +95,7 @@ ANALYZE: 'ANALYZE';
 AND: 'AND';
 ANTI: 'ANTI';
 ANY: 'ANY';
+ANY_VALUE: 'ANY_VALUE';
 ARCHIVE: 'ARCHIVE';
 ARRAY: 'ARRAY';
 AS: 'AS';
diff --git a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4 b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
index ed57e9062c1..ce37a09d5ba 100644
--- a/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
+++ b/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBaseParser.g4
@@ -824,6 +824,7 @@ primaryExpression
     | name=(CAST | TRY_CAST) LEFT_PAREN expression AS dataType RIGHT_PAREN                    #cast
     | STRUCT LEFT_PAREN (argument+=namedExpression (COMMA argument+=namedExpression)*)? RIGHT_PAREN #struct
     | FIRST LEFT_PAREN expression (IGNORE NULLS)? RIGHT_PAREN                                 #first
+    | ANY_VALUE LEFT_PAREN expression (IGNORE NULLS)? RIGHT_PAREN                             #any_value
    | LAST LEFT_PAREN expression (IGNORE NULLS)? RIGHT_PAREN                                  #last
     | POSITION LEFT_PAREN substr=valueExpression IN str=valueExpression RIGHT_PAREN           #position
     | constant                                                                                #constantDefault
@@ -1072,6 +1073,7 @@ ansiNonReserved
 | ALTER
 | ANALYZE
 | ANTI
+| ANY_VALUE
 | ARCHIVE
 | ARRAY
 | ASC
@@ -1314,6 +1316,7 @@ nonReserved
 | ANALYZE
 | AND
 | ANY
+| ANY_VALUE
 | ARCHIVE
 | ARRAY
 | AS
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index 4dd2081c67f..c5bee6f55fe 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ 

[spark] branch branch-3.2 updated: [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity

2022-05-20 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 68d89898a81 [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity
68d89898a81 is described below

commit 68d89898a81c39802b5f5036d8d3690acbb36ef4
Author: Kent Yao 
AuthorDate: Fri May 20 10:54:53 2022 -0500

[SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity

### What changes were proposed in this pull request?

Unify the hash generator for the release files.

### Why are the changes needed?

Currently, we use `shasum` for the source release but `gpg` for the binary releases (since https://github.com/apache/spark/pull/30123). This is confusing when validating the integrity of the Spark 3.3.0 RC artifacts at https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc2-bin/.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Tested the release script manually.

Closes #36619 from yaooqinn/SPARK-39240.

Authored-by: Kent Yao 
Signed-off-by: Sean Owen 
(cherry picked from commit 3e783375097d14f1c28eb9b0e08075f1f8daa4a2)
Signed-off-by: Sean Owen 
---
 dev/create-release/release-build.sh | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh
index 96b8d4e3d9c..f49d77be350 100755
--- a/dev/create-release/release-build.sh
+++ b/dev/create-release/release-build.sh
@@ -283,9 +283,7 @@ if [[ "$1" == "package" ]]; then
   echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
 --output $R_DIST_NAME.asc \
 --detach-sig $R_DIST_NAME
-  echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-SHA512 $R_DIST_NAME > \
-$R_DIST_NAME.sha512
+  shasum -a 512 $R_DIST_NAME > $R_DIST_NAME.sha512
 fi
 
 if [[ -n $PIP_FLAG ]]; then
@@ -296,9 +294,7 @@ if [[ "$1" == "package" ]]; then
   echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
 --output $PYTHON_DIST_NAME.asc \
 --detach-sig $PYTHON_DIST_NAME
-  echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-SHA512 $PYTHON_DIST_NAME > \
-$PYTHON_DIST_NAME.sha512
+  shasum -a 512 $PYTHON_DIST_NAME > $PYTHON_DIST_NAME.sha512
 fi
 
 echo "Copying and signing regular binary distribution"
@@ -306,9 +302,7 @@ if [[ "$1" == "package" ]]; then
 echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
   --output spark-$SPARK_VERSION-bin-$NAME.tgz.asc \
   --detach-sig spark-$SPARK_VERSION-bin-$NAME.tgz
-echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-  SHA512 spark-$SPARK_VERSION-bin-$NAME.tgz > \
-  spark-$SPARK_VERSION-bin-$NAME.tgz.sha512
+shasum -a 512 spark-$SPARK_VERSION-bin-$NAME.tgz > spark-$SPARK_VERSION-bin-$NAME.tgz.sha512
   }
 
   # List of binary packages built. Populates two associative arrays, where the 
key is the "name" of





[spark] branch branch-3.3 updated: [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity

2022-05-20 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new 3f77be288ff [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity
3f77be288ff is described below

commit 3f77be288ffee792ef6bb49c65132f48b269e142
Author: Kent Yao 
AuthorDate: Fri May 20 10:54:53 2022 -0500

[SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity

### What changes were proposed in this pull request?

Unify the hash generator for the release files.

### Why are the changes needed?

Currently, we use `shasum` for the source release but `gpg` for the binary releases (since https://github.com/apache/spark/pull/30123). This is confusing when validating the integrity of the Spark 3.3.0 RC artifacts at https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc2-bin/.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Tested the release script manually.

Closes #36619 from yaooqinn/SPARK-39240.

Authored-by: Kent Yao 
Signed-off-by: Sean Owen 
(cherry picked from commit 3e783375097d14f1c28eb9b0e08075f1f8daa4a2)
Signed-off-by: Sean Owen 
---
 dev/create-release/release-build.sh | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh
index a65d02289c0..78fd06ba2be 100755
--- a/dev/create-release/release-build.sh
+++ b/dev/create-release/release-build.sh
@@ -283,9 +283,7 @@ if [[ "$1" == "package" ]]; then
   echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
 --output $R_DIST_NAME.asc \
 --detach-sig $R_DIST_NAME
-  echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-SHA512 $R_DIST_NAME > \
-$R_DIST_NAME.sha512
+  shasum -a 512 $R_DIST_NAME > $R_DIST_NAME.sha512
 fi
 
 if [[ -n $PIP_FLAG ]]; then
@@ -296,9 +294,7 @@ if [[ "$1" == "package" ]]; then
   echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
 --output $PYTHON_DIST_NAME.asc \
 --detach-sig $PYTHON_DIST_NAME
-  echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-SHA512 $PYTHON_DIST_NAME > \
-$PYTHON_DIST_NAME.sha512
+  shasum -a 512 $PYTHON_DIST_NAME > $PYTHON_DIST_NAME.sha512
 fi
 
 echo "Copying and signing regular binary distribution"
@@ -306,9 +302,7 @@ if [[ "$1" == "package" ]]; then
 echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
   --output spark-$SPARK_VERSION-bin-$NAME.tgz.asc \
   --detach-sig spark-$SPARK_VERSION-bin-$NAME.tgz
-echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-  SHA512 spark-$SPARK_VERSION-bin-$NAME.tgz > \
-  spark-$SPARK_VERSION-bin-$NAME.tgz.sha512
+shasum -a 512 spark-$SPARK_VERSION-bin-$NAME.tgz > spark-$SPARK_VERSION-bin-$NAME.tgz.sha512
   }
 
   # List of binary packages built. Populates two associative arrays, where the 
key is the "name" of





[spark] branch master updated: [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity

2022-05-20 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 3e783375097 [SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity
3e783375097 is described below

commit 3e783375097d14f1c28eb9b0e08075f1f8daa4a2
Author: Kent Yao 
AuthorDate: Fri May 20 10:54:53 2022 -0500

[SPARK-39240][INFRA][BUILD] Source and binary releases using different tool to generate hashes for integrity

### What changes were proposed in this pull request?

Unify the hash generator for the release files.

### Why are the changes needed?

Currently, we use `shasum` for the source release but `gpg` for the binary releases (since https://github.com/apache/spark/pull/30123). This is confusing when validating the integrity of the Spark 3.3.0 RC artifacts at https://dist.apache.org/repos/dist/dev/spark/v3.3.0-rc2-bin/.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Tested the release script manually.

Closes #36619 from yaooqinn/SPARK-39240.

Authored-by: Kent Yao 
Signed-off-by: Sean Owen 
---
 dev/create-release/release-build.sh | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/dev/create-release/release-build.sh b/dev/create-release/release-build.sh
index a65d02289c0..78fd06ba2be 100755
--- a/dev/create-release/release-build.sh
+++ b/dev/create-release/release-build.sh
@@ -283,9 +283,7 @@ if [[ "$1" == "package" ]]; then
   echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
 --output $R_DIST_NAME.asc \
 --detach-sig $R_DIST_NAME
-  echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-SHA512 $R_DIST_NAME > \
-$R_DIST_NAME.sha512
+  shasum -a 512 $R_DIST_NAME > $R_DIST_NAME.sha512
 fi
 
 if [[ -n $PIP_FLAG ]]; then
@@ -296,9 +294,7 @@ if [[ "$1" == "package" ]]; then
   echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
 --output $PYTHON_DIST_NAME.asc \
 --detach-sig $PYTHON_DIST_NAME
-  echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-SHA512 $PYTHON_DIST_NAME > \
-$PYTHON_DIST_NAME.sha512
+  shasum -a 512 $PYTHON_DIST_NAME > $PYTHON_DIST_NAME.sha512
 fi
 
 echo "Copying and signing regular binary distribution"
@@ -306,9 +302,7 @@ if [[ "$1" == "package" ]]; then
 echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --armour \
   --output spark-$SPARK_VERSION-bin-$NAME.tgz.asc \
   --detach-sig spark-$SPARK_VERSION-bin-$NAME.tgz
-echo $GPG_PASSPHRASE | $GPG --passphrase-fd 0 --print-md \
-  SHA512 spark-$SPARK_VERSION-bin-$NAME.tgz > \
-  spark-$SPARK_VERSION-bin-$NAME.tgz.sha512
+shasum -a 512 spark-$SPARK_VERSION-bin-$NAME.tgz > spark-$SPARK_VERSION-bin-$NAME.tgz.sha512
   }
 
   # List of binary packages built. Populates two associative arrays, where the 
key is the "name" of





[spark-website] branch asf-site updated: Add info for verification of releases

2022-05-20 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
 new d717e4d3b Add info for verification of releases
d717e4d3b is described below

commit d717e4d3b5fdb3bc101fff8de68280faa04d3456
Author: Kent Yao 
AuthorDate: Fri May 20 10:35:49 2022 -0500

Add info for verification of releases



Author: Kent Yao 

Closes #390 from yaooqinn/verify.
---
 downloads.md| 2 +-
 site/downloads.html | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/downloads.md b/downloads.md
index bf5a98ef0..fb3347fa8 100644
--- a/downloads.md
+++ b/downloads.md
@@ -26,7 +26,7 @@ window.onload = function () {
 
 3. Download Spark: 
 
-4. Verify this release using the  and [project release KEYS](https://downloads.apache.org/spark/KEYS).
+4. Verify this release using the  and [project release KEYS](https://downloads.apache.org/spark/KEYS) by following these [procedures](https://www.apache.org/info/verification.html).
 
 Note that Spark 3 is pre-built with Scala 2.12 in general and Spark 3.2+ provides additional pre-built distribution with Scala 2.13.
 
diff --git a/site/downloads.html b/site/downloads.html
index 097736b8e..76b02d590 100644
--- a/site/downloads.html
+++ b/site/downloads.html
@@ -146,7 +146,7 @@ window.onload = function () {
 Download Spark: 
   
   
-Verify this release using the  and <a href="https://downloads.apache.org/spark/KEYS">project release KEYS</a>.
+Verify this release using the  and <a href="https://downloads.apache.org/spark/KEYS">project release KEYS</a> by following these <a href="https://www.apache.org/info/verification.html">procedures</a>.
   
 
 





[GitHub] [spark-website] srowen closed pull request #390: Add info for verification of releases

2022-05-20 Thread GitBox


srowen closed pull request #390: Add info for verification of releases
URL: https://github.com/apache/spark-website/pull/390


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





[GitHub] [spark-website] yaooqinn opened a new pull request, #390: Add info for verification of releases

2022-05-20 Thread GitBox


yaooqinn opened a new pull request, #390:
URL: https://github.com/apache/spark-website/pull/390

   
   





[spark] branch branch-3.3 updated: [SPARK-39237][DOCS] Update the ANSI SQL mode documentation

2022-05-20 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.3 by this push:
 new ab057c72509 [SPARK-39237][DOCS] Update the ANSI SQL mode documentation
ab057c72509 is described below

commit ab057c72509006cbd8b501b6be4eb26793dc1e71
Author: Gengliang Wang 
AuthorDate: Fri May 20 16:58:22 2022 +0800

[SPARK-39237][DOCS] Update the ANSI SQL mode documentation

### What changes were proposed in this pull request?

1. Remove the "Experimental" notation in the ANSI SQL compliance doc
2. Update the description of `spark.sql.ansi.enabled`, since enforcement of the ANSI reserved keywords is disabled by default now

### Why are the changes needed?

1. The ANSI SQL dialect became GA in the Spark 3.2 release (https://spark.apache.org/releases/spark-release-3-2-0.html), so we should no longer mark it as "Experimental" in the doc.
2. Enforcement of the ANSI reserved keywords is disabled by default now.

### Does this PR introduce _any_ user-facing change?

No, just a doc change.

### How was this patch tested?

Doc preview:
https://user-images.githubusercontent.com/1097932/169444094-de9c33c2-1b01-4fc3-b583-b752c71e16d8.png

https://user-images.githubusercontent.com/1097932/169472239-1edf218f-1f7b-48ec-bf2a-5d043600f1bc.png
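
The runtime behavior the updated page describes can be sketched in spark-shell (a minimal, hedged example assuming a SparkSession named `spark`):

    // With ANSI mode off (the default), integer overflow silently wraps around;
    // with ANSI mode on, the same query fails at runtime instead of returning a
    // wrong or null result.
    spark.conf.set("spark.sql.ansi.enabled", "false")
    spark.sql("SELECT 2147483647 + 1 AS v").collect()   // Array([-2147483648]), overflow wraps
    spark.conf.set("spark.sql.ansi.enabled", "true")
    spark.sql("SELECT 2147483647 + 1 AS v").collect()   // throws an arithmetic overflow error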

Closes #36614 from gengliangwang/updateAnsiDoc.

Authored-by: Gengliang Wang 
Signed-off-by: Gengliang Wang 
(cherry picked from commit 86a351c13d62644d596cc5249fc1c45d318a0bbf)
Signed-off-by: Gengliang Wang 
---
 docs/sql-ref-ansi-compliance.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index 94ef94a5e7b..c4572c71f4a 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -19,7 +19,7 @@ license: |
   limitations under the License.
 ---
 
-Since Spark 3.0, Spark SQL introduces two experimental options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (See a table below for details).
+In Spark SQL, there are two options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (See a table below for details).
 
 When `spark.sql.ansi.enabled` is set to `true`, Spark SQL uses an ANSI compliant dialect instead of being Hive compliant. For example, Spark will throw an exception at runtime instead of returning null results if the inputs to a SQL operator/function are invalid. Some ANSI dialect features may be not from the ANSI SQL standard directly, but their behaviors align with ANSI SQL's style.
 
@@ -28,10 +28,10 @@ The casting behaviours are defined as store assignment rules in the standard.
 
 When `spark.sql.storeAssignmentPolicy` is set to `ANSI`, Spark SQL complies with the ANSI store assignment rules. This is a separate configuration because its default value is `ANSI`, while the configuration `spark.sql.ansi.enabled` is disabled by default.
 
-|Property Name|Default|Meaning|Since Version|
-|-|---|---|-|
-|`spark.sql.ansi.enabled`|false|(Experimental) When true, Spark tries to conform to the ANSI SQL specification: 1. Spark will throw a runtime exception if an overflow occurs in any operation on integral/decimal field. 2. Spark will forbid using the reserved keywords of ANSI SQL as identifiers in the SQL parser.|3.0.0|
-|`spark.sql.storeAssignmentPolicy`|ANSI|(Experimental) When inserting a value into a column with different data type, Spark will perform type conversion. Currently, we support 3 policies for the type coercion rules: ANSI, legacy and strict. With ANSI policy, Spark performs the type coercion as per ANSI SQL. In practice, the behavior is mostly the same as PostgreSQL. It disallows certain unreasonable type conversions such as converting string to int or double to boolean. With legacy po [...]
+|Property Name|Default| Meaning [...]

[spark] branch master updated: [SPARK-39237][DOCS] Update the ANSI SQL mode documentation

2022-05-20 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 86a351c13d6 [SPARK-39237][DOCS] Update the ANSI SQL mode documentation
86a351c13d6 is described below

commit 86a351c13d62644d596cc5249fc1c45d318a0bbf
Author: Gengliang Wang 
AuthorDate: Fri May 20 16:58:22 2022 +0800

[SPARK-39237][DOCS] Update the ANSI SQL mode documentation

### What changes were proposed in this pull request?

1. Remove the "Experimental" notation in the ANSI SQL compliance doc
2. Update the description of `spark.sql.ansi.enabled`, since enforcement of the ANSI reserved keywords is disabled by default now

### Why are the changes needed?

1. The ANSI SQL dialect became GA in the Spark 3.2 release (https://spark.apache.org/releases/spark-release-3-2-0.html), so we should no longer mark it as "Experimental" in the doc.
2. Enforcement of the ANSI reserved keywords is disabled by default now.

### Does this PR introduce _any_ user-facing change?

No, just a doc change.

### How was this patch tested?

Doc preview:
https://user-images.githubusercontent.com/1097932/169444094-de9c33c2-1b01-4fc3-b583-b752c71e16d8.png

https://user-images.githubusercontent.com/1097932/169472239-1edf218f-1f7b-48ec-bf2a-5d043600f1bc.png
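
The store-assignment behavior the updated page also documents can likewise be sketched in spark-shell (a hedged example assuming a SparkSession named `spark`; the table name `t` is illustrative):

    // With the default spark.sql.storeAssignmentPolicy=ANSI, inserting a string
    // into an INT column is rejected rather than silently cast: string-to-int is
    // one of the conversions the page calls unreasonable.
    spark.sql("CREATE TABLE t(i INT) USING parquet")
    spark.sql("INSERT INTO t VALUES ('not a number')")   // fails under the ANSI policy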

Closes #36614 from gengliangwang/updateAnsiDoc.

Authored-by: Gengliang Wang 
Signed-off-by: Gengliang Wang 
---
 docs/sql-ref-ansi-compliance.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index f06a0415d2a..257f53caef1 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -19,7 +19,7 @@ license: |
   limitations under the License.
 ---
 
-Since Spark 3.0, Spark SQL introduces two experimental options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (See a table below for details).
+In Spark SQL, there are two options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (See a table below for details).
 
 When `spark.sql.ansi.enabled` is set to `true`, Spark SQL uses an ANSI compliant dialect instead of being Hive compliant. For example, Spark will throw an exception at runtime instead of returning null results if the inputs to a SQL operator/function are invalid. Some ANSI dialect features may be not from the ANSI SQL standard directly, but their behaviors align with ANSI SQL's style.
 
@@ -28,10 +28,10 @@ The casting behaviours are defined as store assignment rules in the standard.
 
 When `spark.sql.storeAssignmentPolicy` is set to `ANSI`, Spark SQL complies with the ANSI store assignment rules. This is a separate configuration because its default value is `ANSI`, while the configuration `spark.sql.ansi.enabled` is disabled by default.
 
-|Property Name|Default|Meaning|Since Version|
-|-|---|---|-|
-|`spark.sql.ansi.enabled`|false|(Experimental) When true, Spark tries to conform to the ANSI SQL specification: 1. Spark will throw a runtime exception if an overflow occurs in any operation on integral/decimal field. 2. Spark will forbid using the reserved keywords of ANSI SQL as identifiers in the SQL parser.|3.0.0|
-|`spark.sql.storeAssignmentPolicy`|ANSI|(Experimental) When inserting a value into a column with different data type, Spark will perform type conversion. Currently, we support 3 policies for the type coercion rules: ANSI, legacy and strict. With ANSI policy, Spark performs the type coercion as per ANSI SQL. In practice, the behavior is mostly the same as PostgreSQL. It disallows certain unreasonable type conversions such as converting string to int or double to boolean. With legacy po [...]
+|Property Name|Default| Meaning [...]
+|-|---|- [...]