[spark] branch master updated (499f620 -> 56edb81)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 499f620  [MINOR][SQL][DOCS] Fix some wrong default values in SQL tuning guide's AQE section
     add 56edb81  [SPARK-33474][SQL] Support TypeConstructed partition spec value

No new revisions were added by this update.

Summary of changes:
 docs/sql-migration-guide.md                        |  2 +
 docs/sql-ref-syntax-ddl-alter-table.md             |  8 ++--
 docs/sql-ref-syntax-dml-insert-into.md             | 15 ++-
 docs/sql-ref-syntax-dml-insert-overwrite-table.md  | 25 ++-
 .../spark/sql/catalyst/parser/AstBuilder.scala     | 14 +--
 .../spark/sql/catalyst/parser/DDLParserSuite.scala | 30 --
 .../org/apache/spark/sql/SQLInsertTestSuite.scala  | 48 ++
 .../command/AlterTableAddPartitionSuiteBase.scala  |  8
 .../command/AlterTableDropPartitionSuiteBase.scala | 10 +
 .../AlterTableRenamePartitionSuiteBase.scala       | 11 +
 10 files changed, 158 insertions(+), 13 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (9ac5ee2e -> dbce74d)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 9ac5ee2e  [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order
     add dbce74d   [SPARK-34607][SQL] Add `Utils.isMemberClass` to fix a malformed class name error on jdk8u

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/util/Utils.scala   | 28 +
 .../spark/sql/catalyst/encoders/OuterScopes.scala  |  2 +-
 .../sql/catalyst/expressions/objects/objects.scala |  2 +-
 .../catalyst/encoders/ExpressionEncoderSuite.scala | 70 ++
 4 files changed, 100 insertions(+), 2 deletions(-)
[spark] branch master updated (f72b906 -> 1a97224)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from f72b906  [SPARK-34643][R][DOCS] Use CRAN URL in canonical form
     add 1a97224  [SPARK-34595][SQL] DPP support RLIKE

No new revisions were added by this update.

Summary of changes:
 .../dynamicpruning/PartitionPruning.scala          |  2 +-
 .../spark/sql/DynamicPartitionPruningSuite.scala   | 26 ++
 2 files changed, 27 insertions(+), 1 deletion(-)
[spark] branch master updated: [SPARK-34665][SQL][DOCS] Revise the type coercion section of ANSI Compliance
yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new ee756fd  [SPARK-34665][SQL][DOCS] Revise the type coercion section of ANSI Compliance

ee756fd is described below

commit ee756fd69528f90f63ffd45edc821c6b69a8a35e
Author: Gengliang Wang
AuthorDate: Tue Mar 9 13:19:14 2021 +0900

    [SPARK-34665][SQL][DOCS] Revise the type coercion section of ANSI Compliance

    ### What changes were proposed in this pull request?

    1. Fix the table of valid type coercion combinations. Binary type should be allowed to cast to String type and disallowed to cast to Numeric types.
    2. Summarize all the `CAST`s that can cause runtime exceptions.

    ### Why are the changes needed?

    Fix a mistake in the docs.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Run `jekyll serve` and preview:

    ![image](https://user-images.githubusercontent.com/1097932/110334374-8fab5a80-7fd7-11eb-86e7-c519cfa41b99.png)

    Closes #31781 from gengliangwang/reviseAnsiDoc2.
    Authored-by: Gengliang Wang
    Signed-off-by: Takeshi Yamamuro
---
 docs/sql-ref-ansi-compliance.md | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index 99e230b..4b3ff46 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -72,16 +72,23 @@ The type conversion of Spark ANSI mode follows the syntax rules of section 6.13
 | Source\Target | Numeric | String | Date | Timestamp | Interval | Boolean | Binary | Array | Map | Struct |
 |---------------|---------|--------|-------|-----------|----------|---------|--------|-------|-------|--------|
-| Numeric       | Y       | Y      | N     | N         | N        | Y       | N      | N     | N     | N      |
-| String        | Y       | Y      | Y     | Y         | Y        | Y       | Y      | N     | N     | N      |
+| Numeric       | **Y**   | Y      | N     | N         | N        | Y       | N      | N     | N     | N      |
+| String        | **Y**   | Y      | **Y** | **Y**     | **Y**    | **Y**   | Y      | N     | N     | N      |
 | Date          | N       | Y      | Y     | Y         | N        | N       | N      | N     | N     | N      |
 | Timestamp     | N       | Y      | Y     | Y         | N        | N       | N      | N     | N     | N      |
 | Interval      | N       | Y      | N     | N         | Y        | N       | N      | N     | N     | N      |
 | Boolean       | Y       | Y      | N     | N         | N        | Y       | N      | N     | N     | N      |
-| Binary        | Y       | N      | N     | N         | N        | N       | Y      | N     | N     | N      |
-| Array         | N       | N      | N     | N         | N        | N       | N      | Y     | N     | N      |
-| Map           | N       | N      | N     | N         | N        | N       | N      | N     | Y     | N      |
-| Struct        | N       | N      | N     | N         | N        | N       | N      | N     | N     | Y      |
+| Binary        | N       | Y      | N     | N         | N        | N       | Y      | N     | N     | N      |
+| Array         | N       | N      | N     | N         | N        | N       | N      | **Y** | N     | N      |
+| Map           | N       | N      | N     | N         | N        | N       | N      | N     | **Y** | N      |
+| Struct        | N       | N      | N     | N         | N        | N       | N      | N     | N     | **Y**  |
+
+In the table above, all the `CAST`s that can cause runtime exceptions are marked as red **Y**:
+* CAST(Numeric AS Numeric): raise an overflow exception if the value is out of the target data type's range.
+* CAST(String AS (Numeric/Date/Timestamp/Interval/Boolean)): raise a runtime exception if the value can't be parsed as the target data type.
+* CAST(Array AS Array): raise an exception if there is any on the conversion of the elements.
+* CAST(Map AS Map): raise an exception if there is any on the conversion of the keys and the values.
+* CAST(Struct AS Struct): raise an exception if there is any on the conversion of the struct fields.

 Currently, the ANSI mode affects explicit casting and assignment casting only.
 In future releases, the behaviour of type coercion might change along with the other two type conversion rules.
@@ -163,9 +170,6 @@ The behavior of some SQL functions can be different under ANSI mode (`spark.sql.
 The behavior of some SQL operators can be different under ANSI mode (`spark.sql.ansi.enabled=true`).
   - `array_col[index]`: This operator throws `ArrayIndexOutOfBoundsException` if using invalid indices.
   - `map_col[key]`: This operator throws `NoSuchElementException` if key does not exist in map.
   - `CAST(string_col AS TIMESTAMP)`: This operator should fail with an except
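The String-to-Numeric rule documented above is easy to model. The sketch below is not Spark code; `ansi_cast_to_int` is a hypothetical Python helper that only mimics the documented contract of `CAST(String AS Int)`: with ANSI mode on, an unparseable string raises a runtime error, while the legacy behavior silently yields NULL.

```python
def ansi_cast_to_int(value: str, ansi_enabled: bool = True):
    """Mimic CAST(String AS Int): raise under ANSI mode, NULL otherwise."""
    try:
        return int(value.strip())
    except ValueError:
        if ansi_enabled:
            # ANSI mode: unparseable input is a runtime error
            raise ValueError(f"invalid input syntax for type int: '{value}'")
        return None  # legacy (non-ANSI) behavior: silently produce NULL

print(ansi_cast_to_int(" 42 "))                       # 42
print(ansi_cast_to_int("abc", ansi_enabled=False))    # None
```

The same on/off split applies to the other red **Y** entries in the table: ANSI mode trades silent NULLs for loud failures.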
[spark] branch master updated (48637a9 -> bf4570b)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 48637a9  [SPARK-34766][SQL] Do not capture maven config for views
     add bf4570b  [SPARK-34749][SQL] Simplify ResolveCreateNamedStruct

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala |  2 --
 .../sql/catalyst/expressions/complexTypeCreator.scala     | 10 +-
 .../sql/catalyst/expressions/complexTypeExtractors.scala  | 14 +-
 .../spark/sql/catalyst/parser/ExpressionParserSuite.scala |  2 +-
 4 files changed, 11 insertions(+), 17 deletions(-)
[spark] branch master updated (bf4570b -> 9f7b0a0)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from bf4570b  [SPARK-34749][SQL] Simplify ResolveCreateNamedStruct
     add 9f7b0a0  [SPARK-34758][SQL] Simplify Analyzer.resolveLiteralFunction

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala | 29 ++
 1 file changed, 7 insertions(+), 22 deletions(-)
[spark] branch branch-3.1 updated: [SPARK-34749][SQL][3.1] Simplify ResolveCreateNamedStruct
yamamuro pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new 448b8d0  [SPARK-34749][SQL][3.1] Simplify ResolveCreateNamedStruct

448b8d0 is described below

commit 448b8d07df41040058c21e6102406e1656727599
Author: Wenchen Fan
AuthorDate: Thu Mar 18 07:44:11 2021 +0900

    [SPARK-34749][SQL][3.1] Simplify ResolveCreateNamedStruct

    backports https://github.com/apache/spark/pull/31843

    ### What changes were proposed in this pull request?

    This is a follow-up of https://github.com/apache/spark/pull/31808 and simplifies its fix to one line (excluding comments).

    ### Why are the changes needed?

    code simplification

    ### Does this PR introduce _any_ user-facing change?

    no

    ### How was this patch tested?

    N/A

    Closes #31867 from cloud-fan/backport.

    Authored-by: Wenchen Fan
    Signed-off-by: Takeshi Yamamuro
---
 .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala   |  2 --
 .../spark/sql/catalyst/expressions/complexTypeCreator.scala | 10 +-
 .../sql/catalyst/expressions/complexTypeExtractors.scala    | 11 +--
 .../spark/sql/catalyst/parser/ExpressionParserSuite.scala   |  2 +-
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index f98f33b..f4cdeab 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -3840,8 +3840,6 @@ object ResolveCreateNamedStruct extends Rule[LogicalPlan] {
     val children = e.children.grouped(2).flatMap {
       case Seq(NamePlaceholder, e: NamedExpression) if e.resolved =>
         Seq(Literal(e.name), e)
-      case Seq(NamePlaceholder, e: ExtractValue) if e.resolved && e.name.isDefined =>
-        Seq(Literal(e.name.get), e)
       case kv => kv
     }

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
index cb59fbd..1779d41 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
@@ -20,7 +20,7 @@ package org.apache.spark.sql.catalyst.expressions
 import scala.collection.mutable.ArrayBuffer

 import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.catalyst.analysis.{Resolver, TypeCheckResult, TypeCoercion, UnresolvedExtractValue}
+import org.apache.spark.sql.catalyst.analysis.{Resolver, TypeCheckResult, TypeCoercion, UnresolvedAttribute, UnresolvedExtractValue}
 import org.apache.spark.sql.catalyst.analysis.FunctionRegistry.{FUNC_ALIAS, FunctionBuilder}
 import org.apache.spark.sql.catalyst.expressions.codegen._
 import org.apache.spark.sql.catalyst.expressions.codegen.Block._
@@ -336,6 +336,14 @@ object CreateStruct {
    */
   def apply(children: Seq[Expression]): CreateNamedStruct = {
     CreateNamedStruct(children.zipWithIndex.flatMap {
+      // For multi-part column name like `struct(a.b.c)`, it may be resolved into:
+      //   1. Attribute if `a.b.c` is simply a qualified column name.
+      //   2. GetStructField if `a.b` refers to a struct-type column.
+      //   3. GetArrayStructFields if `a.b` refers to a array-of-struct-type column.
+      //   4. GetMapValue if `a.b` refers to a map-type column.
+      // We should always use the last part of the column name (`c` in the above example) as the
+      // alias name inside CreateNamedStruct.
+      case (u: UnresolvedAttribute, _) => Seq(Literal(u.nameParts.last), u)
       case (e: NamedExpression, _) if e.resolved => Seq(Literal(e.name), e)
       case (e: NamedExpression, _) => Seq(NamePlaceholder, e)
       case (e, index) => Seq(Literal(s"col${index + 1}"), e)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
index 9b80140..ef247ef 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
@@ -94,10 +94,7 @@ object ExtractValue {
   }
 }

-trait ExtractValue extends Expression {
-  // The name that is used to extract the value.
-  def name: Option[String]
-}
[spark] branch master updated: [SPARK-34781][SQL] Eliminate LEFT SEMI/ANTI joins to its left child side in AQE
yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 8207e2f  [SPARK-34781][SQL] Eliminate LEFT SEMI/ANTI joins to its left child side in AQE

8207e2f is described below

commit 8207e2f65cc2ce2d87ee60ee05a2c1ee896cf93e
Author: Cheng Su
AuthorDate: Fri Mar 19 09:41:52 2021 +0900

    [SPARK-34781][SQL] Eliminate LEFT SEMI/ANTI joins to its left child side in AQE

    ### What changes were proposed in this pull request?

    In `EliminateJoinToEmptyRelation.scala`, we can extend it to cover more cases for LEFT SEMI and LEFT ANTI joins:

    * Join is left semi join, join right side is non-empty and condition is empty. Eliminate join to its left side.
    * Join is left anti join, join right side is empty. Eliminate join to its left side.

    Given we eliminate the join to its left side here, rename the current optimization rule to `EliminateUnnecessaryJoin` instead.

    In addition, also change to use `checkRowCount()` to check the run-time row count, instead of using `EmptyHashedRelation`. This also covers `BroadcastNestedLoopJoin` (`BroadcastNestedLoopJoin`'s broadcast side is `Array[InternalRow]`, not `HashedRelation`).

    ### Why are the changes needed?

    Cover more join cases, and improve query performance for affected queries.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Added unit tests in `AdaptiveQueryExecSuite.scala`.

    Closes #31873 from c21/aqe-join.
    Authored-by: Cheng Su
    Signed-off-by: Takeshi Yamamuro
---
 .../sql/execution/adaptive/AQEOptimizer.scala      |  2 +-
 .../adaptive/EliminateJoinToEmptyRelation.scala    | 71 -
 .../adaptive/EliminateUnnecessaryJoin.scala        | 91 ++
 .../spark/sql/DynamicPartitionPruningSuite.scala   |  2 +-
 .../adaptive/AdaptiveQueryExecSuite.scala          | 51
 5 files changed, 127 insertions(+), 90 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala
index 04b8ade..901637d 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala
@@ -29,7 +29,7 @@ class AQEOptimizer(conf: SQLConf) extends RuleExecutor[LogicalPlan] {
   private val defaultBatches = Seq(
     Batch("Demote BroadcastHashJoin", Once,
       DemoteBroadcastHashJoin),
-    Batch("Eliminate Join to Empty Relation", Once, EliminateJoinToEmptyRelation)
+    Batch("Eliminate Unnecessary Join", Once, EliminateUnnecessaryJoin)
   )

   final override protected def batches: Seq[Batch] = {

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/EliminateJoinToEmptyRelation.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/EliminateJoinToEmptyRelation.scala
deleted file mode 100644
index d6df522..000
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/EliminateJoinToEmptyRelation.scala
+++ /dev/null
@@ -1,71 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *    http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.sql.execution.adaptive
-
-import org.apache.spark.sql.catalyst.planning.ExtractSingleColumnNullAwareAntiJoin
-import org.apache.spark.sql.catalyst.plans.{Inner, LeftAnti, LeftSemi}
-import org.apache.spark.sql.catalyst.plans.logical.{Join, LocalRelation, LogicalPlan}
-import org.apache.spark.sql.catalyst.rules.Rule
-import org.apache.spark.sql.execution.joins.{EmptyHashedRelation, HashedRelation, HashedRelationWithAllNullKeys}
-
-/**
- * This optimization rule detects and converts a Join to an empty [[LocalRelation]]:
- * 1. Join is single column NULL-aware anti join (NAAJ), and broadcasted [[HashedRelation]]
- *    is [[HashedRelationWithAllNullKeys]].
- *
- * 2. Join is in
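The two new rewrites described in the commit message above can be modeled outside of Spark. The sketch below is a hypothetical Python rendering of the rule's logic only; the real rule operates on Catalyst plans and runtime statistics, and `Join`, `right_row_count`, and `eliminate_unnecessary_join` are illustrative names, not Spark's API.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Join:
    join_type: str            # "left_semi" or "left_anti"
    left: Any                 # the left child plan
    right_row_count: int      # runtime row count of the (broadcast) right side
    condition: Optional[str]  # None models an empty join condition

def eliminate_unnecessary_join(plan: Any) -> Any:
    if not isinstance(plan, Join):
        return plan
    # LEFT SEMI: with a non-empty right side and no condition, every left row
    # matches, so the join can be replaced by its left child.
    if plan.join_type == "left_semi" and plan.right_row_count > 0 \
            and plan.condition is None:
        return plan.left
    # LEFT ANTI: with an empty right side, no left row ever matches,
    # so the join also collapses to its left child.
    if plan.join_type == "left_anti" and plan.right_row_count == 0:
        return plan.left
    return plan
```

Checking a runtime row count rather than the shape of the broadcast relation is what lets the real rule also cover `BroadcastNestedLoopJoin`, whose broadcast side is an `Array[InternalRow]` rather than a `HashedRelation`.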
[spark] branch branch-3.1 updated (1b70aad -> c2629a7)
yamamuro pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 1b70aad  [SPARK-34747][SQL][DOCS] Add virtual operators to the built-in function document
     add c2629a7  [SPARK-34719][SQL][3.1] Correctly resolve the view query with duplicated column names

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/analysis/view.scala | 44 +---
 .../spark/sql/execution/SQLViewTestSuite.scala    | 48 ++
 2 files changed, 86 insertions(+), 6 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-34719][SQL][3.0] Correctly resolve the view query with duplicated column names
yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 25d7219  [SPARK-34719][SQL][3.0] Correctly resolve the view query with duplicated column names

25d7219 is described below

commit 25d72191de7c842aa2acd4b7307ba8e6585dd182
Author: Wenchen Fan
AuthorDate: Sat Mar 20 11:09:50 2021 +0900

    [SPARK-34719][SQL][3.0] Correctly resolve the view query with duplicated column names

    backport https://github.com/apache/spark/pull/31811 to 3.0

    ### What changes were proposed in this pull request?

    For permanent views (and the new SQL temp view in Spark 3.1), we store the view SQL text and re-parse/analyze the view SQL text when reading the view. In the case of `SELECT * FROM ...`, we want to avoid view schema change (e.g. the referenced table changes its schema) and will record the view query output column names when creating the view, so that when reading the view we can add a `SELECT recorded_column_names FROM ...` to retain the original view query schema.

    In Spark 3.1 and before, the final SELECT is added after the analysis phase:
    https://github.com/apache/spark/blob/branch-3.1/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala#L67

    If the view query has duplicated output column names, we always pick the first column when reading the view.
    A simple repro:
    ```
    scala> sql("create view c(x, y) as select 1 a, 2 a")
    res0: org.apache.spark.sql.DataFrame = []

    scala> sql("select * from c").show
    +---+---+
    |  x|  y|
    +---+---+
    |  1|  1|
    +---+---+
    ```

    In the master branch, we will fail at view reading time due to https://github.com/apache/spark/commit/b891862fb6b740b103d5a09530626ee4e0e8f6e3 , which adds the final SELECT during analysis, so that the query fails with `Reference 'a' is ambiguous`.

    This PR proposes to resolve the view query output column names from the matching attributes by ordinal. For example, for `create view c(x, y) as select 1 a, 2 a`, the view query output column names are `[a, a]`. When reading the view, there are 2 matching attributes (e.g. `[a#1, a#2]`) and we can simply match them by ordinal.

    A negative example is:
    ```
    create table t(a int)
    create view v as select *, 1 as col from t
    replace table t(a int, col int)
    ```
    When reading the view, the view query output column names are `[a, col]`, and there are two matching attributes of `col`, so we should fail the query. See the tests for details.

    ### Why are the changes needed?

    bug fix

    ### Does this PR introduce _any_ user-facing change?

    yes

    ### How was this patch tested?

    new test

    Closes #31894 from cloud-fan/backport.
    Authored-by: Wenchen Fan
    Signed-off-by: Takeshi Yamamuro
---
 .../apache/spark/sql/catalyst/analysis/view.scala | 44 ++---
 .../apache/spark/sql/execution/SQLViewSuite.scala | 45 +-
 2 files changed, 82 insertions(+), 7 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala
index 6560164..013a303 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala
@@ -17,7 +17,10 @@
 package org.apache.spark.sql.catalyst.analysis

-import org.apache.spark.sql.catalyst.expressions.Alias
+import java.util.Locale
+
+import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute}
 import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project, View}
 import org.apache.spark.sql.catalyst.rules.Rule
 import org.apache.spark.sql.internal.SQLConf
@@ -60,15 +63,44 @@ object EliminateView extends Rule[LogicalPlan] with CastSupport {
     // The child has the different output attributes with the View operator. Adds a Project over
     // the child of the view.
     case v @ View(desc, output, child) if child.resolved && !v.sameOutput(child) =>
+      // Use the stored view query output column names to find the matching attributes. The column
+      // names may have duplication, e.g. `CREATE VIEW v(x, y) AS SELECT 1 col, 2 col`. We need to
+      // make sure that the matching attributes have the same number of duplications, and pick the
+      // corresponding attribute by ordinal.
       val resolver = conf.resolver
       val queryColumnNames = desc.viewQueryColumnNames
       val queryOutput = if (queryColumnNames.nonEmpty) {
-        // Find the attribute that has the expected at
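The ordinal-matching idea from the commit message can be sketched as follows. This is an illustrative Python model, not the Scala implementation in `view.scala`; `resolve_view_columns` and its tuple-based attributes are hypothetical names introduced only for this example.

```python
from collections import Counter

def resolve_view_columns(recorded_names, child_attrs):
    """Match recorded view-query column names to analyzed output attributes.

    recorded_names: names saved at CREATE VIEW time, e.g. ["a", "a"].
    child_attrs: (name, expr_id) pairs from re-analyzing the view SQL text.
    """
    recorded_counts = Counter(n.lower() for n in recorded_names)
    taken = Counter()   # occurrences of each name consumed so far
    resolved = []
    for name in recorded_names:
        key = name.lower()
        matches = [a for a in child_attrs if a[0].lower() == key]
        # The analyzed output must duplicate the name exactly as many times
        # as it was recorded; otherwise resolution is ambiguous and we fail.
        if len(matches) != recorded_counts[key]:
            raise ValueError(f"cannot resolve view column '{name}'")
        resolved.append(matches[taken[key]])  # pick by ordinal
        taken[key] += 1
    return resolved
```

With recorded names `["a", "a"]` and analyzed attributes `[a#1, a#2]`, each occurrence maps to the attribute at the same ordinal; with recorded names `["a", "col"]` but two `col` attributes, the count mismatch makes the lookup fail, mirroring the negative example above.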
[spark] branch master updated (7a8a600 -> 620cae0)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 7a8a600  [SPARK-34776][SQL] Nested column pruning should not prune Window produced attributes
     add 620cae0  [SPARK-33122][SQL] Remove redundant aggregates in the Optimzier

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala     |  50 ---
 .../analysis/PullOutNondeterministic.scala         |  74 ++
 .../spark/sql/catalyst/optimizer/Optimizer.scala   |  45 ++
 .../plans/logical/basicLogicalOperators.scala      |   2 +-
 .../optimizer/RemoveRedundantAggregatesSuite.scala | 163 +
 .../execution/RemoveRedundantProjectsSuite.scala   |   2 +-
 6 files changed, 284 insertions(+), 52 deletions(-)
 create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/PullOutNondeterministic.scala
 create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RemoveRedundantAggregatesSuite.scala
[spark] branch master updated (620cae0 -> 2ff0032)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 620cae0  [SPARK-33122][SQL] Remove redundant aggregates in the Optimzier
     add 2ff0032  [SPARK-34796][SQL] Initialize counter variable for LIMIT code-gen in doProduce()

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/execution/limit.scala | 12
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala   | 19 +++
 2 files changed, 27 insertions(+), 4 deletions(-)
[spark] branch branch-3.0 updated (25d7219 -> 828cf76)
yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 25d7219  [SPARK-34719][SQL][3.0] Correctly resolve the view query with duplicated column names
     add 828cf76  [SPARK-34776][SQL][3.0][2.4] Window class should override producedAttributes

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala | 2 ++
 1 file changed, 2 insertions(+)
[spark] branch branch-2.4 updated: [SPARK-34776][SQL][3.0][2.4] Window class should override producedAttributes
yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 59e4ae4  [SPARK-34776][SQL][3.0][2.4] Window class should override producedAttributes

59e4ae4 is described below

commit 59e4ae4149ff93bd64c8b3210c27dc2fbebe2a96
Author: Liang-Chi Hsieh
AuthorDate: Sat Mar 20 11:26:01 2021 +0900

    [SPARK-34776][SQL][3.0][2.4] Window class should override producedAttributes

    ### What changes were proposed in this pull request?

    This patch proposes to override `producedAttributes` of the `Window` class.

    ### Why are the changes needed?

    This is a backport of #31897 to branch-3.0/2.4. Unlike the original PR, nested column pruning does not allow pushing through `Window` in branch-3.0/2.4 yet. But `Window` doesn't override `producedAttributes`; that is wrong and could cause potential issues, so backport the `Window`-related change.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Existing tests.

    Closes #31904 from viirya/SPARK-34776-3.0.
    Authored-by: Liang-Chi Hsieh
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 828cf76bced1b70769b0453f3e9ba95faaa84e39)
    Signed-off-by: Takeshi Yamamuro
---
 .../apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
index a0086c1..2fe9cd4 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
@@ -621,6 +621,8 @@ case class Window(

   override def output: Seq[Attribute] = child.output ++ windowExpressions.map(_.toAttribute)

+  override def producedAttributes: AttributeSet = windowOutputSet
+
   def windowOutputSet: AttributeSet = AttributeSet(windowExpressions.map(_.toAttribute))
 }
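Why `producedAttributes` matters: a plan operator's output must be covered by its children's output plus the attributes the operator itself introduces. The toy Python model below is only a sketch of that bookkeeping invariant; `unaccounted_output` and the column names are illustrative, not Spark's API.

```python
def unaccounted_output(output, child_output, produced):
    """Toy invariant: every output attribute must come from a child or be
    introduced by the operator itself (Spark tracks the latter set via
    producedAttributes)."""
    return set(output) - set(child_output) - set(produced)

child_cols = {"a", "b"}
window_cols = {"w"}  # e.g. rank() OVER (...) AS w

# Window.output = child.output ++ windowExpressions.map(_.toAttribute).
# Without overriding producedAttributes, `w` looks unaccounted for:
assert unaccounted_output(child_cols | window_cols, child_cols, set()) == {"w"}
# With producedAttributes = windowOutputSet, the bookkeeping is consistent:
assert unaccounted_output(child_cols | window_cols, child_cols, window_cols) == set()
```

In the diff above, `windowOutputSet` is exactly the set of attributes the window expressions introduce, which is why the one-line override is sufficient.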
[spark] branch branch-3.1 updated (da013d0 -> 250c820)
yamamuro pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from da013d0  [MINOR][DOCS][ML] Doc 'mode' as a supported Imputer strategy in Pyspark
     add 250c820  [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/execution/limit.scala | 12
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala   | 19 +++
 2 files changed, 27 insertions(+), 4 deletions(-)
[spark] branch master updated: [SPARK-34853][SQL] Remove duplicated definition of output partitioning/ordering for limit operator
yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 35c70e4  [SPARK-34853][SQL] Remove duplicated definition of output partitioning/ordering for limit operator

35c70e4 is described below

commit 35c70e417d8c6e3958e0da8a4bec731f9e394a28
Author: Cheng Su
AuthorDate: Wed Mar 24 23:06:35 2021 +0900

    [SPARK-34853][SQL] Remove duplicated definition of output partitioning/ordering for limit operator

    ### What changes were proposed in this pull request?

    Both local limit and global limit define the output partitioning and output ordering in the same way, and this is duplicated (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala#L159-L175). We can move the output partitioning and ordering into their parent trait - `BaseLimitExec`. This is doable as `BaseLimitExec` has no other child class. This is a minor code refactoring.

    ### Why are the changes needed?

    Clean up the code a little bit. Better readability.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Pure refactoring. Rely on existing unit tests.

    Closes #31950 from c21/limit-cleanup.
    Authored-by: Cheng Su
    Signed-off-by: Takeshi Yamamuro
---
 .../main/scala/org/apache/spark/sql/execution/limit.scala | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
index d8f67fb..e5a2995 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
@@ -113,6 +113,10 @@ object BaseLimitExec {
 trait BaseLimitExec extends LimitExec with CodegenSupport {
   override def output: Seq[Attribute] = child.output

+  override def outputPartitioning: Partitioning = child.outputPartitioning
+
+  override def outputOrdering: Seq[SortOrder] = child.outputOrdering
+
   protected override def doExecute(): RDD[InternalRow] = child.execute().mapPartitions { iter =>
     iter.take(limit)
   }
@@ -156,12 +160,7 @@ trait BaseLimitExec extends LimitExec with CodegenSupport {
 /**
  * Take the first `limit` elements of each child partition, but do not collect or shuffle them.
  */
-case class LocalLimitExec(limit: Int, child: SparkPlan) extends BaseLimitExec {
-
-  override def outputOrdering: Seq[SortOrder] = child.outputOrdering
-
-  override def outputPartitioning: Partitioning = child.outputPartitioning
-}
+case class LocalLimitExec(limit: Int, child: SparkPlan) extends BaseLimitExec

 /**
  * Take the first `limit` elements of the child's single output partition.
@@ -169,10 +168,6 @@ case class LocalLimitExec(limit: Int, child: SparkPlan) extends BaseLimitExec {
 case class GlobalLimitExec(limit: Int, child: SparkPlan) extends BaseLimitExec {

   override def requiredChildDistribution: List[Distribution] = AllTuples :: Nil
-
-  override def outputPartitioning: Partitioning = child.outputPartitioning
-
-  override def outputOrdering: Seq[SortOrder] = child.outputOrdering
 }

 /**
[spark] branch master updated (88cf86f -> 150769b)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 88cf86f [SPARK-34797][ML] Refactor Logistic Aggregator - support virtual centering add 150769b [SPARK-34833][SQL] Apply right-padding correctly for correlated subqueries No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 45 ++--- .../apache/spark/sql/CharVarcharTestSuite.scala| 57 -- 2 files changed, 79 insertions(+), 23 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-34833][SQL] Apply right-padding correctly for correlated subqueries
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 5ecf306 [SPARK-34833][SQL] Apply right-padding correctly for correlated subqueries 5ecf306 is described below commit 5ecf306245d17053e25b68c844828878a66b593a Author: Takeshi Yamamuro AuthorDate: Thu Mar 25 08:31:57 2021 +0900 [SPARK-34833][SQL] Apply right-padding correctly for correlated subqueries

### What changes were proposed in this pull request? This PR fixes a bug where right-padding is not applied for char types inside correlated subqueries. For example, the query below returns nothing in master, but the correct result is `c`.

```
scala> sql(s"CREATE TABLE t1(v VARCHAR(3), c CHAR(5)) USING parquet")
scala> sql(s"CREATE TABLE t2(v VARCHAR(5), c CHAR(7)) USING parquet")
scala> sql("INSERT INTO t1 VALUES ('c', 'b')")
scala> sql("INSERT INTO t2 VALUES ('a', 'b')")
scala> val df = sql("""
  |SELECT v FROM t1
  |WHERE 'a' IN (SELECT v FROM t2 WHERE t2.c = t1.c )""".stripMargin)
scala> df.show()
+---+
|  v|
+---+
+---+
```

This is because `ApplyCharTypePadding` does not handle the case above, so no right-padding is applied to the outer reference in the comparison. This PR modifies the code in `ApplyCharTypePadding` to handle it correctly.
```
// Before this PR:
scala> df.explain(true)
== Analyzed Logical Plan ==
v: string
Project [v#13]
+- Filter a IN (list#12 [c#14])
   :  +- Project [v#15]
   :     +- Filter (c#16 = outer(c#14))
   :        +- SubqueryAlias spark_catalog.default.t2
   :           +- Relation default.t2[v#15,c#16] parquet
   +- SubqueryAlias spark_catalog.default.t1
      +- Relation default.t1[v#13,c#14] parquet

scala> df.show()
+---+
|  v|
+---+
+---+

// After this PR:
scala> df.explain(true)
== Analyzed Logical Plan ==
v: string
Project [v#43]
+- Filter a IN (list#42 [c#44])
   :  +- Project [v#45]
   :     +- Filter (c#46 = rpad(outer(c#44), 7, ))
   :        +- SubqueryAlias spark_catalog.default.t2
   :           +- Relation default.t2[v#45,c#46] parquet
   +- SubqueryAlias spark_catalog.default.t1
      +- Relation default.t1[v#43,c#44] parquet

scala> df.show()
+---+
|  v|
+---+
|  c|
+---+
```

This fix is related to TPCDS q17; the query returns nothing because of this bug: https://github.com/apache/spark/pull/31886/files#r599333799

### Why are the changes needed? Bugfix.

### Does this PR introduce _any_ user-facing change? No.

### How was this patch tested? Unit tests added.

Closes #31940 from maropu/FixCharPadding.
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 150769bcedb6e4a97596e0f04d686482cd09e92a) Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/analysis/Analyzer.scala | 45 ++--- .../apache/spark/sql/CharVarcharTestSuite.scala| 57 -- 2 files changed, 79 insertions(+), 23 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index f4cdeab..d490845 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -3921,16 +3921,28 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] { override def apply(plan: LogicalPlan): LogicalPlan = { plan.resolveOperatorsUp { - case operator if operator.resolved => operator.transformExpressionsUp { + case operator => operator.transformExpressionsUp { +case e if !e.childrenResolved => e + // String literal is treated as char type when it's compared to a char type column. // We should pad the shorter one to the longer length. case b @ BinaryComparison(attr: Attribute, lit) if lit.foldable => - padAttrLitCmp(attr, lit).map { newChildren => + padAttrLitCmp(attr, attr.metadata, lit).map { newChildren => b.withNewChildren(newChildren) }.getOrElse(b) case b @ BinaryComparison(lit, attr: Attribute) if lit.foldable => - padAttrLitCmp(attr, lit).map { newChildren => + padAttrLitCmp(attr, attr.metadata, lit).map { newChildren => +b.withNewChildren(newChildren.reverse)
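Why the padding matters can be seen from the analyzed plans above: a `CHAR(7)` column stores values right-padded to 7 characters, so a `CHAR(5)` outer reference must itself be padded to 7 before the equality can hold. A minimal sketch of that semantics, in illustrative Python (not Spark's API; the names are mine):

```python
def rpad(s, length, pad=' '):
    # Mirrors SQL rpad: right-pad to `length`, truncating if longer.
    return s.ljust(length, pad)[:length]

stored = rpad('b', 7)  # what the CHAR(7) column t2.c actually holds
outer = rpad('b', 5)   # outer(t1.c) as CHAR(5): 'b' plus 4 spaces

print(stored == outer)           # False -> the subquery matches nothing
print(stored == rpad(outer, 7))  # True  -> after ApplyCharTypePadding
```

This is exactly the `c#16 = outer(c#14)` vs. `c#46 = rpad(outer(c#44), 7, )` difference between the two plans.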
[spark] branch master updated (6d88212 -> 658e95c)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d88212 [SPARK-34840][SHUFFLE] Fixes cases of corruption in merged shuffle … add 658e95c [SPARK-34833][SQL][FOLLOWUP] Handle outer references in all the places No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 67 +- 1 file changed, 41 insertions(+), 26 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-34833][SQL][FOLLOWUP] Handle outer references in all the places
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new f3c1298 [SPARK-34833][SQL][FOLLOWUP] Handle outer references in all the places f3c1298 is described below commit f3c129827986ba06c8a9ab00bd687e8d025103d1 Author: Wenchen Fan AuthorDate: Fri Mar 26 09:10:03 2021 +0900 [SPARK-34833][SQL][FOLLOWUP] Handle outer references in all the places ### What changes were proposed in this pull request? This is a follow-up of https://github.com/apache/spark/pull/31940 . This PR generalizes the matching of attributes and outer references, so that outer references are handled everywhere. Note that correlated subqueries currently have a lot of limitations in Spark, so the newly covered cases cannot occur yet; this PR is therefore a code refactor. ### Why are the changes needed? code cleanup ### Does this PR introduce _any_ user-facing change? no ### How was this patch tested? existing tests Closes #31959 from cloud-fan/follow.
Authored-by: Wenchen Fan Signed-off-by: Takeshi Yamamuro (cherry picked from commit 658e95c345d5aa2a98b8d2a854e003a5c77ed581) Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/analysis/Analyzer.scala | 67 +- 1 file changed, 41 insertions(+), 26 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index d490845..600a5af 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -3919,6 +3919,14 @@ object UpdateOuterReferences extends Rule[LogicalPlan] { */ object ApplyCharTypePadding extends Rule[LogicalPlan] { + object AttrOrOuterRef { +def unapply(e: Expression): Option[Attribute] = e match { + case a: Attribute => Some(a) + case OuterReference(a: Attribute) => Some(a) + case _ => None +} + } + override def apply(plan: LogicalPlan): LogicalPlan = { plan.resolveOperatorsUp { case operator => operator.transformExpressionsUp { @@ -3926,27 +3934,17 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] { // String literal is treated as char type when it's compared to a char type column. // We should pad the shorter one to the longer length. 
-case b @ BinaryComparison(attr: Attribute, lit) if lit.foldable => - padAttrLitCmp(attr, attr.metadata, lit).map { newChildren => -b.withNewChildren(newChildren) - }.getOrElse(b) - -case b @ BinaryComparison(lit, attr: Attribute) if lit.foldable => - padAttrLitCmp(attr, attr.metadata, lit).map { newChildren => -b.withNewChildren(newChildren.reverse) - }.getOrElse(b) - -case b @ BinaryComparison(or @ OuterReference(attr: Attribute), lit) if lit.foldable => - padAttrLitCmp(or, attr.metadata, lit).map { newChildren => +case b @ BinaryComparison(e @ AttrOrOuterRef(attr), lit) if lit.foldable => + padAttrLitCmp(e, attr.metadata, lit).map { newChildren => b.withNewChildren(newChildren) }.getOrElse(b) -case b @ BinaryComparison(lit, or @ OuterReference(attr: Attribute)) if lit.foldable => - padAttrLitCmp(or, attr.metadata, lit).map { newChildren => +case b @ BinaryComparison(lit, e @ AttrOrOuterRef(attr)) if lit.foldable => + padAttrLitCmp(e, attr.metadata, lit).map { newChildren => b.withNewChildren(newChildren.reverse) }.getOrElse(b) -case i @ In(attr: Attribute, list) +case i @ In(e @ AttrOrOuterRef(attr), list) if attr.dataType == StringType && list.forall(_.foldable) => CharVarcharUtils.getRawType(attr.metadata).flatMap { case CharType(length) => @@ -3955,7 +3953,7 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] { val literalCharLengths = literalChars.map(_.numChars()) val targetLen = (length +: literalCharLengths).max Some(i.copy( -value = addPadding(attr, length, targetLen), +value = addPadding(e, length, targetLen), list = list.zip(literalCharLengths).map { case (lit, charLength) => addPadding(lit, charLength, targetLen) } ++ nulls.map(Literal.create(_, StringType @@ -3963,19 +3961,36 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] { }.getOrElse(i) // For char type colum
[spark] branch master updated (b2bfe98 -> fcef237)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b2bfe98 [SPARK-34845][CORE] ProcfsMetricsGetter shouldn't return partial procfs metrics add fcef237 [SPARK-34622][SQL] Push down limit through Project with Join No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/optimizer/Optimizer.scala | 33 +- .../catalyst/optimizer/LimitPushdownSuite.scala| 9 ++ 2 files changed, 29 insertions(+), 13 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (3951e33 -> 90f2d4d)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 3951e33 [SPARK-34881][SQL] New SQL Function: TRY_CAST add 90f2d4d [SPARK-34882][SQL] Replace if with filter clause in RewriteDistinctAggregates No new revisions were added by this update. Summary of changes: .../optimizer/RewriteDistinctAggregates.scala | 47 ++ .../org/apache/spark/sql/DataFrameSuite.scala | 29 - 2 files changed, 49 insertions(+), 27 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (a9ca197 -> 39d5677)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a9ca197 [SPARK-34949][CORE] Prevent BlockManager reregister when Executor is shutting down add 39d5677 [SPARK-34932][SQL] deprecate GROUP BY ... GROUPING SETS (...) and promote GROUP BY GROUPING SETS (...) No new revisions were added by this update. Summary of changes: docs/sql-ref-syntax-qry-select-groupby.md | 34 +++ .../spark/sql/catalyst/analysis/Analyzer.scala | 36 +++ .../spark/sql/catalyst/expressions/grouping.scala | 46 --- .../spark/sql/catalyst/parser/AstBuilder.scala | 13 +++--- .../analysis/ResolveGroupingAnalyticsSuite.scala | 51 +- .../sql/catalyst/parser/PlanParserSuite.scala | 2 +- 6 files changed, 72 insertions(+), 110 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (39d5677 -> 7cfface)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 39d5677 [SPARK-34932][SQL] deprecate GROUP BY ... GROUPING SETS (...) and promote GROUP BY GROUPING SETS (...) add 7cfface [SPARK-34935][SQL] CREATE TABLE LIKE should respect the reserved table properties No new revisions were added by this update. Summary of changes: docs/sql-migration-guide.md | 2 ++ .../scala/org/apache/spark/sql/execution/SparkSqlParser.scala | 3 ++- .../org/apache/spark/sql/execution/SparkSqlParserSuite.scala | 8 3 files changed, 12 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (390d5bd -> 7c8dc5e)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 390d5bd [SPARK-34968][TEST][PYTHON] Add the `-fr` argument to xargs rm add 7c8dc5e [SPARK-34922][SQL] Use a relative cost comparison function in the CBO No new revisions were added by this update. Summary of changes: .../catalyst/optimizer/CostBasedJoinReorder.scala | 28 +- .../org/apache/spark/sql/internal/SQLConf.scala| 6 +- .../optimizer/joinReorder/JoinReorderSuite.scala | 3 - .../StarJoinCostBasedReorderSuite.scala| 9 +- .../approved-plans-modified/q73.sf100/explain.txt | 86 +-- .../q73.sf100/simplified.txt | 20 +- .../approved-plans-v1_4/q12.sf100/explain.txt | 178 +++ .../approved-plans-v1_4/q12.sf100/simplified.txt | 52 +- .../approved-plans-v1_4/q13.sf100/explain.txt | 134 ++--- .../approved-plans-v1_4/q13.sf100/simplified.txt | 38 +- .../approved-plans-v1_4/q18.sf100/explain.txt | 152 +++--- .../approved-plans-v1_4/q18.sf100/simplified.txt | 50 +- .../approved-plans-v1_4/q19.sf100/explain.txt | 376 ++--- .../approved-plans-v1_4/q19.sf100/simplified.txt | 118 ++--- .../approved-plans-v1_4/q20.sf100/explain.txt | 178 +++ .../approved-plans-v1_4/q20.sf100/simplified.txt | 52 +- .../approved-plans-v1_4/q24a.sf100/explain.txt | 116 ++-- .../approved-plans-v1_4/q24a.sf100/simplified.txt | 34 +- .../approved-plans-v1_4/q24b.sf100/explain.txt | 116 ++-- .../approved-plans-v1_4/q24b.sf100/simplified.txt | 34 +- .../approved-plans-v1_4/q25.sf100/explain.txt | 192 +++ .../approved-plans-v1_4/q25.sf100/simplified.txt | 138 ++--- .../approved-plans-v1_4/q33.sf100/explain.txt | 264 +- .../approved-plans-v1_4/q33.sf100/simplified.txt | 58 +- .../approved-plans-v1_4/q52.sf100/explain.txt | 146 +++--- .../approved-plans-v1_4/q52.sf100/simplified.txt | 30 +- .../approved-plans-v1_4/q55.sf100/explain.txt | 142 ++--- .../approved-plans-v1_4/q55.sf100/simplified.txt | 30 +- 
.../approved-plans-v1_4/q72.sf100/explain.txt | 326 ++-- .../approved-plans-v1_4/q72.sf100/simplified.txt | 154 +++--- .../approved-plans-v1_4/q81.sf100/explain.txt | 582 ++--- .../approved-plans-v1_4/q81.sf100/simplified.txt | 146 +++--- .../approved-plans-v1_4/q91.sf100/explain.txt | 312 +-- .../approved-plans-v1_4/q91.sf100/simplified.txt | 66 +-- .../approved-plans-v1_4/q98.sf100/explain.txt | 186 +++ .../approved-plans-v1_4/q98.sf100/simplified.txt | 52 +- .../approved-plans-v2_7/q12.sf100/explain.txt | 178 +++ .../approved-plans-v2_7/q12.sf100/simplified.txt | 52 +- .../approved-plans-v2_7/q18a.sf100/explain.txt | 172 +++--- .../approved-plans-v2_7/q18a.sf100/simplified.txt | 54 +- .../approved-plans-v2_7/q20.sf100/explain.txt | 178 +++ .../approved-plans-v2_7/q20.sf100/simplified.txt | 52 +- .../approved-plans-v2_7/q72.sf100/explain.txt | 326 ++-- .../approved-plans-v2_7/q72.sf100/simplified.txt | 154 +++--- .../approved-plans-v2_7/q98.sf100/explain.txt | 182 +++ .../approved-plans-v2_7/q98.sf100/simplified.txt | 52 +- 46 files changed, 3011 insertions(+), 2993 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated (f6b5c6f -> 84d96e8)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from f6b5c6f [SPARK-34970][SQL][SERCURITY][3.1] Redact map-type options in the output of explain() add 84d96e8 [SPARK-34922][SQL][3.1] Use a relative cost comparison function in the CBO No new revisions were added by this update. Summary of changes: .../catalyst/optimizer/CostBasedJoinReorder.scala | 28 +- .../org/apache/spark/sql/internal/SQLConf.scala| 6 +- .../optimizer/joinReorder/JoinReorderSuite.scala | 3 - .../StarJoinCostBasedReorderSuite.scala| 9 +- .../approved-plans-modified/q73.sf100/explain.txt | 8 +- .../approved-plans-v1_4/q12.sf100/explain.txt | 174 ++--- .../approved-plans-v1_4/q12.sf100/simplified.txt | 52 +- .../approved-plans-v1_4/q13.sf100/explain.txt | 138 ++-- .../approved-plans-v1_4/q13.sf100/simplified.txt | 34 +- .../approved-plans-v1_4/q18.sf100/explain.txt | 303 .../approved-plans-v1_4/q18.sf100/simplified.txt | 50 +- .../approved-plans-v1_4/q19.sf100/explain.txt | 368 - .../approved-plans-v1_4/q19.sf100/simplified.txt | 116 +-- .../approved-plans-v1_4/q20.sf100/explain.txt | 174 ++--- .../approved-plans-v1_4/q20.sf100/simplified.txt | 52 +- .../approved-plans-v1_4/q24a.sf100/explain.txt | 832 +++-- .../approved-plans-v1_4/q24a.sf100/simplified.txt | 34 +- .../approved-plans-v1_4/q24b.sf100/explain.txt | 832 +++-- .../approved-plans-v1_4/q24b.sf100/simplified.txt | 34 +- .../approved-plans-v1_4/q25.sf100/explain.txt | 186 ++--- .../approved-plans-v1_4/q25.sf100/simplified.txt | 130 ++-- .../approved-plans-v1_4/q33.sf100/explain.txt | 395 +- .../approved-plans-v1_4/q33.sf100/simplified.txt | 58 +- .../approved-plans-v1_4/q52.sf100/explain.txt | 138 ++-- .../approved-plans-v1_4/q52.sf100/simplified.txt | 26 +- .../approved-plans-v1_4/q55.sf100/explain.txt | 134 ++-- .../approved-plans-v1_4/q55.sf100/simplified.txt | 26 +- .../approved-plans-v1_4/q72.sf100/explain.txt 
| 260 +++ .../approved-plans-v1_4/q72.sf100/simplified.txt | 150 ++-- .../approved-plans-v1_4/q81.sf100/explain.txt | 570 +++--- .../approved-plans-v1_4/q81.sf100/simplified.txt | 142 ++-- .../approved-plans-v1_4/q91.sf100/explain.txt | 304 .../approved-plans-v1_4/q91.sf100/simplified.txt | 62 +- .../approved-plans-v1_4/q98.sf100/explain.txt | 182 ++--- .../approved-plans-v1_4/q98.sf100/simplified.txt | 52 +- .../approved-plans-v2_7/q12.sf100/explain.txt | 174 ++--- .../approved-plans-v2_7/q12.sf100/simplified.txt | 52 +- .../approved-plans-v2_7/q18a.sf100/explain.txt | 737 +- .../approved-plans-v2_7/q18a.sf100/simplified.txt | 54 +- .../approved-plans-v2_7/q20.sf100/explain.txt | 174 ++--- .../approved-plans-v2_7/q20.sf100/simplified.txt | 52 +- .../approved-plans-v2_7/q72.sf100/explain.txt | 260 +++ .../approved-plans-v2_7/q72.sf100/simplified.txt | 150 ++-- .../approved-plans-v2_7/q98.sf100/explain.txt | 178 ++--- .../approved-plans-v2_7/q98.sf100/simplified.txt | 52 +- 45 files changed, 4024 insertions(+), 3921 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-34922][SQL][3.0] Use a relative cost comparison function in the CBO
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new b9ee41f [SPARK-34922][SQL][3.0] Use a relative cost comparison function in the CBO b9ee41f is described below commit b9ee41fa9957631ca0f859ee928358c108fbd9a9 Author: Tanel Kiis AuthorDate: Thu Apr 8 11:03:59 2021 +0900 [SPARK-34922][SQL][3.0] Use a relative cost comparison function in the CBO

### What changes were proposed in this pull request? Changed the cost comparison function of the CBO to use the ratios of row counts and sizes in bytes.

### Why are the changes needed? In #30965 we changed the CBO cost comparison function so it would be "symmetric": `A.betterThan(B)` now implies that `!B.betterThan(A)`. With that we caused performance regressions in some queries - TPCDS q19 for example. The original cost comparison function used the ratios `relativeRows = A.rowCount / B.rowCount` and `relativeSize = A.size / B.size`. The changed function compared "absolute" cost values `costA = w*A.rowCount + (1-w)*A.size` and `costB = w*B.rowCount + (1-w)*B.size`. Given the input from wzhfy we decided to go back to the relative values, because otherwise one (size) may overwhelm the other (rowCount). But this time we avoid adding up the ratios. Originally `A.betterThan(B) => w*relativeRows + (1-w)*relativeSize < 1` was used. Besides being "non-symmetric", this can also exhibit one ratio overwhelming the other. For `w=0.5`, if the size (bytes) of `A` is at least 2x larger than `B`, then no matter how many times more rows the `B` plan has, `B` will always be considered to be better - `0.5*2 + 0.5*0.01 > 1`. When working with ratios, it is better to multiply them. The proposed cost comparison function is: `A.betterThan(B) => relativeRows^w * relativeSize^(1-w) < 1`.

### Does this PR introduce _any_ user-facing change?
Comparison of the changed TPCDS v1.4 query execution times at sf=10:

| query | absolute (ms) | multiplicative (ms) | change | additive (ms) | change |
| -- | -- | -- | -- | -- | -- |
| q12 | 145 | 137 | -5.52% | 141 | -2.76% |
| q13 | 264 | 271 | 2.65% | 271 | 2.65% |
| q17 | 4521 | 4243 | -6.15% | 4348 | -3.83% |
| q18 | 758 | 466 | -38.52% | 480 | -36.68% |
| q19 | 38503 | 2167 | -94.37% | 2176 | -94.35% |
| q20 | 119 | 120 | 0.84% | 126 | 5.88% |
| q24a | 16429 | 16838 | 2.49% | 17103 | 4.10% |
| q24b | 16592 | 16999 | 2.45% | 17268 | 4.07% |
| q25 | 3558 | 3556 | -0.06% | 3675 | 3.29% |
| q33 | 362 | 361 | -0.28% | 380 | 4.97% |
| q52 | 1020 | 1032 | 1.18% | 1052 | 3.14% |
| q55 | 927 | 938 | 1.19% | 961 | 3.67% |
| q72 | 24169 | 13377 | -44.65% | 24306 | 0.57% |
| q81 | 1285 | 1185 | -7.78% | 1168 | -9.11% |
| q91 | 324 | 336 | 3.70% | 337 | 4.01% |
| q98 | 126 | 129 | 2.38% | 131 | 3.97% |

All times are in ms; the change is compared to the situation in the master branch (absolute). The proposed cost function (multiplicative) significantly improves the performance on q18, q19 and q72. The original cost function (additive) has similar improvements at q18 and q19. All other changes are within the error bars and I would ignore them - perhaps q81 has also improved.

### How was this patch tested? PlanStabilitySuite

Closes #32076 from tanelk/SPARK-34922_cbo_better_cost_function_3.0.
Lead-authored-by: Tanel Kiis Co-authored-by: tanel.k...@gmail.com Signed-off-by: Takeshi Yamamuro --- .../catalyst/optimizer/CostBasedJoinReorder.scala | 28 ++ .../org/apache/spark/sql/internal/SQLConf.scala| 6 +++-- .../sql/catalyst/optimizer/JoinReorderSuite.scala | 3 --- .../optimizer/StarJoinCostBasedReorderSuite.scala | 9 +++ 4 files changed, 32 insertions(+), 14 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala index 93c608dc..ed7d92e 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala @@ -343,12 +343,30 @@ object JoinReorderDP extends PredicateHelper with Logging { } } +/** + * To identify the plan with smaller computational cost, + * we use the weighted geometric mean of ratio of rows and the ratio of sizes in bytes. + * + * There are other ways to combine these values as a cost comparison function. + * Some of these, that we have experimented with, but have gotten worse result, + * than with the current one: + * 1) Weighted ar
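The difference between the additive and the multiplicative (weighted geometric mean) comparisons described in the commit message above can be sketched numerically. Illustrative Python only; the function names are mine, not Spark's:

```python
def better_than_multiplicative(a_rows, a_size, b_rows, b_size, w=0.5):
    # Proposed form: weighted geometric mean of the two ratios.
    # A.betterThan(B) <=> relativeRows^w * relativeSize^(1-w) < 1
    relative_rows = a_rows / b_rows
    relative_size = a_size / b_size
    return relative_rows ** w * relative_size ** (1 - w) < 1

def better_than_additive(a_rows, a_size, b_rows, b_size, w=0.5):
    # Original relative form: w*relativeRows + (1-w)*relativeSize < 1.
    return w * (a_rows / b_rows) + (1 - w) * (a_size / b_size) < 1

# Plan A: 100x fewer rows but 2x the size of plan B.
print(better_than_multiplicative(1, 2, 100, 1))  # True: the row ratio still counts
print(better_than_additive(1, 2, 100, 1))        # False: 0.5*0.01 + 0.5*2 > 1

# The multiplicative form is also symmetric: A better than B
# implies B not better than A.
print(better_than_multiplicative(100, 1, 1, 2))  # False
```

This reproduces the `0.5*2 + 0.5*0.01 > 1` example from the commit message: under the additive form a 2x size ratio overwhelms any row-count advantage, while the multiplicative form lets the two ratios trade off.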
[spark] branch master updated (9c1f807 -> 278203d)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9c1f807 [SPARK-35031][PYTHON] Port Koalas operations on different frames tests into PySpark add 278203d [SPARK-28227][SQL] Support projection, aggregate/window functions, and lateral view in the TRANSFORM clause No new revisions were added by this update. Summary of changes: .../apache/spark/sql/catalyst/parser/SqlBase.g4| 6 +- .../spark/sql/catalyst/analysis/Analyzer.scala | 9 +- .../spark/sql/catalyst/parser/AstBuilder.scala | 79 -- .../sql/catalyst/parser/PlanParserSuite.scala | 18 +- .../test/resources/sql-tests/inputs/transform.sql | 132 + .../resources/sql-tests/results/transform.sql.out | 316 - .../spark/sql/execution/SparkSqlParserSuite.scala | 164 +-- .../sql/execution/command/DDLParserSuite.scala | 14 +- 8 files changed, 662 insertions(+), 76 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (26f312e -> caf33be)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 26f312e [SPARK-35037][SQL] Recognize sign before the interval string in literals add caf33be [SPARK-33411][SQL] Cardinality estimation of union, sort and range operator No new revisions were added by this update. Summary of changes: .../plans/logical/LogicalPlanVisitor.scala | 3 + .../plans/logical/basicLogicalOperators.scala | 22 ++- .../statsEstimation/BasicStatsPlanVisitor.scala| 12 +- .../SizeInBytesOnlyStatsPlanVisitor.scala | 2 + .../logical/statsEstimation/UnionEstimation.scala | 120 + .../BasicStatsEstimationSuite.scala| 136 +-- .../statsEstimation/UnionEstimationSuite.scala | 194 + .../spark/sql/StatisticsCollectionSuite.scala | 4 +- 8 files changed, 473 insertions(+), 20 deletions(-) create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/UnionEstimation.scala create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/UnionEstimationSuite.scala - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (12abfe7 -> 074f770)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 12abfe7 [SPARK-34716][SQL] Support ANSI SQL intervals by the aggregate function `sum` add 074f770 [SPARK-35115][SQL][TESTS] Check ANSI intervals in `MutableProjectionSuite` No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/expressions/MutableProjectionSuite.scala | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (978cd0b -> fd08c93)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 978cd0b [SPARK-35092][UI] the auto-generated rdd's name in the storage tab should be truncated if it is too long add fd08c93 [SPARK-35109][SQL] Fix minor exception messages of HashedRelation and HashJoin No new revisions were added by this update. Summary of changes: .../apache/spark/sql/errors/QueryExecutionErrors.scala | 18 ++ .../spark/sql/execution/joins/HashedRelation.scala | 6 ++ 2 files changed, 8 insertions(+), 16 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated (034ba76 -> 5f48abe)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from 034ba76 [SPARK-35080][SQL] Only allow a subset of correlated equality predicates when a subquery is aggregated add 5f48abe [SPARK-34639][SQL][3.1] RelationalGroupedDataset.alias should not create UnresolvedAlias No new revisions were added by this update. Summary of changes: .../main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala | 6 +- sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala| 3 +++ 2 files changed, 4 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (9af338c -> e503b9c)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9af338c [SPARK-35078][SQL] Add tree traversal pruning in expression rules add e503b9c [SPARK-35201][SQL] Format empty grouping set exception in CUBE/ROLLUP No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala | 6 ++ .../main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala | 3 +++ 2 files changed, 5 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 26a5e33 [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page 26a5e33 is described below commit 26a5e339a61ab06fb2949166db705f1b575addd3 Author: Angerszh AuthorDate: Wed Apr 28 16:47:02 2021 +0900 [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page ### What changes were proposed in this pull request? Add doc about `TRANSFORM` and related function. ### Why are the changes needed? ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Not need Closes #32257 from AngersZh/SPARK-33976-followup. Authored-by: Angerszh Signed-off-by: Takeshi Yamamuro --- docs/sql-ref-syntax-qry-select.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-qry-select.md b/docs/sql-ref-syntax-qry-select.md index 62a7f5f..500eda1 100644 --- a/docs/sql-ref-syntax-qry-select.md +++ b/docs/sql-ref-syntax-qry-select.md @@ -41,7 +41,7 @@ select_statement [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select_stat While `select_statement` is defined as ```sql -SELECT [ hints , ... ] [ ALL | DISTINCT ] { [[ named_expression | regex_column_names ] [ , ... ] | TRANSFORM (...)) ] } +SELECT [ hints , ... ] [ ALL | DISTINCT ] { [ [ named_expression | regex_column_names ] [ , ... ] | TRANSFORM (...) ] } FROM { from_item [ , ... ] } [ PIVOT clause ] [ LATERAL VIEW clause ] [ ... ] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated (e58055b -> 361e684)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from e58055b [SPARK-35244][SQL] Invoke should throw the original exception add 361e684 [SPARK-33976][SQL][DOCS][3.1] Add a SQL doc page for a TRANSFORM clause No new revisions were added by this update. Summary of changes: docs/_data/menu-sql.yaml| 2 + docs/sql-ref-syntax-qry-select-transform.md | 235 docs/sql-ref-syntax-qry-select.md | 7 +- docs/sql-ref-syntax-qry.md | 1 + docs/sql-ref-syntax.md | 1 + 5 files changed, 245 insertions(+), 1 deletion(-) create mode 100644 docs/sql-ref-syntax-qry-select-transform.md - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (6e83789b -> a556bc8)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 6e83789b [SPARK-35244][SQL] Invoke should throw the original exception add a556bc8 [SPARK-33976][SQL][DOCS][3.0] Add a SQL doc page for a TRANSFORM clause No new revisions were added by this update. Summary of changes: docs/_data/menu-sql.yaml| 2 + docs/sql-ref-syntax-qry-select-transform.md | 235 docs/sql-ref-syntax-qry-select.md | 7 +- docs/sql-ref-syntax-qry.md | 1 + docs/sql-ref-syntax.md | 1 + 5 files changed, 245 insertions(+), 1 deletion(-) create mode 100644 docs/sql-ref-syntax-qry-select-transform.md - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (26a5e33 -> 8b62c29)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 26a5e33 [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page add 8b62c29 [SPARK-35214][SQL] OptimizeSkewedJoin support ShuffledHashJoinExec No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/internal/SQLConf.scala| 4 +- .../execution/adaptive/OptimizeSkewedJoin.scala| 189 - .../execution/exchange/EnsureRequirements.scala| 9 +- .../sql/execution/joins/ShuffledHashJoinExec.scala | 3 +- .../spark/sql/execution/joins/ShuffledJoin.scala | 18 +- .../sql/execution/joins/SortMergeJoinExec.scala| 17 -- .../adaptive/AdaptiveQueryExecSuite.scala | 130 +++--- 7 files changed, 204 insertions(+), 166 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated (361e684 -> db8204e)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from 361e684 [SPARK-33976][SQL][DOCS][3.1] Add a SQL doc page for a TRANSFORM clause add db8204e [SPARK-35159][SQL][DOCS][3.1] Extract hive format doc No new revisions were added by this update. Summary of changes: docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 52 +-- docs/sql-ref-syntax-hive-format.md | 73 ++ docs/sql-ref-syntax-qry-select-transform.md| 48 +- 3 files changed, 77 insertions(+), 96 deletions(-) create mode 100644 docs/sql-ref-syntax-hive-format.md - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (a556bc8 -> c6659e6)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from a556bc8 [SPARK-33976][SQL][DOCS][3.0] Add a SQL doc page for a TRANSFORM clause add c6659e6 [SPARK-35159][SQL][DOCS][3.0] Extract hive format doc No new revisions were added by this update. Summary of changes: docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 52 +-- docs/sql-ref-syntax-hive-format.md | 73 ++ docs/sql-ref-syntax-qry-select-transform.md| 48 +- 3 files changed, 77 insertions(+), 96 deletions(-) create mode 100644 docs/sql-ref-syntax-hive-format.md - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (86d3bb5 -> 403e479)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 86d3bb5 [SPARK-34981][SQL] Implement V2 function resolution and evaluation add 403e479 [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/catalyst/expressions/objects/objects.scala| 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (caa46ce -> cd689c9)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from caa46ce [SPARK-35112][SQL] Support Cast string to day-second interval add cd689c9 [SPARK-35192][SQL][TESTS] Port minimal TPC-DS datagen code from databricks/spark-sql-perf No new revisions were added by this update. Summary of changes: .github/workflows/build_and_test.yml | 31 +- .../scala/org/apache/spark/sql/GenTPCDSData.scala | 445 + .../scala/org/apache/spark/sql/TPCDSBase.scala | 537 + .../sql/{TPCDSBase.scala => TPCDSSchema.scala} | 92 +--- 4 files changed, 466 insertions(+), 639 deletions(-) create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/GenTPCDSData.scala copy sql/core/src/test/scala/org/apache/spark/sql/{TPCDSBase.scala => TPCDSSchema.scala} (83%) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (7fd3f8f -> f550e03)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 7fd3f8f [SPARK-35294][SQL] Add tree traversal pruning in rules with dedicated files under optimizer add f550e03 [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions No new revisions were added by this update. Summary of changes: .../expressions/higherOrderFunctions.scala | 12 ++- .../scala/org/apache/spark/sql/functions.scala | 12 +-- .../apache/spark/sql/DataFrameFunctionsSuite.scala | 23 ++ 3 files changed, 40 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 6df4ec0 [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions 6df4ec0 is described below commit 6df4ec09a17077c2a0b114a7bf5736711ba268e4 Author: dsolow AuthorDate: Wed May 5 12:46:13 2021 +0900 [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions ### What changes were proposed in this pull request? To fix lambda variable name issues in nested DataFrame functions, this PR modifies code to use a global counter for `LambdaVariables` names created by higher order functions. This is the rework of #31887. Closes #31887. ### Why are the changes needed? This moves away from the current hard-coded variable names which break on nested function calls. There is currently a bug where nested transforms in particular fail (the inner variable shadows the outer variable) For this query: ``` val df = Seq( (Seq(1,2,3), Seq("a", "b", "c")) ).toDF("numbers", "letters") df.select( f.flatten( f.transform( $"numbers", (number: Column) => { f.transform( $"letters", (letter: Column) => { f.struct( number.as("number"), letter.as("letter") ) } ) } ) ).as("zipped") ).show(10, false) ``` This is the current (incorrect) output: ``` ++ |zipped | ++ |[{a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}]| ++ ``` And this is the correct output after fix: ``` ++ |zipped | ++ |[{1, a}, {1, b}, {1, c}, {2, a}, {2, b}, {2, c}, {3, a}, {3, b}, {3, c}]| ++ ``` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added the new test in `DataFrameFunctionsSuite`. Closes #32424 from maropu/pr31887. 
Lead-authored-by: dsolow Co-authored-by: Takeshi Yamamuro Co-authored-by: dmsolow Signed-off-by: Takeshi Yamamuro (cherry picked from commit f550e03b96638de93381734c4eada2ace02d9a4f) Signed-off-by: Takeshi Yamamuro --- .../expressions/higherOrderFunctions.scala | 12 ++- .../scala/org/apache/spark/sql/functions.scala | 12 +-- .../apache/spark/sql/DataFrameFunctionsSuite.scala | 23 ++ 3 files changed, 40 insertions(+), 7 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala index ba447ea..a4e069d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala @@ -18,7 +18,7 @@ package org.apache.spark.sql.catalyst.expressions import java.util.Comparator -import java.util.concurrent.atomic.AtomicReference +import java.util.concurrent.atomic.{AtomicInteger, AtomicReference} import scala.collection.mutable @@ -52,6 +52,16 @@ case class UnresolvedNamedLambdaVariable(nameParts: Seq[String]) override def sql: String = name } +object UnresolvedNamedLambdaVariable { + + // Counter to ensure lambda variable names are unique + private val nextVarNameId = new AtomicInteger(0) + + def freshVarName(name: String): String = { +s"${name}_${nextVarNameId.getAndIncrement()}" + } +} + /** * A named lambda variable. */ diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala index e6b41cd..6bc49b6 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala @@ -3644,22 +3644,22 @@ object functions { } private def createLambda(f: Column
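The core idea of the patch above — a process-wide atomic counter that suffixes every generated lambda variable name so an inner higher-order function can never shadow an outer one — can be sketched outside Spark in plain Java (class and method names here mirror the patch but are illustrative, not Spark's public API):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the fix: instead of hard-coded lambda variable names
// (which collide when higher-order functions nest), every fresh
// name gets a unique numeric suffix from a global atomic counter.
public class FreshNames {
    private static final AtomicInteger nextVarNameId = new AtomicInteger(0);

    static String freshVarName(String name) {
        return name + "_" + nextVarNameId.getAndIncrement();
    }

    public static void main(String[] args) {
        String outer = freshVarName("x"); // e.g. "x_0"
        String inner = freshVarName("x"); // e.g. "x_1", never equal to outer
        System.out.println(outer + " != " + inner);
    }
}
```

Because the counter is shared and atomic, two nested `transform` calls each get a distinct variable name, which is exactly what restores the correct `{1, a} ... {3, c}` output shown in the commit message.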
[spark] branch branch-3.0 updated: [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 8ef4023 [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions 8ef4023 is described below commit 8ef4023683dee537a40d376d93c329a802a929bd Author: dsolow AuthorDate: Wed May 5 12:46:13 2021 +0900 [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions ### What changes were proposed in this pull request? To fix lambda variable name issues in nested DataFrame functions, this PR modifies code to use a global counter for `LambdaVariables` names created by higher order functions. This is the rework of #31887. Closes #31887. ### Why are the changes needed? This moves away from the current hard-coded variable names which break on nested function calls. There is currently a bug where nested transforms in particular fail (the inner variable shadows the outer variable) For this query: ``` val df = Seq( (Seq(1,2,3), Seq("a", "b", "c")) ).toDF("numbers", "letters") df.select( f.flatten( f.transform( $"numbers", (number: Column) => { f.transform( $"letters", (letter: Column) => { f.struct( number.as("number"), letter.as("letter") ) } ) } ) ).as("zipped") ).show(10, false) ``` This is the current (incorrect) output: ``` ++ |zipped | ++ |[{a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}]| ++ ``` And this is the correct output after fix: ``` ++ |zipped | ++ |[{1, a}, {1, b}, {1, c}, {2, a}, {2, b}, {2, c}, {3, a}, {3, b}, {3, c}]| ++ ``` ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Added the new test in `DataFrameFunctionsSuite`. Closes #32424 from maropu/pr31887. 
Lead-authored-by: dsolow Co-authored-by: Takeshi Yamamuro Co-authored-by: dmsolow Signed-off-by: Takeshi Yamamuro (cherry picked from commit f550e03b96638de93381734c4eada2ace02d9a4f) Signed-off-by: Takeshi Yamamuro --- .../expressions/higherOrderFunctions.scala | 12 ++- .../scala/org/apache/spark/sql/functions.scala | 12 +-- .../apache/spark/sql/DataFrameFunctionsSuite.scala | 23 ++ 3 files changed, 40 insertions(+), 7 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala index e5cf8c0..a530ce5 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala @@ -18,7 +18,7 @@ package org.apache.spark.sql.catalyst.expressions import java.util.Comparator -import java.util.concurrent.atomic.AtomicReference +import java.util.concurrent.atomic.{AtomicInteger, AtomicReference} import scala.collection.mutable @@ -52,6 +52,16 @@ case class UnresolvedNamedLambdaVariable(nameParts: Seq[String]) override def sql: String = name } +object UnresolvedNamedLambdaVariable { + + // Counter to ensure lambda variable names are unique + private val nextVarNameId = new AtomicInteger(0) + + def freshVarName(name: String): String = { +s"${name}_${nextVarNameId.getAndIncrement()}" + } +} + /** * A named lambda variable. */ diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala index bb77c7e..f6d6200 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala @@ -3489,22 +3489,22 @@ object functions { } private def createLambda(f: Column
[spark] 01/09: Update docs to reflect alternative key value notation
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit c685abe33681fcbf0bfa6aa86ba229f19e4d451f Author: Niklas Riekenbrauck AuthorDate: Fri Mar 19 14:53:53 2021 +0100 Update docs to reflect alternative key value notation --- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index ba0516a..82d3a09 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS ( key1=val1, key2=val2, ... ) ] +[ OPTIONS [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ PARTITIONED BY ( col_name1, col_name2, ... ) ] [ CLUSTERED BY ( col_name3, col_name4, ... ) [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] INTO num_buckets BUCKETS ] [ LOCATION path ] [ COMMENT table_comment ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ AS select_statement ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
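Per the doc change above, `OPTIONS` and `TBLPROPERTIES` accept key-value pairs either with `=` or with whitespace as the separator. A short illustration (hypothetical table and property names):

```sql
-- The two pair notations are interchangeable under the documented grammar:
-- space-separated in OPTIONS, '='-separated in TBLPROPERTIES.
CREATE TABLE student (id INT, name STRING)
  USING CSV
  OPTIONS (header 'true', inferSchema 'true')
  TBLPROPERTIES (created.by.user = 'niklas');
```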
[spark] 04/09: Update to eaasier KV syntax
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 1d157ed7209b355294b8e07e672ba8b5916e93f5 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:17:26 2021 +0100 Update to eaasier KV syntax --- docs/sql-ref-syntax-ddl-alter-database.md | 2 +- docs/sql-ref-syntax-ddl-alter-table.md | 8 docs/sql-ref-syntax-ddl-alter-view.md | 2 +- docs/sql-ref-syntax-ddl-create-database.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 7 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-database.md b/docs/sql-ref-syntax-ddl-alter-database.md index fbc454e..2de9675 100644 --- a/docs/sql-ref-syntax-ddl-alter-database.md +++ b/docs/sql-ref-syntax-ddl-alter-database.md @@ -31,7 +31,7 @@ for a database and may be used for auditing purposes. ```sql ALTER { DATABASE | SCHEMA } database_name -SET DBPROPERTIES ( property_name = property_value [ , ... ] ) +SET DBPROPERTIES ( ( property_name [=] property_value [ , ... ] | ( property_name property_value [ , ... ] ) ``` ### Parameters diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 2d42eb4..912de0f 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -169,7 +169,7 @@ this overrides the old value with the new one. ```sql -- Set Table Properties -ALTER TABLE table_identifier SET TBLPROPERTIES ( key1 = val1, key2 = val2, ... ) +ALTER TABLE table_identifier SET TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... ) ) -- Unset Table Properties ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ) @@ -184,10 +184,10 @@ ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... 
```sql -- Set SERDE Properties ALTER TABLE table_identifier [ partition_spec ] -SET SERDEPROPERTIES ( key1 = val1, key2 = val2, ... ) +SET SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name -[ WITH SERDEPROPERTIES ( key1 = val1, key2 = val2, ... ) ] +[ WITH SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] ``` SET LOCATION And SET FILE FORMAT @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( key1 = val1, key2 = val2, ... )** +* **SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ** Specifies the SERDE properties to be set. diff --git a/docs/sql-ref-syntax-ddl-alter-view.md b/docs/sql-ref-syntax-ddl-alter-view.md index d69f246..25280c4 100644 --- a/docs/sql-ref-syntax-ddl-alter-view.md +++ b/docs/sql-ref-syntax-ddl-alter-view.md @@ -49,7 +49,7 @@ the properties. Syntax ```sql -ALTER VIEW view_identifier SET TBLPROPERTIES ( property_key = property_val [ , ... ] ) +ALTER VIEW view_identifier SET TBLPROPERTIES ( property_key [=] property_val [ , ... ] ) ``` Parameters diff --git a/docs/sql-ref-syntax-ddl-create-database.md b/docs/sql-ref-syntax-ddl-create-database.md index 9d8bf47..7db410e 100644 --- a/docs/sql-ref-syntax-ddl-create-database.md +++ b/docs/sql-ref-syntax-ddl-create-database.md @@ -29,7 +29,7 @@ Creates a database with the specified name. If database with the same name alrea CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name [ COMMENT database_comment ] [ LOCATION database_directory ] -[ WITH DBPROPERTIES ( property_name = property_value [ , ... ] ) ] +[ WITH DBPROPERTIES ( property_name [=] property_value [ , ... 
] ) ] ``` ### Parameters @@ -50,7 +50,7 @@ CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name Specifies the description for the database. -* **WITH DBPROPERTIES ( property_name=property_value [ , ... ] )** +* **WITH DBPROPERTIES ( property_name [=] property_value [ , ... ] )** Specifies the properties for the database in key-value pairs. diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index 9926bc6..7d8e692 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS ( ( key1=val1, key2=val2, ... ) | ( key1 val1,
[spark] 07/09: Remove unnecessary change
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 42cd52e297b141a8b837a8315ca4c84a5ffc3def Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:27:52 2021 +0100 Remove unnecessary change --- docs/sql-ref-syntax-ddl-alter-database.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-ddl-alter-database.md b/docs/sql-ref-syntax-ddl-alter-database.md index 2de9675..6ac6863 100644 --- a/docs/sql-ref-syntax-ddl-alter-database.md +++ b/docs/sql-ref-syntax-ddl-alter-database.md @@ -31,7 +31,7 @@ for a database and may be used for auditing purposes. ```sql ALTER { DATABASE | SCHEMA } database_name -SET DBPROPERTIES ( ( property_name [=] property_value [ , ... ] | ( property_name property_value [ , ... ] ) +SET DBPROPERTIES ( property_name [=] property_value [ , ... ] ) ``` ### Parameters - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch pr31899 created (now 0c4e71e)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git. at 0c4e71e Fix This branch includes the following new commits: new c685abe Update docs to reflect alternative key value notation new 2ff9703 Update docs other create table docs new 8245a55 Fix alternatives with subrule grammar new 1d157ed Update to eaasier KV syntax new 83ec2ee Commit missing doc updates new fff449b Some more fixes new 42cd52e Remove unnecessary change new 2ebb2aa remove space new 0c4e71e Fix The 9 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 02/09: Update docs other create table docs
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 2ff970350427835a7b7f7f9d0ec7bc8f1049f7fd Author: Niklas Riekenbrauck AuthorDate: Fri Mar 19 15:17:48 2021 +0100 Update docs other create table docs --- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index b2f5957..63880d5 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-like.md b/docs/sql-ref-syntax-ddl-create-table-like.md index cfb959c..a374296a 100644 --- a/docs/sql-ref-syntax-ddl-create-table-like.md +++ b/docs/sql-ref-syntax-ddl-create-table-like.md @@ -30,7 +30,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE source_table_identifier USING data_source [ ROW FORMAT row_format ] [ STORED AS file_format ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ LOCATION path ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 03/09: Fix alternatives with subrule grammar
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 8245a55dd1092fe9ef3fbcacb5cf07d1888ac23a Author: Niklas Riekenbrauck AuthorDate: Sat Mar 20 13:00:08 2021 +0100 Fix alternatives with subrule grammar --- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index 82d3a09..9926bc6 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ OPTIONS ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ PARTITIONED BY ( col_name1, col_name2, ... ) ] [ CLUSTERED BY ( col_name3, col_name4, ... ) [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] INTO num_buckets BUCKETS ] [ LOCATION path ] [ COMMENT table_comment ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... 
) ) ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index 63880d5..2e05e64 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-like.md b/docs/sql-ref-syntax-ddl-create-table-like.md index a374296a..772b299 100644 --- a/docs/sql-ref-syntax-ddl-create-table-like.md +++ b/docs/sql-ref-syntax-ddl-create-table-like.md @@ -30,7 +30,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE source_table_identifier USING data_source [ ROW FORMAT row_format ] [ STORED AS file_format ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ LOCATION path ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 05/09: Commit missing doc updates
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 83ec2ee71751142220464ea54ffc6e47ccc35ad4 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:19:27 2021 +0100 Commit missing doc updates --- docs/sql-ref-syntax-ddl-alter-table.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 912de0f..866b596 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -184,10 +184,10 @@ ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ```sql -- Set SERDE Properties ALTER TABLE table_identifier [ partition_spec ] -SET SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) +SET SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name -[ WITH SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] +[ WITH SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ] ``` SET LOCATION And SET FILE FORMAT @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ** +* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ** Specifies the SERDE properties to be set. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 06/09: Some more fixes
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit fff449bd54f2204d7cfc7a5fcf5c8877aa37a992 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:22:11 2021 +0100 Some more fixes --- docs/sql-ref-syntax-ddl-alter-table.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 866b596..915ccf8 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -169,7 +169,7 @@ this overrides the old value with the new one. ```sql -- Set Table Properties -ALTER TABLE table_identifier SET TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... ) ) +ALTER TABLE table_identifier SET TBLPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) -- Unset Table Properties ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ) @@ -219,7 +219,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' Specifies the partition on which the property has to be set. Note that one can use a typed literal (e.g., date'2019-01-02') in the partition spec. -**Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` +**Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` * **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ** diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index 48d089d..3231b66 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... 
) ] +[ TBLPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ] [ AS select_statement ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 08/09: remove space
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 2ebb2aac7a0c87c929e72bc7c8c080096c55a8f1 Author: Niklas Riekenbrauck AuthorDate: Tue Mar 30 13:20:32 2021 +0200 remove space --- docs/sql-ref-syntax-ddl-alter-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 915ccf8..ae40fe4 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ** +* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... )** Specifies the SERDE properties to be set. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 09/09: Fix
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 0c4e71e00129bcffe933a8cceffab7cf51cf33ce Author: Takeshi Yamamuro AuthorDate: Thu May 6 10:14:25 2021 +0900 Fix --- docs/sql-ref-syntax-hive-format.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-hive-format.md b/docs/sql-ref-syntax-hive-format.md index 8092e58..01b8d3f 100644 --- a/docs/sql-ref-syntax-hive-format.md +++ b/docs/sql-ref-syntax-hive-format.md @@ -30,7 +30,7 @@ There are two ways to define a row format in `row_format` of `CREATE TABLE` and ```sql row_format: -SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ] +SERDE serde_class [ WITH SERDEPROPERTIES (k1 [=] v1, k2 [=] v2, ... ) ] | DELIMITED [ FIELDS TERMINATED BY fields_terminated_char [ ESCAPED BY escaped_char ] ] [ COLLECTION ITEMS TERMINATED BY collection_items_terminated_char ] [ MAP KEYS TERMINATED BY map_key_terminated_char ] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
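The `row_format` rule being corrected above is shared by `CREATE TABLE` and `TRANSFORM` clauses. A sketch of a Hive-format table using the `SERDE` branch of that rule (table name and delimiter are illustrative; the serde class is Hive's standard `LazySimpleSerDe`):

```sql
CREATE TABLE events (id INT, payload STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ( 'field.delim' '\t' )  -- '=' may be omitted per the [=] notation
STORED AS TEXTFILE;
```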
[spark] 04/09: Update to eaasier KV syntax
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 1d157ed7209b355294b8e07e672ba8b5916e93f5 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:17:26 2021 +0100 Update to eaasier KV syntax --- docs/sql-ref-syntax-ddl-alter-database.md | 2 +- docs/sql-ref-syntax-ddl-alter-table.md | 8 docs/sql-ref-syntax-ddl-alter-view.md | 2 +- docs/sql-ref-syntax-ddl-create-database.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 7 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-database.md b/docs/sql-ref-syntax-ddl-alter-database.md index fbc454e..2de9675 100644 --- a/docs/sql-ref-syntax-ddl-alter-database.md +++ b/docs/sql-ref-syntax-ddl-alter-database.md @@ -31,7 +31,7 @@ for a database and may be used for auditing purposes. ```sql ALTER { DATABASE | SCHEMA } database_name -SET DBPROPERTIES ( property_name = property_value [ , ... ] ) +SET DBPROPERTIES ( ( property_name [=] property_value [ , ... ] | ( property_name property_value [ , ... ] ) ``` ### Parameters diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 2d42eb4..912de0f 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -169,7 +169,7 @@ this overrides the old value with the new one. ```sql -- Set Table Properties -ALTER TABLE table_identifier SET TBLPROPERTIES ( key1 = val1, key2 = val2, ... ) +ALTER TABLE table_identifier SET TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... ) ) -- Unset Table Properties ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ) @@ -184,10 +184,10 @@ ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... 
```sql -- Set SERDE Properties ALTER TABLE table_identifier [ partition_spec ] -SET SERDEPROPERTIES ( key1 = val1, key2 = val2, ... ) +SET SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name -[ WITH SERDEPROPERTIES ( key1 = val1, key2 = val2, ... ) ] +[ WITH SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] ``` SET LOCATION And SET FILE FORMAT @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( key1 = val1, key2 = val2, ... )** +* **SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ** Specifies the SERDE properties to be set. diff --git a/docs/sql-ref-syntax-ddl-alter-view.md b/docs/sql-ref-syntax-ddl-alter-view.md index d69f246..25280c4 100644 --- a/docs/sql-ref-syntax-ddl-alter-view.md +++ b/docs/sql-ref-syntax-ddl-alter-view.md @@ -49,7 +49,7 @@ the properties. Syntax ```sql -ALTER VIEW view_identifier SET TBLPROPERTIES ( property_key = property_val [ , ... ] ) +ALTER VIEW view_identifier SET TBLPROPERTIES ( property_key [=] property_val [ , ... ] ) ``` Parameters diff --git a/docs/sql-ref-syntax-ddl-create-database.md b/docs/sql-ref-syntax-ddl-create-database.md index 9d8bf47..7db410e 100644 --- a/docs/sql-ref-syntax-ddl-create-database.md +++ b/docs/sql-ref-syntax-ddl-create-database.md @@ -29,7 +29,7 @@ Creates a database with the specified name. If database with the same name alrea CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name [ COMMENT database_comment ] [ LOCATION database_directory ] -[ WITH DBPROPERTIES ( property_name = property_value [ , ... ] ) ] +[ WITH DBPROPERTIES ( property_name [=] property_value [ , ... 
] ) ] ``` ### Parameters @@ -50,7 +50,7 @@ CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name Specifies the description for the database. -* **WITH DBPROPERTIES ( property_name=property_value [ , ... ] )** +* **WITH DBPROPERTIES ( property_name [=] property_value [ , ... ] )** Specifies the properties for the database in key-value pairs. diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index 9926bc6..7d8e692 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS ( ( key1=val1, key2=val2, ... ) | ( key1 val1,
[spark] 01/09: Update docs to reflect alternative key value notation
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit c685abe33681fcbf0bfa6aa86ba229f19e4d451f Author: Niklas Riekenbrauck AuthorDate: Fri Mar 19 14:53:53 2021 +0100 Update docs to reflect alternative key value notation --- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index ba0516a..82d3a09 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS ( key1=val1, key2=val2, ... ) ] +[ OPTIONS [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ PARTITIONED BY ( col_name1, col_name2, ... ) ] [ CLUSTERED BY ( col_name3, col_name4, ... ) [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] INTO num_buckets BUCKETS ] [ LOCATION path ] [ COMMENT table_comment ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ AS select_statement ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
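The alternative key-value notation documented in the diff above applies to both `OPTIONS` and `TBLPROPERTIES` of data source tables. A sketch with hypothetical names, mixing the two accepted spellings:

```sql
CREATE TABLE IF NOT EXISTS clicks (ts TIMESTAMP, url STRING)
USING parquet
OPTIONS ( 'compression' 'snappy' )               -- space-separated key/value pair
TBLPROPERTIES ( 'created.by' = 'docs-example' ); -- conventional '=' form
```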
[spark] 03/09: Fix alternatives with subrule grammar
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 8245a55dd1092fe9ef3fbcacb5cf07d1888ac23a Author: Niklas Riekenbrauck AuthorDate: Sat Mar 20 13:00:08 2021 +0100 Fix alternatives with subrule grammar --- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index 82d3a09..9926bc6 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ OPTIONS ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ PARTITIONED BY ( col_name1, col_name2, ... ) ] [ CLUSTERED BY ( col_name3, col_name4, ... ) [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] INTO num_buckets BUCKETS ] [ LOCATION path ] [ COMMENT table_comment ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... 
) ) ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index 63880d5..2e05e64 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-like.md b/docs/sql-ref-syntax-ddl-create-table-like.md index a374296a..772b299 100644 --- a/docs/sql-ref-syntax-ddl-create-table-like.md +++ b/docs/sql-ref-syntax-ddl-create-table-like.md @@ -30,7 +30,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE source_table_identifier USING data_source [ ROW FORMAT row_format ] [ STORED AS file_format ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ LOCATION path ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 02/09: Update docs other create table docs
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 2ff970350427835a7b7f7f9d0ec7bc8f1049f7fd Author: Niklas Riekenbrauck AuthorDate: Fri Mar 19 15:17:48 2021 +0100 Update docs other create table docs --- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index b2f5957..63880d5 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-like.md b/docs/sql-ref-syntax-ddl-create-table-like.md index cfb959c..a374296a 100644 --- a/docs/sql-ref-syntax-ddl-create-table-like.md +++ b/docs/sql-ref-syntax-ddl-create-table-like.md @@ -30,7 +30,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE source_table_identifier USING data_source [ ROW FORMAT row_format ] [ STORED AS file_format ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ LOCATION path ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch pull/31899 created (now 0c4e71e)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git. at 0c4e71e Fix This branch includes the following new commits: new c685abe Update docs to reflect alternative key value notation new 2ff9703 Update docs other create table docs new 8245a55 Fix alternatives with subrule grammar new 1d157ed Update to eaasier KV syntax new 83ec2ee Commit missing doc updates new fff449b Some more fixes new 42cd52e Remove unnecessary change new 2ebb2aa remove space new 0c4e71e Fix The 9 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 05/09: Commit missing doc updates
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 83ec2ee71751142220464ea54ffc6e47ccc35ad4 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:19:27 2021 +0100 Commit missing doc updates --- docs/sql-ref-syntax-ddl-alter-table.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 912de0f..866b596 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -184,10 +184,10 @@ ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ```sql -- Set SERDE Properties ALTER TABLE table_identifier [ partition_spec ] -SET SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) +SET SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name -[ WITH SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] +[ WITH SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ] ``` SET LOCATION And SET FILE FORMAT @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ** +* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ** Specifies the SERDE properties to be set. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 07/09: Remove unnecessary change
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 42cd52e297b141a8b837a8315ca4c84a5ffc3def Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:27:52 2021 +0100 Remove unnecessary change --- docs/sql-ref-syntax-ddl-alter-database.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-ddl-alter-database.md b/docs/sql-ref-syntax-ddl-alter-database.md index 2de9675..6ac6863 100644 --- a/docs/sql-ref-syntax-ddl-alter-database.md +++ b/docs/sql-ref-syntax-ddl-alter-database.md @@ -31,7 +31,7 @@ for a database and may be used for auditing purposes. ```sql ALTER { DATABASE | SCHEMA } database_name -SET DBPROPERTIES ( ( property_name [=] property_value [ , ... ] | ( property_name property_value [ , ... ] ) +SET DBPROPERTIES ( property_name [=] property_value [ , ... ] ) ``` ### Parameters - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
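After this revert, the database page documents a single `property_name [=] property_value` list. A sketch of the resulting syntax (database name hypothetical):

```sql
ALTER DATABASE analytics SET DBPROPERTIES ( 'owner' = 'alice' );
-- SCHEMA is a synonym for DATABASE, and '=' may be omitted
ALTER SCHEMA analytics SET DBPROPERTIES ( 'owner' 'alice' );
```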
[spark] branch master updated (19661f6 -> 5c67d0c)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 19661f6 [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case add 5c67d0c [SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite No new revisions were added by this update. Summary of changes: .github/workflows/build_and_test.yml |6 +- .../resources/tpcds-query-results/v1_4/q1.sql.out | 184 +- .../resources/tpcds-query-results/v1_4/q10.sql.out | 11 +- .../resources/tpcds-query-results/v1_4/q11.sql.out |6 + .../resources/tpcds-query-results/v1_4/q12.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q13.sql.out |2 +- .../tpcds-query-results/v1_4/q14a.sql.out | 200 +- .../tpcds-query-results/v1_4/q14b.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q15.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q16.sql.out |2 +- .../resources/tpcds-query-results/v1_4/q17.sql.out |2 +- .../resources/tpcds-query-results/v1_4/q18.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q19.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q2.sql.out | 5026 +-- .../resources/tpcds-query-results/v1_4/q20.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q21.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q22.sql.out | 200 +- .../tpcds-query-results/v1_4/q23a.sql.out |2 +- .../tpcds-query-results/v1_4/q23b.sql.out |5 +- .../tpcds-query-results/v1_4/q24a.sql.out |8 +- .../tpcds-query-results/v1_4/q24b.sql.out |2 +- .../resources/tpcds-query-results/v1_4/q25.sql.out |2 +- .../resources/tpcds-query-results/v1_4/q26.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q27.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q28.sql.out |2 +- .../resources/tpcds-query-results/v1_4/q29.sql.out |3 +- .../resources/tpcds-query-results/v1_4/q3.sql.out | 172 +- .../resources/tpcds-query-results/v1_4/q30.sql.out | 200 +- 
.../resources/tpcds-query-results/v1_4/q31.sql.out | 112 +- .../resources/tpcds-query-results/v1_4/q32.sql.out |2 - .../resources/tpcds-query-results/v1_4/q33.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q34.sql.out | 434 +- .../resources/tpcds-query-results/v1_4/q35.sql.out | 188 +- .../resources/tpcds-query-results/v1_4/q36.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q37.sql.out |3 +- .../resources/tpcds-query-results/v1_4/q38.sql.out |2 +- .../tpcds-query-results/v1_4/q39a.sql.out | 449 +- .../tpcds-query-results/v1_4/q39b.sql.out | 24 +- .../resources/tpcds-query-results/v1_4/q4.sql.out | 10 +- .../resources/tpcds-query-results/v1_4/q40.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q41.sql.out |9 +- .../resources/tpcds-query-results/v1_4/q42.sql.out | 21 +- .../resources/tpcds-query-results/v1_4/q43.sql.out | 12 +- .../resources/tpcds-query-results/v1_4/q44.sql.out | 20 +- .../resources/tpcds-query-results/v1_4/q45.sql.out | 39 +- .../resources/tpcds-query-results/v1_4/q46.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q47.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q48.sql.out |2 +- .../resources/tpcds-query-results/v1_4/q49.sql.out | 64 +- .../resources/tpcds-query-results/v1_4/q5.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q50.sql.out | 12 +- .../resources/tpcds-query-results/v1_4/q51.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q52.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q53.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q54.sql.out |2 +- .../resources/tpcds-query-results/v1_4/q55.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q56.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q57.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q58.sql.out |4 +- .../resources/tpcds-query-results/v1_4/q59.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q6.sql.out | 91 +- .../resources/tpcds-query-results/v1_4/q60.sql.out | 200 +- 
.../resources/tpcds-query-results/v1_4/q61.sql.out |2 +- .../resources/tpcds-query-results/v1_4/q62.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q63.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q64.sql.out | 19 +- .../resources/tpcds-query-results/v1_4/q65.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q66.sql.out | 10 +- .../resources/tpcds-query-results/v1_4/q67.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q68.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q69.sql.out | 182 +- .../resources/tpcds-query-results/v1_4/q7.sql.out | 200 +- .../resources/tpcds-query-results/v1_4/q70.sql.out |6 +- .../resources/tpcds-query-results/v1_4
[spark] branch master updated (2634dba -> 6f0ef93)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 2634dba [SPARK-35175][BUILD] Add linter for JavaScript source files add 6f0ef93 [SPARK-35297][CORE][DOC][MINOR] Modify the comment about the executor No new revisions were added by this update. Summary of changes: core/src/main/scala/org/apache/spark/executor/Executor.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (b025780 -> 06c4009)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b025780 [SPARK-35331][SQL] Support resolving missing attrs for distribute/cluster by/repartition hint add 06c4009 [SPARK-35327][SQL][TESTS] Filters out the TPC-DS queries that can cause flaky test results No new revisions were added by this update. Summary of changes: .../resources/tpcds-query-results/v1_4/q6.sql.out | 51 -- .../resources/tpcds-query-results/v1_4/q75.sql.out | 105 - .../scala/org/apache/spark/sql/TPCDSBase.scala | 2 +- .../org/apache/spark/sql/TPCDSQueryTestSuite.scala | 6 ++ 4 files changed, 7 insertions(+), 157 deletions(-) delete mode 100644 sql/core/src/test/resources/tpcds-query-results/v1_4/q6.sql.out delete mode 100644 sql/core/src/test/resources/tpcds-query-results/v1_4/q75.sql.out - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (5b65d8a -> 620f072)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 5b65d8a [SPARK-35347][SQL] Use MethodUtils for looking up methods in Invoke and StaticInvoke add 620f072 [SPARK-35231][SQL] logical.Range override maxRowsPerPartition No new revisions were added by this update. Summary of changes: .../sql/catalyst/plans/logical/basicLogicalOperators.scala | 12 .../apache/spark/sql/catalyst/plans/LogicalPlanSuite.scala | 11 ++- 2 files changed, 22 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (620f072 -> 38eb5a6)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 620f072 [SPARK-35231][SQL] logical.Range override maxRowsPerPartition add 38eb5a6 [SPARK-35354][SQL] Replace BaseJoinExec with ShuffledJoin in CoalesceBucketsInJoin No new revisions were added by this update. Summary of changes: .../sql/execution/bucketing/CoalesceBucketsInJoin.scala | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (44bd0a8 -> c4ca232)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 44bd0a8 [SPARK-35088][SQL][FOLLOWUP] Improve the error message for Sequence expression add c4ca232 [SPARK-35363][SQL] Refactor sort merge join code-gen be agnostic to join type No new revisions were added by this update. Summary of changes: .../spark/sql/execution/joins/ShuffledJoin.scala | 2 +- .../sql/execution/joins/SortMergeJoinExec.scala| 163 +++-- 2 files changed, 84 insertions(+), 81 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (ae0579a -> 3241aeb)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from ae0579a [SPARK-35369][DOC] Document ExecutorAllocationManager metrics add 3241aeb [SPARK-35385][SQL][TESTS] Skip duplicate queries in the TPCDS-related tests No new revisions were added by this update. Summary of changes: sql/core/src/test/scala/org/apache/spark/sql/TPCDSBase.scala | 10 +- .../test/scala/org/apache/spark/sql/TPCDSQueryTestSuite.scala | 6 -- 2 files changed, 9 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (2eef2f9 -> 2390b9d)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 2eef2f9 [SPARK-35412][SQL] Fix a bug in groupBy of year-month/day-time intervals add 2390b9d [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit No new revisions were added by this update. Summary of changes: .github/workflows/build_and_test.yml | 1 + 1 file changed, 1 insertion(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new f9a396c [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit f9a396c is described below commit f9a396c37cb340671666379cb8d8a85435c7ad87 Author: Takeshi Yamamuro AuthorDate: Mon May 17 09:26:04 2021 +0900 [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit ### What changes were proposed in this pull request? This PR proposes to use the SHA of the latest commit ([2a5078a782192ddb6efbcead8de9973d6ab4f069](https://github.com/databricks/tpcds-kit/commit/2a5078a782192ddb6efbcead8de9973d6ab4f069)) when checking out `databricks/tpcds-kit`. This can prevent the test workflow from breaking accidentally if the repository changes drastically. ### Why are the changes needed? For better test workflow. ### Does this PR introduce _any_ user-facing change? No, dev-only. ### How was this patch tested? GA passed. Closes #32561 from maropu/UseRefInCheckout. Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 2390b9dbcbc0b0377d694d2c3c2c0fa78179cbd6) Signed-off-by: Takeshi Yamamuro --- .github/workflows/build_and_test.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml index 173cc0e..c8b4c77 100644 --- a/.github/workflows/build_and_test.yml +++ b/.github/workflows/build_and_test.yml @@ -481,6 +481,7 @@ jobs: uses: actions/checkout@v2 with: repository: databricks/tpcds-kit +ref: 2a5078a782192ddb6efbcead8de9973d6ab4f069 path: ./tpcds-kit - name: Build tpcds-kit if: steps.cache-tpcds-sf-1.outputs.cache-hit != 'true' - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 8ebc1d3 [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit 8ebc1d3 is described below commit 8ebc1d317f978e524d55449ecc88daa806dde009 Author: Takeshi Yamamuro AuthorDate: Mon May 17 09:26:04 2021 +0900 [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit ### What changes were proposed in this pull request? This PR proposes to use the SHA of the latest commit ([2a5078a782192ddb6efbcead8de9973d6ab4f069](https://github.com/databricks/tpcds-kit/commit/2a5078a782192ddb6efbcead8de9973d6ab4f069)) when checking out `databricks/tpcds-kit`. This can prevent the test workflow from breaking accidentally if the repository changes drastically. ### Why are the changes needed? For better test workflow. ### Does this PR introduce _any_ user-facing change? No, dev-only. ### How was this patch tested? GA passed. Closes #32561 from maropu/UseRefInCheckout. Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 2390b9dbcbc0b0377d694d2c3c2c0fa78179cbd6) Signed-off-by: Takeshi Yamamuro --- .github/workflows/build_and_test.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml index 936a256..77a2c79 100644 --- a/.github/workflows/build_and_test.yml +++ b/.github/workflows/build_and_test.yml @@ -428,6 +428,7 @@ jobs: uses: actions/checkout@v2 with: repository: databricks/tpcds-kit +ref: 2a5078a782192ddb6efbcead8de9973d6ab4f069 path: ./tpcds-kit - name: Build tpcds-kit if: steps.cache-tpcds-sf-1.outputs.cache-hit != 'true' - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
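The reason a plain branch checkout can break a workflow is that a branch like `master` is a moving target, while a full 40-character commit SHA is immutable. As a small illustrative sketch (not part of the workflow change itself), a `ref` is only "pinned" in this sense when it is a full hex SHA — branch names, tags, and abbreviated SHAs all remain mutable or ambiguous:

```python
import re

def is_pinned_ref(ref: str) -> bool:
    """Return True only for a full 40-hex-digit commit SHA.

    Branch names, tags, and abbreviated SHAs can move or be rewritten,
    so only a full SHA guarantees the checked-out tree never drifts.
    """
    return re.fullmatch(r"[0-9a-f]{40}", ref) is not None

# The ref pinned by this commit is a full SHA, so the checkout is stable:
assert is_pinned_ref("2a5078a782192ddb6efbcead8de9973d6ab4f069")
# A branch name or a short SHA would not qualify:
assert not is_pinned_ref("master")
assert not is_pinned_ref("2a5078a")
```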
[spark] branch master updated (7b942d5 -> cce0048)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 7b942d5 [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md add cce0048 [SPARK-35351][SQL] Add code-gen for left anti sort merge join No new revisions were added by this update. Summary of changes: .../sql/execution/joins/SortMergeJoinExec.scala| 97 ++ .../approved-plans-v1_4/q16.sf100/explain.txt | 4 +- .../approved-plans-v1_4/q16.sf100/simplified.txt | 5 +- .../approved-plans-v1_4/q16/explain.txt| 4 +- .../approved-plans-v1_4/q16/simplified.txt | 5 +- .../approved-plans-v1_4/q69.sf100/explain.txt | 36 +++ .../approved-plans-v1_4/q69.sf100/simplified.txt | 110 +++-- .../approved-plans-v1_4/q87.sf100/explain.txt | 8 +- .../approved-plans-v1_4/q87.sf100/simplified.txt | 10 +- .../approved-plans-v1_4/q94.sf100/explain.txt | 4 +- .../approved-plans-v1_4/q94.sf100/simplified.txt | 5 +- .../approved-plans-v1_4/q94/explain.txt| 4 +- .../approved-plans-v1_4/q94/simplified.txt | 5 +- .../sql/execution/WholeStageCodegenSuite.scala | 22 + .../sql/execution/metric/SQLMetricsSuite.scala | 4 +- 15 files changed, 208 insertions(+), 115 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
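Spark's change here emits generated Java for `SortMergeJoinExec`; independently of that code-gen, the merge scan a left anti sort-merge join performs can be sketched as follows (keys only, both inputs assumed pre-sorted; the function name is illustrative, not Spark API):

```python
def left_anti_sort_merge_join(left, right):
    """Left anti join over two key-sorted lists: keep each left key
    that has no matching key on the right, advancing both sides in
    a single forward pass (the essence of a sort-merge anti join)."""
    out, j, n = [], 0, len(right)
    for key in left:
        while j < n and right[j] < key:   # skip right keys smaller than the probe
            j += 1
        if j >= n or right[j] != key:     # no match found -> row survives
            out.append(key)
    return out

assert left_anti_sort_merge_join([1, 2, 4, 6], [2, 3, 6]) == [1, 4]
```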
[spark] branch master updated (186477c -> b1493d8)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 186477c [SPARK-35263][TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code add b1493d8 [SPARK-35398][SQL] Simplify the way to get classes from ClassBodyEvaluator in `CodeGenerator.updateAndGetCompilationStats` method No new revisions were added by this update. Summary of changes: .../sql/catalyst/expressions/codegen/CodeGenerator.scala | 14 ++ 1 file changed, 2 insertions(+), 12 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (a72d05c -> 46f7d78)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a72d05c [SPARK-35106][CORE][SQL] Avoid failing rename caused by destination directory not exist add 46f7d78 [SPARK-35368][SQL] Update histogram statistics for RANGE operator for stats estimation No new revisions were added by this update. Summary of changes: .../plans/logical/basicLogicalOperators.scala | 43 +++- .../BasicStatsEstimationSuite.scala| 81 ++ 2 files changed, 108 insertions(+), 16 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
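The summary above gives no implementation detail, but the general idea behind stats for RANGE is that `Range(start, end, step)` produces a fully determined, uniformly spaced value set, so statistics such as an equi-height histogram can be derived exactly without executing the scan. A minimal illustration of the concept (not Spark's implementation):

```python
import math

def equi_height_bins(start: int, end: int, step: int, num_bins: int):
    """Equi-height histogram for the values a Range(start, end, step)
    operator would produce (end exclusive): each bin covers roughly the
    same number of rows, which is the shape a cost-based optimizer
    wants for selectivity estimates. Returns (lo, hi, count) per bin."""
    vals = list(range(start, end, step))
    if not vals:
        return []
    per = math.ceil(len(vals) / num_bins)
    return [(chunk[0], chunk[-1], len(chunk))
            for chunk in (vals[i:i + per] for i in range(0, len(vals), per))]

assert equi_height_bins(0, 10, 1, 2) == [(0, 4, 5), (5, 9, 5)]
```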
[spark] branch master updated (9283beb -> 1214213)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9283beb [SPARK-35418][SQL] Add sentences function to functions.{scala,py} add 1214213 [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation No new revisions were added by this update. Summary of changes: .../logical/statsEstimation/FilterEstimation.scala | 2 +- .../logical/statsEstimation/UnionEstimation.scala | 97 ++ .../BasicStatsEstimationSuite.scala| 2 +- .../statsEstimation/UnionEstimationSuite.scala | 65 +-- 4 files changed, 122 insertions(+), 44 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
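Spark's logical `Union` has UNION ALL semantics, so a column's null count in the output is simply the sum of the children's null counts for that column, and it becomes unknown as soon as any child lacks the statistic. A small data-only sketch of that rule (plain lists stand in for column statistics; this is not the Scala code in `UnionEstimation`):

```python
def union_null_counts(children):
    """Column-wise null counts for a UNION ALL output.

    `children` is a list of per-child rows of null counts, one entry per
    column; None marks a missing statistic, which poisons that column."""
    cols = zip(*children)  # regroup per-child rows into per-column tuples
    return [None if any(c is None for c in col) else sum(col) for col in cols]

# Column 0 has 0 + 2 nulls; column 1 is unknown because one child has no stat:
assert union_null_counts([[0, 3], [2, None]]) == [2, None]
```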
[spark] branch master updated (d1b24d8 -> 586caae)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d1b24d8 [SPARK-35338][PYTHON] Separate arithmetic operations into data type based structures add 586caae [SPARK-35438][SQL][DOCS] Minor documentation fix for window physical operator No new revisions were added by this update. Summary of changes: .../main/scala/org/apache/spark/sql/execution/window/WindowExec.scala | 2 +- .../scala/org/apache/spark/sql/execution/window/WindowExecBase.scala| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (bdd8e1d -> e170e63)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from bdd8e1d [SPARK-28551][SQL] CTAS with LOCATION should not allow to a non-empty directory add e170e63 [SPARK-35457][BUILD] Bump ANTLR runtime version to 4.8 No new revisions were added by this update. Summary of changes: dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 2 +- dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 2 +- pom.xml | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (fdd7ca5 -> 548e37b)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from fdd7ca5 [SPARK-35498][PYTHON] Add thread target wrapper API for pyspark pin thread mode add 548e37b [SPARK-33122][SQL][FOLLOWUP] Extend RemoveRedundantAggregates optimizer rule to apply to more cases No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/optimizer/Optimizer.scala | 43 + .../optimizer/RemoveRedundantAggregates.scala | 70 ++ .../optimizer/RemoveRedundantAggregatesSuite.scala | 16 - 3 files changed, 86 insertions(+), 43 deletions(-) create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveRedundantAggregates.scala - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (a59063d -> 08e6f63)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a59063d [SPARK-35581][SQL] Support special datetime values in typed literals only add 08e6f63 [SPARK-35577][TESTS] Allow to log container output for docker integration tests No new revisions were added by this update. Summary of changes: .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (cf07036 -> 912d60b)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from cf07036 [SPARK-35593][K8S][CORE] Support shuffle data recovery on the reused PVCs add 912d60b [SPARK-35709][DOCS] Remove the reference to third party Nomad integration project No new revisions were added by this update. Summary of changes: docs/cluster-overview.md | 3 --- 1 file changed, 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (e9af457 -> c463472)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from e9af457 [SPARK-35718][SQL] Support casting of Date to timestamp without time zone type add c463472 [SPARK-35439][SQL][FOLLOWUP] ExpressionContainmentOrdering should not sort unrelated expressions No new revisions were added by this update. Summary of changes: .../expressions/EquivalentExpressions.scala| 45 -- .../SubexpressionEliminationSuite.scala| 21 ++ 2 files changed, 45 insertions(+), 21 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (864ff67 -> 9709ee5)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 864ff67 [SPARK-35429][CORE] Remove commons-httpclient from Hadoop-3.2 profile due to EOL and CVEs add 9709ee5 [SPARK-35760][SQL] Fix the max rows check for broadcast exchange No new revisions were added by this update. Summary of changes: .../execution/exchange/BroadcastExchangeExec.scala | 25 +++--- 1 file changed, 17 insertions(+), 8 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (ac228d4 -> 11e96dc)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from ac228d4 [SPARK-35691][CORE] addFile/addJar/addDirectory should put CanonicalFile add 11e96dc [SPARK-35669][SQL] Quote the pushed column name only when nested column predicate pushdown is enabled No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/sources/filters.scala | 5 ++-- .../execution/datasources/DataSourceStrategy.scala | 31 +- .../spark/sql/FileBasedDataSourceSuite.scala | 10 +++ 3 files changed, 31 insertions(+), 15 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (5c96d64 -> b08cf6e)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 5c96d64 [SPARK-35707][ML] optimize sparse GEMM by skipping bound checking add b08cf6e [SPARK-35203][SQL] Improve Repartition statistics estimation No new revisions were added by this update. Summary of changes: .../logical/statsEstimation/BasicStatsPlanVisitor.scala | 4 ++-- .../SizeInBytesOnlyStatsPlanVisitor.scala | 4 ++-- .../statsEstimation/BasicStatsEstimationSuite.scala | 17 - 3 files changed, 16 insertions(+), 9 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (37ef7bb -> f80be41)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 37ef7bb [SPARK-35840][SQL] Add `apply()` for a single field to `YearMonthIntervalType` and `DayTimeIntervalType` add f80be41 [SPARK-34565][SQL] Collapse Window nodes with Project between them No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/optimizer/Optimizer.scala | 25 --- .../catalyst/optimizer/CollapseWindowSuite.scala | 50 +- 2 files changed, 68 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
spark git commit: [MINOR][SQL] Combine the same codes in test cases
Repository: spark Updated Branches: refs/heads/master 261284842 -> 93f5592aa [MINOR][SQL] Combine the same codes in test cases ## What changes were proposed in this pull request? In DDLSuite, four test cases share the same code; extracting a helper function removes the duplication. ## How was this patch tested? existing tests. Closes #23194 from CarolinePeng/Update_temp. Authored-by: 彭灿00244106 <00244106@zte.intra> Signed-off-by: Takeshi Yamamuro Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/93f5592a Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/93f5592a Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/93f5592a Branch: refs/heads/master Commit: 93f5592aa8c1254a93524fda81cf0e418c22cb2f Parents: 2612848 Author: 彭灿00244106 <00244106@zte.intra> Authored: Tue Dec 4 22:08:16 2018 +0900 Committer: Takeshi Yamamuro Committed: Tue Dec 4 22:08:16 2018 +0900 -- .../spark/sql/execution/command/DDLSuite.scala | 40 1 file changed, 16 insertions(+), 24 deletions(-) -- http://git-wip-us.apache.org/repos/asf/spark/blob/93f5592a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala -- diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala index 9d32fb6..052a5e7 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala @@ -377,41 +377,41 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { } } - test("CTAS a managed table with the existing empty directory") { -val tableLoc = new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab1"))) + private def withEmptyDirInTablePath(dirName: String)(f : File => Unit): Unit = { +val tableLoc = + new
File(spark.sessionState.catalog.defaultTablePath(TableIdentifier(dirName))) try { tableLoc.mkdir() + f(tableLoc) +} finally { + waitForTasksToFinish() + Utils.deleteRecursively(tableLoc) +} + } + + + test("CTAS a managed table with the existing empty directory") { +withEmptyDirInTablePath("tab1") { tableLoc => withTable("tab1") { sql(s"CREATE TABLE tab1 USING ${dataSource} AS SELECT 1, 'a'") checkAnswer(spark.table("tab1"), Row(1, "a")) } -} finally { - waitForTasksToFinish() - Utils.deleteRecursively(tableLoc) } } test("create a managed table with the existing empty directory") { -val tableLoc = new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab1"))) -try { - tableLoc.mkdir() +withEmptyDirInTablePath("tab1") { tableLoc => withTable("tab1") { sql(s"CREATE TABLE tab1 (col1 int, col2 string) USING ${dataSource}") sql("INSERT INTO tab1 VALUES (1, 'a')") checkAnswer(spark.table("tab1"), Row(1, "a")) } -} finally { - waitForTasksToFinish() - Utils.deleteRecursively(tableLoc) } } test("create a managed table with the existing non-empty directory") { withTable("tab1") { - val tableLoc = new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab1"))) - try { -// create an empty hidden file -tableLoc.mkdir() + withEmptyDirInTablePath("tab1") { tableLoc => val hiddenGarbageFile = new File(tableLoc.getCanonicalPath, ".garbage") hiddenGarbageFile.createNewFile() val exMsg = "Can not create the managed table('`tab1`'). 
The associated location" @@ -439,28 +439,20 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { }.getMessage assert(ex.contains(exMsgWithDefaultDB)) } - } finally { -waitForTasksToFinish() -Utils.deleteRecursively(tableLoc) } } } test("rename a managed table with existing empty directory") { -val tableLoc = new File(spark.sessionState.catalog.defaultTablePath(TableIdentifier("tab2"))) -try { +withEmptyDirInTablePath("tab2") { tableLoc => withTable("tab1") { sql(s"CREATE TABLE tab1 USING $dataSource AS SELECT 1, 'a'") -tableLoc.mkdir() val ex
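The extracted `withEmptyDirInTablePath` helper in the diff above is a loan pattern: it creates the directory, lends it to the test body, and guarantees cleanup afterwards. The same shape in Python, as an illustrative analogue using a context manager (names are made up for the sketch):

```python
import os
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def with_empty_dir(name: str):
    """Loan-pattern fixture: create an empty directory, lend it to the
    caller's block, and always remove it again in the finally clause —
    the same cleanup guarantee the Scala helper gets from try/finally."""
    parent = tempfile.mkdtemp()
    path = os.path.join(parent, name)
    os.mkdir(path)
    try:
        yield path
    finally:
        shutil.rmtree(parent, ignore_errors=True)

with with_empty_dir("tab1") as d:
    assert os.path.isdir(d) and not os.listdir(d)  # empty dir is lent out
assert not os.path.exists(d)                       # and cleaned up after
```

The payoff is the same as in the Scala refactoring: each test states only its body, and the setup/teardown lives in one place.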
[spark] branch master updated: [SPARK-26459][SQL] replace UpdateNullabilityInAttributeReferences with FixNullability
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 6955638 [SPARK-26459][SQL] replace UpdateNullabilityInAttributeReferences with FixNullability 6955638 is described below commit 6955638eae99cbe0a890a50e0c61c17641e7269f Author: Wenchen Fan AuthorDate: Thu Jan 10 20:15:25 2019 +0900 [SPARK-26459][SQL] replace UpdateNullabilityInAttributeReferences with FixNullability ## What changes were proposed in this pull request? This is a followup of https://github.com/apache/spark/pull/18576 The newly added rule `UpdateNullabilityInAttributeReferences` does the same thing as `FixNullability`, so we only need to keep one of them. This PR removes `UpdateNullabilityInAttributeReferences` and uses `FixNullability` in its place, renaming it to `UpdateAttributeNullability`. ## How was this patch tested? existing tests Closes #23390 from cloud-fan/nullable. 
Authored-by: Wenchen Fan Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/analysis/Analyzer.scala | 38 +-- .../analysis/UpdateAttributeNullability.scala | 57 ++ .../spark/sql/catalyst/optimizer/Optimizer.scala | 18 +-- ...dateAttributeNullabilityInOptimizerSuite.scala} | 9 ++-- 4 files changed, 65 insertions(+), 57 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index 2aa0f21..a84bb76 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -197,8 +197,8 @@ class Analyzer( PullOutNondeterministic), Batch("UDF", Once, HandleNullInputsForUDF), -Batch("FixNullability", Once, - FixNullability), +Batch("UpdateNullability", Once, + UpdateAttributeNullability), Batch("Subquery", Once, UpdateOuterReferences), Batch("Cleanup", fixedPoint, @@ -1822,40 +1822,6 @@ class Analyzer( } /** - * Fixes nullability of Attributes in a resolved LogicalPlan by using the nullability of - * corresponding Attributes of its children output Attributes. This step is needed because - * users can use a resolved AttributeReference in the Dataset API and outer joins - * can change the nullability of an AttribtueReference. Without the fix, a nullable column's - * nullable field can be actually set as non-nullable, which cause illegal optimization - * (e.g., NULL propagation) and wrong answers. - * See SPARK-13484 and SPARK-13801 for the concrete queries of this case. - */ - object FixNullability extends Rule[LogicalPlan] { - -def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperatorsUp { - case p if !p.resolved => p // Skip unresolved nodes. 
- case p: LogicalPlan if p.resolved => -val childrenOutput = p.children.flatMap(c => c.output).groupBy(_.exprId).flatMap { - case (exprId, attributes) => -// If there are multiple Attributes having the same ExprId, we need to resolve -// the conflict of nullable field. We do not really expect this happen. -val nullable = attributes.exists(_.nullable) -attributes.map(attr => attr.withNullability(nullable)) -}.toSeq -// At here, we create an AttributeMap that only compare the exprId for the lookup -// operation. So, we can find the corresponding input attribute's nullability. -val attributeMap = AttributeMap[Attribute](childrenOutput.map(attr => attr -> attr)) -// For an Attribute used by the current LogicalPlan, if it is from its children, -// we fix the nullable field by using the nullability setting of the corresponding -// output Attribute from the children. -p.transformExpressions { - case attr: Attribute if attributeMap.contains(attr) => -attr.withNullability(attributeMap(attr).nullable) -} -} - } - - /** * Extracts [[WindowExpression]]s from the projectList of a [[Project]] operator and * aggregateExpressions of an [[Aggregate]] operator and creates individual [[Window]] * operators for every distinct [[WindowSpecDefinition]]. diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UpdateAttributeNullability.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/UpdateAttributeNullability.scala new file mode 100644 index 000..8655dec --- /dev/null +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analy
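The core of the rule, visible in the removed `FixNullability` code above, is: group the children's output attributes by expression id, treat an id as nullable if any attribute carrying that id is nullable, then rewrite the plan's attribute references with the corrected flag. A data-only sketch of that propagation ((id, nullable) tuples stand in for attributes; names are illustrative):

```python
def update_nullability(child_attrs, exprs):
    """Sketch of the UpdateAttributeNullability idea: derive each
    expression id's nullability from the children's outputs (nullable
    if ANY attribute with that id is nullable), then fix up references.
    Ids not produced by any child keep their existing flag."""
    nullable_by_id = {}
    for expr_id, nullable in child_attrs:
        nullable_by_id[expr_id] = nullable_by_id.get(expr_id, False) or nullable
    return [(eid, nullable_by_id.get(eid, nullable)) for eid, nullable in exprs]

# exprId 1 comes from an outer-join side, so its stale non-nullable
# reference is corrected to nullable; exprId 2 is untouched:
assert update_nullability([(1, True), (2, False)],
                          [(1, False), (2, False)]) == [(1, True), (2, False)]
```

Without this fix-up, a column that an outer join made nullable could still be marked non-nullable, enabling illegal optimizations such as NULL propagation (see SPARK-13484 and SPARK-13801 referenced above).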
svn commit: r31887 - /dev/spark/KEYS
Author: yamamuro Date: Fri Jan 11 06:45:33 2019 New Revision: 31887 Log: Update KEYS Modified: dev/spark/KEYS Modified: dev/spark/KEYS == --- dev/spark/KEYS (original) +++ dev/spark/KEYS Fri Jan 11 06:45:33 2019 @@ -829,3 +829,61 @@ aI9kX8V9gl5PZLw+LchGX5H7HKoRxZM3UbPkY5Mv ZIAzEigXrrsePyvHGf6H =6YJg -END PGP PUBLIC KEY BLOCK- + +pub rsa4096 2019-01-10 [SC] + 0E9925082727075EEE83D4B06EC5F1052DF08FF4 +uid [ultimate] Takeshi Yamamuro (CODE SIGNING KEY) +sub rsa4096 2019-01-10 [E] + +-BEGIN PGP PUBLIC KEY BLOCK- + +mQINBFw2q20BEADLW2BZbJO2YHmAmAumggCTm4aVWFRYH+NX0zqEX2bynA0GM5hR +euvLL6w5vq44S6zU+39o1s9wSDcBAqLNpPB2eDL8qqXKZa/AQTwCiitk9aDB1KZB +DzejoqtrtCK1WnCW7oB7mQIq+/txSyLgv1UgFijh2aAx0ChmMnb2WbeZAQz/5ids +ixMfZiRofZVJIjdNNe5kIBcc9uthoyLw3x16nLT3zrATtBSDAL8hAULOqXPMMf3T +xzm2cPnOnqFlKGkEWRuptnoPHJ8+Uwbb91oQmlFGolU9PvCQVdmtMWCmqvlg5SeZ +VSC+w4eUk8M2nWxPh+WrPP5eQMDVUdmWgC/ZzCoNW/AxY4T9G3h3XLpZoyoDEUmd +Xk95KiEq/fo2ZT2jF31tPsGPhlzGETnzDK1xdNtoFKqjvWxwdPmJgGBau2d30rxJ +gvrjMtvcJ8Z/L7D0hKR8r8eJB6GlfBTLARVQ/XygNS1sfR6+rv/kNFGR8932bNsf +OtxiAo1Ga3vn3Q3WK+9Ddz4HKhsoOwWYllRNE60xB2LGM7ZjvvY/I9Vx2Fqfew5z +MC1s3u1Bgu0FIepV+N0Qxs2yfavdfLSVCFZ2elXkyZ7vGAFikksgGRSLbYgx01Qx +gCx3nzYL1uol6s1z6jj039p/mEqSVMY1FiecmK3/inNMy4dLjg6s+Au+GwARAQAB +tDlUYWtlc2hpIFlhbWFtdXJvIChDT0RFIFNJR05JTkcgS0VZKSA8eWFtYW11cm9A +YXBhY2hlLm9yZz6JAk4EEwEIADgWIQQOmSUIJycHXu6D1LBuxfEFLfCP9AUCXDar +bQIbAwULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAKCRBuxfEFLfCP9OLWD/4uxh1M +3BiXxwqKoBbephYTI//iSVRmwSXQdm7fsPkZywc7K4W+jiYyf1Qe8mZ4ikNVnvcE +W7+FkLGWDFHcIXddXcruynrTeQ9YwO/RPY26qYGWfeXaIf7obVSRVT6wCg//rw/o +xglE2aBXM6kgEgcZRkIo5FeLxGK0VgQ56ANN4Aa4/Jev7/Fca+MkeXH6UlxnkMD5 +W/UgMWMEZFKJPXiLpxgmhzzq5T5ahvhRQfxtRAXz+w/SK+vo+jeZZ5/SqtDECsa4 +uG/iWjC5bNOsV97iCFx/KxNY5I4U4Q5svG6mz+IRgCMV3jpkslQfME2wXgC/k1bT +vr8ICOEzQguYgBYdXl99cMgy3ULPy1vbx4DycuKneKtkp25voy5rtU3+JBrxpwSa +TwD1gRiXFscZ5oomI3rn0jPq1dIKhrQaG0T2QwKn47spdPK0TWbec+SNo07dDaC0 +IsqgSZ1fkGk5ILTZ/AfYzdnHHeJ3IvrkVFLMMD35Rwcji8E85tMXV7GmlDejjMNk 
+QTQMQymXB+yRqIrHMAss1IY11UmQCtGSJfHwiAYW+iRBZfpB7fFvHMhwQFT4wEPW +St5JyUiRled8+1BtDUYeBjDr9UtAh/moD7xXtu8wiZjea87LUt+H/tTogsHWN/kJ +igCoSWXK5ugVy8sKI/Q+jQSgXzduChiTQQWIvrkCDQRcNqttARAA0WuzOkBGx6/S +0YV5GGwn0+Zqxhm0EV/G4cT+1IPKgiMTuTp/vRF7IDwZwh5oalG4Cl7YGygqEx/V +gHqtf0m1aFV4vndmmMaHKnYAl9/rk3Svu3BRXgu9sJPoMz3nDlRhcT3IvVPZw34E +PQg0tKhnAbvSwxpRL1jHhJgHTYmebja0UTSVr3NXAs8Z+XSEjZN//5B5m4N2UkUh +XVMzfDWaOa+EYlKmzhqIt6Q8/MNjFp7jeNOKUMBoIP0JKf3Y37M9NLolQihJ9RwE +2f0a8PN5xMVDJTcDMox+bXa0ohcYKiu6whIz82tg0hZmgtdg20lC15ZTXzJh3DRh +cklbMeLegwijHLuCBIgOtbuVknWqktx89Xdg9IG84eByDPxxuZwM9QNbfip9JHKH +Pv8M2W1wPMIIgIaRRzEu1NKUoZq14/Djn0t1hb2rjQarPOR3pqlO75TdMZJ8ZVK3 +OSUKWbLed+VI/X2I0iiH5Ag/Ajzh9qIqyKVxZI0Md7G7CWHfiVRHNzMlGP08z4sn +N6uu9vzL6GSiHU5cPtD34gPXMlWq42wXCat8GMMHZAdeCwhLVm3+wPucq8OO +S0cTmUzxdnMomUO9HST2a3aO8ulBhu4wh3Y+1gkxvJ19N+WsS6uBFBOnaWf3m1Y0 +2bKSKEtKunWfwfXHowyFwKpQF4cFClEAEQEAAYkCNgQYAQgAIBYhBA6ZJQgnJwde +7oPUsG7F8QUt8I/0BQJcNqttAhsMAAoJEG7F8QUt8I/0H54QAKtJvjP7dtCQF+pZ +oy9KgfdF0CSdpTwXbEn0VE/GcdkJxXoiDTTb9GVAm/ySpwRUcTub/jFjh3uKN1t5 +SbVUR6TfewhKZ5fsKqTbUKYXag+CRLy1n59RQPg9LcL6NwTk3+SJ4cLAnj0buVFa +nlZ0W2fC54TK2xvGcnU7S3dQdlyPuvR6ouNqzQxEuXTI0t9cXdQFpf8WLt0KknsH +kMEZpKWMnrfA5fusqiGQ+9GcjowvEc6tPiZ+bMJyJSj2kmTHnCU0krxPr/xuFfNa +YpJvIZFPwn9GKxejOcZVckKtdhXMmtFlwLnCcWuB0GRRQjd9r8R+KCJM6RlTp4yI +LBBWmPnJp0Sd/9xCdVZp1fFNZ+w72q5Z0l+6r+DuvThYhH5HdRxfmH33SzdpWEf8 +WcKCbbi9mN+2ZsJufR5LvKsNpv6DLTwCuMFlIptxSxGiYZxRYMKeZJ84AWHL7sit +ftDfwHakkfUZgprK5MBuEcjxXrsmcM25Ns+rhA80JCRmsqqreSC4M9XnKkya5hoJ +83pIuVIGxOVLhVWYkAGCqW+UVr1zBBBZYe8U3wDCFucHazqcaOHCUXAxM4rwpp/K +pqnGj9s6Uudh/FXfVN5MC0/pH/ySSACkXwCmKXAh2s8F9w199WRsNlya3Ce1Ryan +/G8Bpm/p4kbeqJtsx3t7nhPke7fG +=4noL +-END PGP PUBLIC KEY BLOCK- - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] tag v2.3.3-rc1 created (now 0e3d5fd)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to tag v2.3.3-rc1 in repository https://gitbox.apache.org/repos/asf/spark.git. at 0e3d5fd (commit) This tag includes the following new commits: new 0e3d5fd Preparing Spark release v2.3.3-rc1 The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/01: Preparing Spark release v2.3.3-rc1
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to tag v2.3.3-rc1 in repository https://gitbox.apache.org/repos/asf/spark.git commit 0e3d5fd960927dd8ff1a909aba98b85fb9350c58 Author: Takeshi Yamamuro AuthorDate: Sun Jan 13 00:25:46 2019 + Preparing Spark release v2.3.3-rc1 --- assembly/pom.xml | 2 +- common/kvstore/pom.xml| 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml| 2 +- common/network-yarn/pom.xml | 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml | 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 2 +- examples/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml| 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml | 2 +- external/kafka-0-10/pom.xml | 2 +- external/kafka-0-8-assembly/pom.xml | 2 +- external/kafka-0-8/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml| 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml | 2 +- mllib/pom.xml | 2 +- pom.xml | 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/mesos/pom.xml | 2 +- resource-managers/yarn/pom.xml| 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 40 files changed, 40 insertions(+), 40 deletions(-) diff --git a/assembly/pom.xml b/assembly/pom.xml index f8b15cc..6a8cd4f 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.3 ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index e412a47..6010b6e 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ 
-22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.3 ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index d8f9a3d..8b5d3c8 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.3 ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index a1a4f87..dd27a24 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.3 ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index e650978..aded5e7d 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.3 ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index 350e3cb..a50f612 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.3 ../../pom.xml diff --git a/common/tags/pom.xml b/common/tags/pom.xml index e7fea41..8112ca4 100644 --- a/common/tags/pom.xml +++ b/common/tags/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.3 ../../pom.xml diff --git a/common/unsafe/pom.xml b/common/unsafe/pom.xml index 601cc5d..0d5f61f 100644 --- a/common/unsafe/pom.xml +++ b/common/unsafe/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.3 ../../pom.xml diff --git a/core/pom.xml b/core/pom.xml index 2a7e644..930128d 100644 --- a/core/pom.xml +++ b/core/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.3.3-SNAPSHOT +2.3.3 ../pom.xml diff --git a/docs/_config.yml b/docs/_config.yml index 7629f5f..8e9c3b5 100644 --- a/docs/_config.yml +++ b/docs/_config.yml @@ -14,7 +14,7 @@ include: # These allow
[spark] 01/01: Preparing development version 2.3.4-SNAPSHOT
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.3 in repository https://gitbox.apache.org/repos/asf/spark.git commit e46b0edd1046329fa3e3a730d59a6a263f72cbd0 Author: Takeshi Yamamuro AuthorDate: Sun Jan 13 00:26:02 2019 + Preparing development version 2.3.4-SNAPSHOT --- R/pkg/DESCRIPTION | 2 +- assembly/pom.xml | 2 +- common/kvstore/pom.xml| 2 +- common/network-common/pom.xml | 2 +- common/network-shuffle/pom.xml| 2 +- common/network-yarn/pom.xml | 2 +- common/sketch/pom.xml | 2 +- common/tags/pom.xml | 2 +- common/unsafe/pom.xml | 2 +- core/pom.xml | 2 +- docs/_config.yml | 4 ++-- examples/pom.xml | 2 +- external/docker-integration-tests/pom.xml | 2 +- external/flume-assembly/pom.xml | 2 +- external/flume-sink/pom.xml | 2 +- external/flume/pom.xml| 2 +- external/kafka-0-10-assembly/pom.xml | 2 +- external/kafka-0-10-sql/pom.xml | 2 +- external/kafka-0-10/pom.xml | 2 +- external/kafka-0-8-assembly/pom.xml | 2 +- external/kafka-0-8/pom.xml| 2 +- external/kinesis-asl-assembly/pom.xml | 2 +- external/kinesis-asl/pom.xml | 2 +- external/spark-ganglia-lgpl/pom.xml | 2 +- graphx/pom.xml| 2 +- hadoop-cloud/pom.xml | 2 +- launcher/pom.xml | 2 +- mllib-local/pom.xml | 2 +- mllib/pom.xml | 2 +- pom.xml | 2 +- python/pyspark/version.py | 2 +- repl/pom.xml | 2 +- resource-managers/kubernetes/core/pom.xml | 2 +- resource-managers/mesos/pom.xml | 2 +- resource-managers/yarn/pom.xml| 2 +- sql/catalyst/pom.xml | 2 +- sql/core/pom.xml | 2 +- sql/hive-thriftserver/pom.xml | 2 +- sql/hive/pom.xml | 2 +- streaming/pom.xml | 2 +- tools/pom.xml | 2 +- 41 files changed, 42 insertions(+), 42 deletions(-) diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION index 6ec4966..a82446e 100644 --- a/R/pkg/DESCRIPTION +++ b/R/pkg/DESCRIPTION @@ -1,6 +1,6 @@ Package: SparkR Type: Package -Version: 2.3.3 +Version: 2.3.4 Title: R Frontend for Apache Spark Description: Provides an R Frontend for Apache Spark. 
Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"), diff --git a/assembly/pom.xml b/assembly/pom.xml index 6a8cd4f..612a1b8 100644 --- a/assembly/pom.xml +++ b/assembly/pom.xml @@ -21,7 +21,7 @@ org.apache.spark spark-parent_2.11 -2.3.3 +2.3.4-SNAPSHOT ../pom.xml diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml index 6010b6e..5547e97 100644 --- a/common/kvstore/pom.xml +++ b/common/kvstore/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3 +2.3.4-SNAPSHOT ../../pom.xml diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml index 8b5d3c8..119dde2 100644 --- a/common/network-common/pom.xml +++ b/common/network-common/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3 +2.3.4-SNAPSHOT ../../pom.xml diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml index dd27a24..dba5224 100644 --- a/common/network-shuffle/pom.xml +++ b/common/network-shuffle/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3 +2.3.4-SNAPSHOT ../../pom.xml diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml index aded5e7d..56902a3 100644 --- a/common/network-yarn/pom.xml +++ b/common/network-yarn/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3 +2.3.4-SNAPSHOT ../../pom.xml diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml index a50f612..5302d95 100644 --- a/common/sketch/pom.xml +++ b/common/sketch/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3 +2.3.4-SNAPSHOT ../../pom.xml diff --git a/common/tags/pom.xml b/common/tags/pom.xml index 8112ca4..232ebfa 100644 --- a/common/tags/pom.xml +++ b/common/tags/pom.xml @@ -22,7 +22,7 @@ org.apache.spark spark-parent_2.11 -2.3.3 +2.3.4-SNAPSHOT ../../pom.xml diff --git a/common/unsafe/pom.xml b/common/unsafe/pom.xml index 0d5f61f..f0baa2a 100644 --- a/common/unsafe/pom.xml +++ b/common/unsafe/pom.xml @@ -22,7 +22,7 @@ org.apache
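The release commit above mechanically bumps every module's Maven version from 2.3.3 to 2.3.4-SNAPSHOT. A minimal shell sketch of that kind of bump, under stated assumptions: the directory layout and pom contents below are fabricated for the demo, and Spark's actual release tooling is more involved than a bare text substitution.

```shell
# Illustrative only: rewrite the module version in every pom.xml, as the
# commit above does. Runs in a throwaway directory with a fabricated pom.
set -e
dir=$(mktemp -d)
cd "$dir"
mkdir -p core
printf '<version>2.3.3</version>\n' > core/pom.xml
# -i.bak keeps a backup and works with both GNU and BSD sed
find . -name pom.xml -exec \
  sed -i.bak 's|<version>2.3.3</version>|<version>2.3.4-SNAPSHOT</version>|g' {} +
grep '2.3.4-SNAPSHOT' core/pom.xml   # prints: <version>2.3.4-SNAPSHOT</version>
```

Note that a naive substitution like this cannot tell the module's own parent version apart from a dependency that happens to be pinned to the same string, which is one reason real release scripts are more careful.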
[spark] branch branch-2.3 updated (6d063ee -> e46b0ed)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-2.3 in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 6d063ee [SPARK-26538][SQL] Set default precision and scale for elements of postgres numeric array
 add  0e3d5fd Preparing Spark release v2.3.3-rc1
 new  e46b0ed Preparing development version 2.3.4-SNAPSHOT

The 1 revision listed above as "new" is entirely new to this repository and will be described in a separate email. The revisions listed as "add" were already present in the repository and have only been added to this reference.

Summary of changes:
 R/pkg/DESCRIPTION                         | 2 +-
 assembly/pom.xml                          | 2 +-
 common/kvstore/pom.xml                    | 2 +-
 common/network-common/pom.xml             | 2 +-
 common/network-shuffle/pom.xml            | 2 +-
 common/network-yarn/pom.xml               | 2 +-
 common/sketch/pom.xml                     | 2 +-
 common/tags/pom.xml                       | 2 +-
 common/unsafe/pom.xml                     | 2 +-
 core/pom.xml                              | 2 +-
 docs/_config.yml                          | 4 ++--
 examples/pom.xml                          | 2 +-
 external/docker-integration-tests/pom.xml | 2 +-
 external/flume-assembly/pom.xml           | 2 +-
 external/flume-sink/pom.xml               | 2 +-
 external/flume/pom.xml                    | 2 +-
 external/kafka-0-10-assembly/pom.xml      | 2 +-
 external/kafka-0-10-sql/pom.xml           | 2 +-
 external/kafka-0-10/pom.xml               | 2 +-
 external/kafka-0-8-assembly/pom.xml       | 2 +-
 external/kafka-0-8/pom.xml                | 2 +-
 external/kinesis-asl-assembly/pom.xml     | 2 +-
 external/kinesis-asl/pom.xml              | 2 +-
 external/spark-ganglia-lgpl/pom.xml       | 2 +-
 graphx/pom.xml                            | 2 +-
 hadoop-cloud/pom.xml                      | 2 +-
 launcher/pom.xml                          | 2 +-
 mllib-local/pom.xml                       | 2 +-
 mllib/pom.xml                             | 2 +-
 pom.xml                                   | 2 +-
 python/pyspark/version.py                 | 2 +-
 repl/pom.xml                              | 2 +-
 resource-managers/kubernetes/core/pom.xml | 2 +-
 resource-managers/mesos/pom.xml           | 2 +-
 resource-managers/yarn/pom.xml            | 2 +-
 sql/catalyst/pom.xml                      | 2 +-
 sql/core/pom.xml                          | 2 +-
 sql/hive-thriftserver/pom.xml             | 2 +-
 sql/hive/pom.xml                          | 2 +-
 streaming/pom.xml                         | 2 +-
 tools/pom.xml                             | 2 +-
 41 files changed, 42 insertions(+), 42 deletions(-)

- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.3 updated: [SPARK-25572][SPARKR] test only if not cran
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.3 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.3 by this push: new d397348 [SPARK-25572][SPARKR] test only if not cran d397348 is described below commit d397348b7bec20743f738694a135e4b67947fd99 Author: Felix Cheung AuthorDate: Sat Sep 29 14:48:32 2018 -0700 [SPARK-25572][SPARKR] test only if not cran ## What changes were proposed in this pull request? CRAN doesn't seem to respect the system requirements as running tests - we have seen cases where SparkR is run on Java 10, which unfortunately Spark does not start on. For 2.4, lets attempt skipping all tests ## How was this patch tested? manual, jenkins, appveyor Author: Felix Cheung Closes #22589 from felixcheung/ralltests. (cherry picked from commit f4b138082ff91be74b0f5bbe19cdb90dd9e5f131) Signed-off-by: Takeshi Yamamuro --- R/pkg/tests/run-all.R | 83 +++ 1 file changed, 44 insertions(+), 39 deletions(-) diff --git a/R/pkg/tests/run-all.R b/R/pkg/tests/run-all.R index 94d7518..1e96418 100644 --- a/R/pkg/tests/run-all.R +++ b/R/pkg/tests/run-all.R @@ -18,50 +18,55 @@ library(testthat) library(SparkR) -# Turn all warnings into errors -options("warn" = 2) +# SPARK-25572 +if (identical(Sys.getenv("NOT_CRAN"), "true")) { -if (.Platform$OS.type == "windows") { - Sys.setenv(TZ = "GMT") -} + # Turn all warnings into errors + options("warn" = 2) -# Setup global test environment -# Install Spark first to set SPARK_HOME + if (.Platform$OS.type == "windows") { +Sys.setenv(TZ = "GMT") + } -# NOTE(shivaram): We set overwrite to handle any old tar.gz files or directories left behind on -# CRAN machines. For Jenkins we should already have SPARK_HOME set. 
-install.spark(overwrite = TRUE) + # Setup global test environment + # Install Spark first to set SPARK_HOME -sparkRDir <- file.path(Sys.getenv("SPARK_HOME"), "R") -sparkRWhitelistSQLDirs <- c("spark-warehouse", "metastore_db") -invisible(lapply(sparkRWhitelistSQLDirs, - function(x) { unlink(file.path(sparkRDir, x), recursive = TRUE, force = TRUE)})) -sparkRFilesBefore <- list.files(path = sparkRDir, all.files = TRUE) + # NOTE(shivaram): We set overwrite to handle any old tar.gz files or directories left behind on + # CRAN machines. For Jenkins we should already have SPARK_HOME set. + install.spark(overwrite = TRUE) -sparkRTestMaster <- "local[1]" -sparkRTestConfig <- list() -if (identical(Sys.getenv("NOT_CRAN"), "true")) { - sparkRTestMaster <- "" -} else { - # Disable hsperfdata on CRAN - old_java_opt <- Sys.getenv("_JAVA_OPTIONS") - Sys.setenv("_JAVA_OPTIONS" = paste("-XX:-UsePerfData", old_java_opt)) - tmpDir <- tempdir() - tmpArg <- paste0("-Djava.io.tmpdir=", tmpDir) - sparkRTestConfig <- list(spark.driver.extraJavaOptions = tmpArg, - spark.executor.extraJavaOptions = tmpArg) -} + sparkRDir <- file.path(Sys.getenv("SPARK_HOME"), "R") + sparkRWhitelistSQLDirs <- c("spark-warehouse", "metastore_db") + invisible(lapply(sparkRWhitelistSQLDirs, + function(x) { unlink(file.path(sparkRDir, x), recursive = TRUE, force = TRUE)})) + sparkRFilesBefore <- list.files(path = sparkRDir, all.files = TRUE) -test_package("SparkR") + sparkRTestMaster <- "local[1]" + sparkRTestConfig <- list() + if (identical(Sys.getenv("NOT_CRAN"), "true")) { +sparkRTestMaster <- "" + } else { +# Disable hsperfdata on CRAN +old_java_opt <- Sys.getenv("_JAVA_OPTIONS") +Sys.setenv("_JAVA_OPTIONS" = paste("-XX:-UsePerfData", old_java_opt)) +tmpDir <- tempdir() +tmpArg <- paste0("-Djava.io.tmpdir=", tmpDir) +sparkRTestConfig <- list(spark.driver.extraJavaOptions = tmpArg, + spark.executor.extraJavaOptions = tmpArg) + } -if (identical(Sys.getenv("NOT_CRAN"), "true")) { - # set random seed for 
predictable results. mostly for base's sample() in tree and classification - set.seed(42) - # for testthat 1.0.2 later, change reporter from "summary" to default_reporter() - testthat:::run_tests("SparkR", - file.path(sparkRDir, "pkg", "tests", "fulltests"), - NULL, - "summary") -} + test_package("SparkR") + + if (identical(Sys.getenv("NOT_CRAN"), "true")) { +# set random seed for predictable results. mostly for base's sample() in tree and classification +set.seed(42) +# for testthat 1.0.2 later, change reporter from "summary" to default_reporter() +testthat:::run_tests("SparkR", + file.path(sparkRDir, "pkg", "tests", "fulltests"), + NULL, + "summary") + } +}
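The diff above wraps the entire SparkR test run in a check of the NOT_CRAN environment variable, so CRAN machines (which do not set it) skip the suite. The same opt-in gating pattern, sketched in shell for illustration (the R original uses identical(Sys.getenv("NOT_CRAN"), "true")):

```shell
# Gate expensive tests on an explicit opt-in environment variable,
# mirroring the R check identical(Sys.getenv("NOT_CRAN"), "true").
should_run_tests() {
  [ "${NOT_CRAN:-}" = "true" ]
}

NOT_CRAN=true
if should_run_tests; then echo "running full test suite"; fi   # prints: running full test suite
unset NOT_CRAN
if should_run_tests; then :; else echo "skipping tests"; fi    # prints: skipping tests
```

Making the gate an explicit opt-in (rather than opting out on CRAN) means any environment that forgets to set the variable fails safe by skipping, which is the behavior the commit wants on CRAN's Java-10 machines.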
[spark] tag v2.3.3-rc1 deleted (was 0e3d5fd)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to tag v2.3.3-rc1 in repository https://gitbox.apache.org/repos/asf/spark.git. *** WARNING: tag v2.3.3-rc1 was deleted! *** was 0e3d5fd Preparing Spark release v2.3.3-rc1 The revisions that were on this tag are still contained in other references; therefore, this change does not discard any commits from the repository. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
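As the notification above notes, deleting the v2.3.3-rc1 tag discards no commits, because the tagged commit remains reachable from other references. A self-contained sketch in a throwaway repository (the remote-deletion form is shown only as a comment, since there is no remote here):

```shell
# Deleting a tag removes only the reference; the tagged commit stays
# reachable through any branch that still contains it.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.email=rel@example.com -c user.name=release \
  commit -q --allow-empty -m "Preparing Spark release v2.3.3-rc1"
git tag v2.3.3-rc1
git tag -d v2.3.3-rc1        # prints: Deleted tag 'v2.3.3-rc1' (was ...)
# Against a real remote you would also delete the remote tag:
#   git push origin :refs/tags/v2.3.3-rc1
git rev-parse --verify HEAD >/dev/null && echo "commit still reachable"
```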
[spark] 01/02: [SPARK-26010][R] fix vignette eval with Java 11
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.3 in repository https://gitbox.apache.org/repos/asf/spark.git commit 20b749021bacaa2906775944e43597ccf37af62b Author: Felix Cheung AuthorDate: Mon Nov 12 19:03:30 2018 -0800 [SPARK-26010][R] fix vignette eval with Java 11 ## What changes were proposed in this pull request? changes in vignette only to disable eval ## How was this patch tested? Jenkins Author: Felix Cheung Closes #23007 from felixcheung/rjavavervig. (cherry picked from commit 88c82627267a9731b2438f0cc28dd656eb3dc834) Signed-off-by: Felix Cheung --- R/pkg/vignettes/sparkr-vignettes.Rmd | 14 ++ 1 file changed, 14 insertions(+) diff --git a/R/pkg/vignettes/sparkr-vignettes.Rmd b/R/pkg/vignettes/sparkr-vignettes.Rmd index d4713de..70970bd 100644 --- a/R/pkg/vignettes/sparkr-vignettes.Rmd +++ b/R/pkg/vignettes/sparkr-vignettes.Rmd @@ -57,6 +57,20 @@ First, let's load and attach the package. library(SparkR) ``` +```{r, include=FALSE} +# disable eval if java version not supported +override_eval <- tryCatch(!is.numeric(SparkR:::checkJavaVersion()), + error = function(e) { TRUE }, + warning = function(e) { TRUE }) + +if (override_eval) { + opts_hooks$set(eval = function(options) { +options$eval = FALSE +options + }) +} +``` + `SparkSession` is the entry point into SparkR which connects your R program to a Spark cluster. You can create a `SparkSession` using `sparkR.session` and pass in options such as the application name, any Spark packages depended on, etc. We use default settings in which it runs in local mode. It auto downloads Spark package in the background if no previous installation is found. For more details about setup, see [Spark Session](#SetupSparkSession). - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.3 updated (d397348 -> 01511e4)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-2.3 in repository https://gitbox.apache.org/repos/asf/spark.git.

 discard d397348 [SPARK-25572][SPARKR] test only if not cran
 discard a9a1bc7 [SPARK-26010][R] fix vignette eval with Java 11
 discard e46b0ed Preparing development version 2.3.4-SNAPSHOT
 discard 0e3d5fd Preparing Spark release v2.3.3-rc1
 new     20b7490 [SPARK-26010][R] fix vignette eval with Java 11
 new     01511e4 [SPARK-25572][SPARKR] test only if not cran

This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this:

 * -- * -- B -- O -- O -- O (d397348)
            \
             N -- N -- N refs/heads/branch-2.3 (01511e4)

You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
Summary of changes:
 R/pkg/DESCRIPTION                         | 2 +-
 assembly/pom.xml                          | 2 +-
 common/kvstore/pom.xml                    | 2 +-
 common/network-common/pom.xml             | 2 +-
 common/network-shuffle/pom.xml            | 2 +-
 common/network-yarn/pom.xml               | 2 +-
 common/sketch/pom.xml                     | 2 +-
 common/tags/pom.xml                       | 2 +-
 common/unsafe/pom.xml                     | 2 +-
 core/pom.xml                              | 2 +-
 docs/_config.yml                          | 4 ++--
 examples/pom.xml                          | 2 +-
 external/docker-integration-tests/pom.xml | 2 +-
 external/flume-assembly/pom.xml           | 2 +-
 external/flume-sink/pom.xml               | 2 +-
 external/flume/pom.xml                    | 2 +-
 external/kafka-0-10-assembly/pom.xml      | 2 +-
 external/kafka-0-10-sql/pom.xml           | 2 +-
 external/kafka-0-10/pom.xml               | 2 +-
 external/kafka-0-8-assembly/pom.xml       | 2 +-
 external/kafka-0-8/pom.xml                | 2 +-
 external/kinesis-asl-assembly/pom.xml     | 2 +-
 external/kinesis-asl/pom.xml              | 2 +-
 external/spark-ganglia-lgpl/pom.xml       | 2 +-
 graphx/pom.xml                            | 2 +-
 hadoop-cloud/pom.xml                      | 2 +-
 launcher/pom.xml                          | 2 +-
 mllib-local/pom.xml                       | 2 +-
 mllib/pom.xml                             | 2 +-
 pom.xml                                   | 2 +-
 python/pyspark/version.py                 | 2 +-
 repl/pom.xml                              | 2 +-
 resource-managers/kubernetes/core/pom.xml | 2 +-
 resource-managers/mesos/pom.xml           | 2 +-
 resource-managers/yarn/pom.xml            | 2 +-
 sql/catalyst/pom.xml                      | 2 +-
 sql/core/pom.xml                          | 2 +-
 sql/hive-thriftserver/pom.xml             | 2 +-
 sql/hive/pom.xml                          | 2 +-
 streaming/pom.xml                         | 2 +-
 tools/pom.xml                             | 2 +-
 41 files changed, 42 insertions(+), 42 deletions(-)

- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
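The O-versus-N diagram in the force-push notification above describes a branch whose tip was rewound past some commits and rebuilt with new ones, while other references keep the old commits alive. That situation can be reproduced locally (illustrative sketch; the commit subjects and the keep-old branch name are placeholders):

```shell
# Recreate the B/O/N picture: build B -- O, keep a second ref on O,
# rewind the branch to B, then commit N. O is "omitted", not gone.
set -e
work=$(mktemp -d)
cd "$work"
git init -q
g() { git -c user.email=a@example.com -c user.name=demo "$@"; }
g commit -q --allow-empty -m "B: common base"
base=$(git rev-parse HEAD)
g commit -q --allow-empty -m "O: old revision"
old=$(git rev-parse HEAD)
git branch keep-old            # another reference still points at O
git reset -q --hard "$base"    # rewind this branch to B
g commit -q --allow-empty -m "N: new revision"
git merge-base --is-ancestor "$old" keep-old && echo "O still reachable"
```

This is why the notification distinguishes "omit" (another ref, like keep-old here, still reaches the commit) from "discard" (no ref reaches it any more).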
[spark] 02/02: [SPARK-25572][SPARKR] test only if not cran
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.3 in repository https://gitbox.apache.org/repos/asf/spark.git commit 01511e479013c56d70fe8ffa805ecbd66591b57e Author: Felix Cheung AuthorDate: Sat Sep 29 14:48:32 2018 -0700 [SPARK-25572][SPARKR] test only if not cran ## What changes were proposed in this pull request? CRAN doesn't seem to respect the system requirements as running tests - we have seen cases where SparkR is run on Java 10, which unfortunately Spark does not start on. For 2.4, lets attempt skipping all tests ## How was this patch tested? manual, jenkins, appveyor Author: Felix Cheung Closes #22589 from felixcheung/ralltests. (cherry picked from commit f4b138082ff91be74b0f5bbe19cdb90dd9e5f131) Signed-off-by: Takeshi Yamamuro --- R/pkg/tests/run-all.R | 83 +++ 1 file changed, 44 insertions(+), 39 deletions(-) diff --git a/R/pkg/tests/run-all.R b/R/pkg/tests/run-all.R index 94d7518..1e96418 100644 --- a/R/pkg/tests/run-all.R +++ b/R/pkg/tests/run-all.R @@ -18,50 +18,55 @@ library(testthat) library(SparkR) -# Turn all warnings into errors -options("warn" = 2) +# SPARK-25572 +if (identical(Sys.getenv("NOT_CRAN"), "true")) { -if (.Platform$OS.type == "windows") { - Sys.setenv(TZ = "GMT") -} + # Turn all warnings into errors + options("warn" = 2) -# Setup global test environment -# Install Spark first to set SPARK_HOME + if (.Platform$OS.type == "windows") { +Sys.setenv(TZ = "GMT") + } -# NOTE(shivaram): We set overwrite to handle any old tar.gz files or directories left behind on -# CRAN machines. For Jenkins we should already have SPARK_HOME set. 
-install.spark(overwrite = TRUE) + # Setup global test environment + # Install Spark first to set SPARK_HOME -sparkRDir <- file.path(Sys.getenv("SPARK_HOME"), "R") -sparkRWhitelistSQLDirs <- c("spark-warehouse", "metastore_db") -invisible(lapply(sparkRWhitelistSQLDirs, - function(x) { unlink(file.path(sparkRDir, x), recursive = TRUE, force = TRUE)})) -sparkRFilesBefore <- list.files(path = sparkRDir, all.files = TRUE) + # NOTE(shivaram): We set overwrite to handle any old tar.gz files or directories left behind on + # CRAN machines. For Jenkins we should already have SPARK_HOME set. + install.spark(overwrite = TRUE) -sparkRTestMaster <- "local[1]" -sparkRTestConfig <- list() -if (identical(Sys.getenv("NOT_CRAN"), "true")) { - sparkRTestMaster <- "" -} else { - # Disable hsperfdata on CRAN - old_java_opt <- Sys.getenv("_JAVA_OPTIONS") - Sys.setenv("_JAVA_OPTIONS" = paste("-XX:-UsePerfData", old_java_opt)) - tmpDir <- tempdir() - tmpArg <- paste0("-Djava.io.tmpdir=", tmpDir) - sparkRTestConfig <- list(spark.driver.extraJavaOptions = tmpArg, - spark.executor.extraJavaOptions = tmpArg) -} + sparkRDir <- file.path(Sys.getenv("SPARK_HOME"), "R") + sparkRWhitelistSQLDirs <- c("spark-warehouse", "metastore_db") + invisible(lapply(sparkRWhitelistSQLDirs, + function(x) { unlink(file.path(sparkRDir, x), recursive = TRUE, force = TRUE)})) + sparkRFilesBefore <- list.files(path = sparkRDir, all.files = TRUE) -test_package("SparkR") + sparkRTestMaster <- "local[1]" + sparkRTestConfig <- list() + if (identical(Sys.getenv("NOT_CRAN"), "true")) { +sparkRTestMaster <- "" + } else { +# Disable hsperfdata on CRAN +old_java_opt <- Sys.getenv("_JAVA_OPTIONS") +Sys.setenv("_JAVA_OPTIONS" = paste("-XX:-UsePerfData", old_java_opt)) +tmpDir <- tempdir() +tmpArg <- paste0("-Djava.io.tmpdir=", tmpDir) +sparkRTestConfig <- list(spark.driver.extraJavaOptions = tmpArg, + spark.executor.extraJavaOptions = tmpArg) + } -if (identical(Sys.getenv("NOT_CRAN"), "true")) { - # set random seed for 
predictable results. mostly for base's sample() in tree and classification - set.seed(42) - # for testthat 1.0.2 later, change reporter from "summary" to default_reporter() - testthat:::run_tests("SparkR", - file.path(sparkRDir, "pkg", "tests", "fulltests"), - NULL, - "summary") -} + test_package("SparkR") + + if (identical(Sys.getenv("NOT_CRAN"), "true")) { +# set random seed for predictable results. mostly for base's sample() in tree and classification +set.seed(42) +# for testthat 1.0.2 later, change reporter from "summary" to default_reporter() +testthat:::run_tests("SparkR", + file.path(sparkRDir, "pkg", "tests", "fulltests"), + NULL, + "summary") + } +}