[spark] branch 3.0.1 created (now 3fdfce3)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch 3.0.1
in repository https://gitbox.apache.org/repos/asf/spark.git.

 at 3fdfce3  Preparing Spark release v3.0.0-rc3

No new revisions were added by this update.

- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (37ef7bb -> f80be41)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 37ef7bb  [SPARK-35840][SQL] Add `apply()` for a single field to `YearMonthIntervalType` and `DayTimeIntervalType`
  add f80be41  [SPARK-34565][SQL] Collapse Window nodes with Project between them

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/optimizer/Optimizer.scala  | 25 ---
 .../catalyst/optimizer/CollapseWindowSuite.scala  | 50 +-
 2 files changed, 68 insertions(+), 7 deletions(-)
[spark] branch master updated (5c96d64 -> b08cf6e)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 5c96d64  [SPARK-35707][ML] optimize sparse GEMM by skipping bound checking
  add b08cf6e  [SPARK-35203][SQL] Improve Repartition statistics estimation

No new revisions were added by this update.

Summary of changes:
 .../logical/statsEstimation/BasicStatsPlanVisitor.scala | 4 ++--
 .../SizeInBytesOnlyStatsPlanVisitor.scala               | 4 ++--
 .../statsEstimation/BasicStatsEstimationSuite.scala     | 17 -
 3 files changed, 16 insertions(+), 9 deletions(-)
[spark] branch master updated (ac228d4 -> 11e96dc)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from ac228d4  [SPARK-35691][CORE] addFile/addJar/addDirectory should put CanonicalFile
  add 11e96dc  [SPARK-35669][SQL] Quote the pushed column name only when nested column predicate pushdown is enabled

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/sources/filters.scala     | 5 ++--
 .../execution/datasources/DataSourceStrategy.scala | 31 +-
 .../spark/sql/FileBasedDataSourceSuite.scala       | 10 +++
 3 files changed, 31 insertions(+), 15 deletions(-)
[spark] branch master updated (864ff67 -> 9709ee5)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 864ff67  [SPARK-35429][CORE] Remove commons-httpclient from Hadoop-3.2 profile due to EOL and CVEs
  add 9709ee5  [SPARK-35760][SQL] Fix the max rows check for broadcast exchange

No new revisions were added by this update.

Summary of changes:
 .../execution/exchange/BroadcastExchangeExec.scala | 25 +++---
 1 file changed, 17 insertions(+), 8 deletions(-)
[spark] branch master updated (e9af457 -> c463472)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from e9af457  [SPARK-35718][SQL] Support casting of Date to timestamp without time zone type
  add c463472  [SPARK-35439][SQL][FOLLOWUP] ExpressionContainmentOrdering should not sort unrelated expressions

No new revisions were added by this update.

Summary of changes:
 .../expressions/EquivalentExpressions.scala | 45 --
 .../SubexpressionEliminationSuite.scala     | 21 ++
 2 files changed, 45 insertions(+), 21 deletions(-)
[spark] branch master updated (cf07036 -> 912d60b)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from cf07036  [SPARK-35593][K8S][CORE] Support shuffle data recovery on the reused PVCs
  add 912d60b  [SPARK-35709][DOCS] Remove the reference to third party Nomad integration project

No new revisions were added by this update.

Summary of changes:
 docs/cluster-overview.md | 3 ---
 1 file changed, 3 deletions(-)
[spark] branch master updated (a59063d -> 08e6f63)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from a59063d  [SPARK-35581][SQL] Support special datetime values in typed literals only
  add 08e6f63  [SPARK-35577][TESTS] Allow to log container output for docker integration tests

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)
[spark] branch master updated (fdd7ca5 -> 548e37b)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from fdd7ca5  [SPARK-35498][PYTHON] Add thread target wrapper API for pyspark pin thread mode
  add 548e37b  [SPARK-33122][SQL][FOLLOWUP] Extend RemoveRedundantAggregates optimizer rule to apply to more cases

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/optimizer/Optimizer.scala   | 43 +
 .../optimizer/RemoveRedundantAggregates.scala      | 70 ++
 .../optimizer/RemoveRedundantAggregatesSuite.scala | 16 -
 3 files changed, 86 insertions(+), 43 deletions(-)
 create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/RemoveRedundantAggregates.scala
[spark] branch master updated (bdd8e1d -> e170e63)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from bdd8e1d  [SPARK-28551][SQL] CTAS with LOCATION should not allow to a non-empty directory
  add e170e63  [SPARK-35457][BUILD] Bump ANTLR runtime version to 4.8

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 2 +-
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 2 +-
 pom.xml                                 | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)
[spark] branch master updated (d1b24d8 -> 586caae)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from d1b24d8  [SPARK-35338][PYTHON] Separate arithmetic operations into data type based structures
  add 586caae  [SPARK-35438][SQL][DOCS] Minor documentation fix for window physical operator

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/sql/execution/window/WindowExec.scala | 2 +-
 .../scala/org/apache/spark/sql/execution/window/WindowExecBase.scala  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
[spark] branch master updated (9283beb -> 1214213)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 9283beb  [SPARK-35418][SQL] Add sentences function to functions.{scala,py}
  add 1214213  [SPARK-35362][SQL] Update null count in the column stats for UNION operator stats estimation

No new revisions were added by this update.

Summary of changes:
 .../logical/statsEstimation/FilterEstimation.scala | 2 +-
 .../logical/statsEstimation/UnionEstimation.scala  | 97 ++
 .../BasicStatsEstimationSuite.scala                | 2 +-
 .../statsEstimation/UnionEstimationSuite.scala     | 65 +-
 4 files changed, 122 insertions(+), 44 deletions(-)
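The idea behind the UNION null-count update can be sketched outside Spark: a UNION concatenates the rows of its children, so if every child reports a null count for the corresponding column, the combined column's null count is their sum. This is a minimal illustration with invented names, not Spark's actual `UnionEstimation` code.

```python
def union_null_count(child_null_counts):
    """Combine per-child null counts for one output column of a UNION.

    Returns None (unknown) if any child lacks the statistic; otherwise the
    counts simply add up, since the union concatenates the child rows.
    """
    if any(c is None for c in child_null_counts):
        return None
    return sum(child_null_counts)

# Three children reporting 3, 0, and 7 nulls for a column -> 10 in the union.
print(union_null_count([3, 0, 7]))
# One child has no null-count statistic, so the union's is unknown.
print(union_null_count([3, None, 7]))
```

The "unknown if any child is unknown" rule is the conservative choice: summing only the known counts would understate nulls and could mislead downstream filter estimation.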
[spark] branch master updated (a72d05c -> 46f7d78)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from a72d05c  [SPARK-35106][CORE][SQL] Avoid failing rename caused by destination directory not exist
  add 46f7d78  [SPARK-35368][SQL] Update histogram statistics for RANGE operator for stats estimation

No new revisions were added by this update.

Summary of changes:
 .../plans/logical/basicLogicalOperators.scala | 43 +++-
 .../BasicStatsEstimationSuite.scala           | 81 ++
 2 files changed, 108 insertions(+), 16 deletions(-)
[spark] branch master updated (186477c -> b1493d8)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 186477c  [SPARK-35263][TEST] Refactor ShuffleBlockFetcherIteratorSuite to reduce duplicated code
  add b1493d8  [SPARK-35398][SQL] Simplify the way to get classes from ClassBodyEvaluator in `CodeGenerator.updateAndGetCompilationStats` method

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/expressions/codegen/CodeGenerator.scala | 14 ++
 1 file changed, 2 insertions(+), 12 deletions(-)
[spark] branch master updated (7b942d5 -> cce0048)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 7b942d5  [SPARK-35425][BUILD] Pin jinja2 in `spark-rm/Dockerfile` and add as a required dependency in the release README.md
  add cce0048  [SPARK-35351][SQL] Add code-gen for left anti sort merge join

No new revisions were added by this update.

Summary of changes:
 .../sql/execution/joins/SortMergeJoinExec.scala    | 97 ++
 .../approved-plans-v1_4/q16.sf100/explain.txt      | 4 +-
 .../approved-plans-v1_4/q16.sf100/simplified.txt   | 5 +-
 .../approved-plans-v1_4/q16/explain.txt            | 4 +-
 .../approved-plans-v1_4/q16/simplified.txt         | 5 +-
 .../approved-plans-v1_4/q69.sf100/explain.txt      | 36 +++
 .../approved-plans-v1_4/q69.sf100/simplified.txt   | 110 +++--
 .../approved-plans-v1_4/q87.sf100/explain.txt      | 8 +-
 .../approved-plans-v1_4/q87.sf100/simplified.txt   | 10 +-
 .../approved-plans-v1_4/q94.sf100/explain.txt      | 4 +-
 .../approved-plans-v1_4/q94.sf100/simplified.txt   | 5 +-
 .../approved-plans-v1_4/q94/explain.txt            | 4 +-
 .../approved-plans-v1_4/q94/simplified.txt         | 5 +-
 .../sql/execution/WholeStageCodegenSuite.scala     | 22 +
 .../sql/execution/metric/SQLMetricsSuite.scala     | 4 +-
 15 files changed, 208 insertions(+), 115 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit
yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 8ebc1d3  [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit

8ebc1d3 is described below

commit 8ebc1d317f978e524d55449ecc88daa806dde009
Author: Takeshi Yamamuro
AuthorDate: Mon May 17 09:26:04 2021 +0900

    [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit

    ### What changes were proposed in this pull request?

    This PR proposes to use the SHA of the latest commit ([2a5078a782192ddb6efbcead8de9973d6ab4f069](https://github.com/databricks/tpcds-kit/commit/2a5078a782192ddb6efbcead8de9973d6ab4f069)) when checking out `databricks/tpcds-kit`. This can prevent the test workflow from breaking accidentally if the repository changes drastically.

    ### Why are the changes needed?

    For better test workflow.

    ### Does this PR introduce _any_ user-facing change?

    No, dev-only.

    ### How was this patch tested?

    GA passed.

    Closes #32561 from maropu/UseRefInCheckout.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 2390b9dbcbc0b0377d694d2c3c2c0fa78179cbd6)
    Signed-off-by: Takeshi Yamamuro
---
 .github/workflows/build_and_test.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 936a256..77a2c79 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -428,6 +428,7 @@ jobs:
         uses: actions/checkout@v2
         with:
           repository: databricks/tpcds-kit
+          ref: 2a5078a782192ddb6efbcead8de9973d6ab4f069
           path: ./tpcds-kit
       - name: Build tpcds-kit
         if: steps.cache-tpcds-sf-1.outputs.cache-hit != 'true'
[spark] branch branch-3.1 updated: [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit
yamamuro pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new f9a396c  [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit

f9a396c is described below

commit f9a396c37cb340671666379cb8d8a85435c7ad87
Author: Takeshi Yamamuro
AuthorDate: Mon May 17 09:26:04 2021 +0900

    [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit

    ### What changes were proposed in this pull request?

    This PR proposes to use the SHA of the latest commit ([2a5078a782192ddb6efbcead8de9973d6ab4f069](https://github.com/databricks/tpcds-kit/commit/2a5078a782192ddb6efbcead8de9973d6ab4f069)) when checking out `databricks/tpcds-kit`. This can prevent the test workflow from breaking accidentally if the repository changes drastically.

    ### Why are the changes needed?

    For better test workflow.

    ### Does this PR introduce _any_ user-facing change?

    No, dev-only.

    ### How was this patch tested?

    GA passed.

    Closes #32561 from maropu/UseRefInCheckout.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 2390b9dbcbc0b0377d694d2c3c2c0fa78179cbd6)
    Signed-off-by: Takeshi Yamamuro
---
 .github/workflows/build_and_test.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 173cc0e..c8b4c77 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -481,6 +481,7 @@ jobs:
         uses: actions/checkout@v2
         with:
           repository: databricks/tpcds-kit
+          ref: 2a5078a782192ddb6efbcead8de9973d6ab4f069
           path: ./tpcds-kit
       - name: Build tpcds-kit
         if: steps.cache-tpcds-sf-1.outputs.cache-hit != 'true'
[spark] branch master updated (2eef2f9 -> 2390b9d)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 2eef2f9  [SPARK-35412][SQL] Fix a bug in groupBy of year-month/day-time intervals
  add 2390b9d  [SPARK-35413][INFRA] Use the SHA of the latest commit when checking out databricks/tpcds-kit

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_and_test.yml | 1 +
 1 file changed, 1 insertion(+)
[spark] branch master updated (ae0579a -> 3241aeb)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from ae0579a  [SPARK-35369][DOC] Document ExecutorAllocationManager metrics
  add 3241aeb  [SPARK-35385][SQL][TESTS] Skip duplicate queries in the TPCDS-related tests

No new revisions were added by this update.

Summary of changes:
 sql/core/src/test/scala/org/apache/spark/sql/TPCDSBase.scala  | 10 +-
 .../test/scala/org/apache/spark/sql/TPCDSQueryTestSuite.scala | 6 --
 2 files changed, 9 insertions(+), 7 deletions(-)
[spark] branch master updated (44bd0a8 -> c4ca232)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 44bd0a8  [SPARK-35088][SQL][FOLLOWUP] Improve the error message for Sequence expression
  add c4ca232  [SPARK-35363][SQL] Refactor sort merge join code-gen be agnostic to join type

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/execution/joins/ShuffledJoin.scala | 2 +-
 .../sql/execution/joins/SortMergeJoinExec.scala  | 163 +++--
 2 files changed, 84 insertions(+), 81 deletions(-)
[spark] branch master updated (620f072 -> 38eb5a6)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 620f072  [SPARK-35231][SQL] logical.Range override maxRowsPerPartition
  add 38eb5a6  [SPARK-35354][SQL] Replace BaseJoinExec with ShuffledJoin in CoalesceBucketsInJoin

No new revisions were added by this update.

Summary of changes:
 .../sql/execution/bucketing/CoalesceBucketsInJoin.scala | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)
[spark] branch master updated (5b65d8a -> 620f072)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 5b65d8a  [SPARK-35347][SQL] Use MethodUtils for looking up methods in Invoke and StaticInvoke
  add 620f072  [SPARK-35231][SQL] logical.Range override maxRowsPerPartition

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/plans/logical/basicLogicalOperators.scala | 12 
 .../apache/spark/sql/catalyst/plans/LogicalPlanSuite.scala | 11 ++-
 2 files changed, 22 insertions(+), 1 deletion(-)
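The `maxRowsPerPartition` change rests on a simple observation: a range operator with known `start`, `end`, `step`, and slice count knows its exact row count up front, so an upper bound on rows per partition follows from a ceiling division. The sketch below uses invented names and assumes a positive `step`; it is an illustration of the idea, not Spark's `logical.Range` implementation.

```python
import math

def range_max_rows_per_partition(start, end, step, num_slices):
    # Number of values produced by a half-open range [start, end) with step > 0.
    num_rows = max(0, math.ceil((end - start) / step))
    # No partition of an even split can hold more than ceil(num_rows / num_slices).
    return math.ceil(num_rows / num_slices)

# 100 rows split across 8 slices: no partition can exceed 13 rows.
print(range_max_rows_per_partition(0, 100, 1, 8))
```

Such a bound lets the optimizer prove, for example, that a per-partition limit is already satisfied without adding extra operators.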
[spark] branch master updated (b025780 -> 06c4009)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from b025780  [SPARK-35331][SQL] Support resolving missing attrs for distribute/cluster by/repartition hint
  add 06c4009  [SPARK-35327][SQL][TESTS] Filters out the TPC-DS queries that can cause flaky test results

No new revisions were added by this update.

Summary of changes:
 .../resources/tpcds-query-results/v1_4/q6.sql.out  | 51 --
 .../resources/tpcds-query-results/v1_4/q75.sql.out | 105 -
 .../scala/org/apache/spark/sql/TPCDSBase.scala     | 2 +-
 .../org/apache/spark/sql/TPCDSQueryTestSuite.scala | 6 ++
 4 files changed, 7 insertions(+), 157 deletions(-)
 delete mode 100644 sql/core/src/test/resources/tpcds-query-results/v1_4/q6.sql.out
 delete mode 100644 sql/core/src/test/resources/tpcds-query-results/v1_4/q75.sql.out
[spark] branch master updated (2634dba -> 6f0ef93)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 2634dba  [SPARK-35175][BUILD] Add linter for JavaScript source files
  add 6f0ef93  [SPARK-35297][CORE][DOC][MINOR] Modify the comment about the executor

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/executor/Executor.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[spark] branch master updated (19661f6 -> 5c67d0c)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 19661f6  [SPARK-35325][SQL][TESTS] Add nested column ORC encryption test case
  add 5c67d0c  [SPARK-35293][SQL][TESTS] Use the newer dsdgen for TPCDSQueryTestSuite

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_and_test.yml               | 6 +-
 .../resources/tpcds-query-results/v1_4/q1.sql.out  | 184 +-
 .../resources/tpcds-query-results/v1_4/q10.sql.out | 11 +-
 .../resources/tpcds-query-results/v1_4/q11.sql.out | 6 +
 .../resources/tpcds-query-results/v1_4/q12.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q13.sql.out | 2 +-
 .../tpcds-query-results/v1_4/q14a.sql.out          | 200 +-
 .../tpcds-query-results/v1_4/q14b.sql.out          | 200 +-
 .../resources/tpcds-query-results/v1_4/q15.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q16.sql.out | 2 +-
 .../resources/tpcds-query-results/v1_4/q17.sql.out | 2 +-
 .../resources/tpcds-query-results/v1_4/q18.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q19.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q2.sql.out  | 5026 +--
 .../resources/tpcds-query-results/v1_4/q20.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q21.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q22.sql.out | 200 +-
 .../tpcds-query-results/v1_4/q23a.sql.out          | 2 +-
 .../tpcds-query-results/v1_4/q23b.sql.out          | 5 +-
 .../tpcds-query-results/v1_4/q24a.sql.out          | 8 +-
 .../tpcds-query-results/v1_4/q24b.sql.out          | 2 +-
 .../resources/tpcds-query-results/v1_4/q25.sql.out | 2 +-
 .../resources/tpcds-query-results/v1_4/q26.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q27.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q28.sql.out | 2 +-
 .../resources/tpcds-query-results/v1_4/q29.sql.out | 3 +-
 .../resources/tpcds-query-results/v1_4/q3.sql.out  | 172 +-
 .../resources/tpcds-query-results/v1_4/q30.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q31.sql.out | 112 +-
 .../resources/tpcds-query-results/v1_4/q32.sql.out | 2 -
 .../resources/tpcds-query-results/v1_4/q33.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q34.sql.out | 434 +-
 .../resources/tpcds-query-results/v1_4/q35.sql.out | 188 +-
 .../resources/tpcds-query-results/v1_4/q36.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q37.sql.out | 3 +-
 .../resources/tpcds-query-results/v1_4/q38.sql.out | 2 +-
 .../tpcds-query-results/v1_4/q39a.sql.out          | 449 +-
 .../tpcds-query-results/v1_4/q39b.sql.out          | 24 +-
 .../resources/tpcds-query-results/v1_4/q4.sql.out  | 10 +-
 .../resources/tpcds-query-results/v1_4/q40.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q41.sql.out | 9 +-
 .../resources/tpcds-query-results/v1_4/q42.sql.out | 21 +-
 .../resources/tpcds-query-results/v1_4/q43.sql.out | 12 +-
 .../resources/tpcds-query-results/v1_4/q44.sql.out | 20 +-
 .../resources/tpcds-query-results/v1_4/q45.sql.out | 39 +-
 .../resources/tpcds-query-results/v1_4/q46.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q47.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q48.sql.out | 2 +-
 .../resources/tpcds-query-results/v1_4/q49.sql.out | 64 +-
 .../resources/tpcds-query-results/v1_4/q5.sql.out  | 200 +-
 .../resources/tpcds-query-results/v1_4/q50.sql.out | 12 +-
 .../resources/tpcds-query-results/v1_4/q51.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q52.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q53.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q54.sql.out | 2 +-
 .../resources/tpcds-query-results/v1_4/q55.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q56.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q57.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q58.sql.out | 4 +-
 .../resources/tpcds-query-results/v1_4/q59.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q6.sql.out  | 91 +-
 .../resources/tpcds-query-results/v1_4/q60.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q61.sql.out | 2 +-
 .../resources/tpcds-query-results/v1_4/q62.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q63.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q64.sql.out | 19 +-
 .../resources/tpcds-query-results/v1_4/q65.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q66.sql.out | 10 +-
 .../resources/tpcds-query-results/v1_4/q67.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q68.sql.out | 200 +-
 .../resources/tpcds-query-results/v1_4/q69.sql.out | 182 +-
 .../resources/tpcds-query-results/v1_4/q7.sql.out  | 200 +-
 .../resources/tpcds-query-results/v1_4/q70.sql.out | 6 +-
 .../resources/tpcds-query-results/v1_4
[spark] 09/09: Fix
yamamuro pushed a commit to branch pull/31899
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 0c4e71e00129bcffe933a8cceffab7cf51cf33ce
Author: Takeshi Yamamuro
AuthorDate: Thu May 6 10:14:25 2021 +0900

    Fix
---
 docs/sql-ref-syntax-hive-format.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-hive-format.md b/docs/sql-ref-syntax-hive-format.md
index 8092e58..01b8d3f 100644
--- a/docs/sql-ref-syntax-hive-format.md
+++ b/docs/sql-ref-syntax-hive-format.md
@@ -30,7 +30,7 @@ There are two ways to define a row format in `row_format` of `CREATE TABLE` and

 ```sql
 row_format:
-    SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ]
+    SERDE serde_class [ WITH SERDEPROPERTIES (k1 [=] v1, k2 [=] v2, ... ) ]
     | DELIMITED [ FIELDS TERMINATED BY fields_terminated_char [ ESCAPED BY escaped_char ] ]
         [ COLLECTION ITEMS TERMINATED BY collection_items_terminated_char ]
         [ MAP KEYS TERMINATED BY map_key_terminated_char ]
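The `[=]` notation this doc change introduces means the `=` between a property key and its value is optional. A toy parser makes the equivalence concrete; this is purely an illustration (Spark's real grammar lives in its ANTLR parser, and `parse_props` is an invented helper).

```python
import re

def parse_props(text):
    """Parse a 'k1 [=] v1, k2 [=] v2, ...' property list where '=' is optional."""
    props = {}
    for pair in text.split(","):
        # Key and value are word-like tokens, separated by whitespace and/or '='.
        m = re.match(r"\s*([\w.]+)\s*(?:=\s*)?([\w.]+)\s*$", pair)
        if not m:
            raise ValueError(f"bad property pair: {pair!r}")
        props[m.group(1)] = m.group(2)
    return props

# Both spellings yield the same mapping.
print(parse_props("serialization.format = 1, field.delim 1"))
```

Documenting the bracketed `=` spares users from guessing which of the two spellings a given statement accepts.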
[spark] 08/09: remove space
yamamuro pushed a commit to branch pull/31899
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 2ebb2aac7a0c87c929e72bc7c8c080096c55a8f1
Author: Niklas Riekenbrauck
AuthorDate: Tue Mar 30 13:20:32 2021 +0200

    remove space
---
 docs/sql-ref-syntax-ddl-alter-table.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md
index 915ccf8..ae40fe4 100644
--- a/docs/sql-ref-syntax-ddl-alter-table.md
+++ b/docs/sql-ref-syntax-ddl-alter-table.md
@@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location'

     **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )`

-* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) **
+* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... )**

     Specifies the SERDE properties to be set.
[spark] branch pull/31899 created (now 0c4e71e)
yamamuro pushed a change to branch pull/31899
in repository https://gitbox.apache.org/repos/asf/spark.git.

 at 0c4e71e  Fix

This branch includes the following new commits:

 new c685abe  Update docs to reflect alternative key value notation
 new 2ff9703  Update docs other create table docs
 new 8245a55  Fix alternatives with subrule grammar
 new 1d157ed  Update to eaasier KV syntax
 new 83ec2ee  Commit missing doc updates
 new fff449b  Some more fixes
 new 42cd52e  Remove unnecessary change
 new 2ebb2aa  remove space
 new 0c4e71e  Fix

The 9 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[spark] 06/09: Some more fixes
yamamuro pushed a commit to branch pull/31899
in repository https://gitbox.apache.org/repos/asf/spark.git

commit fff449bd54f2204d7cfc7a5fcf5c8877aa37a992
Author: Niklas Riekenbrauck
AuthorDate: Sat Mar 27 15:22:11 2021 +0100

    Some more fixes
---
 docs/sql-ref-syntax-ddl-alter-table.md             | 4 ++--
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md
index 866b596..915ccf8 100644
--- a/docs/sql-ref-syntax-ddl-alter-table.md
+++ b/docs/sql-ref-syntax-ddl-alter-table.md
@@ -169,7 +169,7 @@ this overrides the old value with the new one.

 ```sql
 -- Set Table Properties
-ALTER TABLE table_identifier SET TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... ) )
+ALTER TABLE table_identifier SET TBLPROPERTIES ( key1 [=] val1, key2 [=] val2, ... )

 -- Unset Table Properties
 ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... )
@@ -219,7 +219,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location'
     Specifies the partition on which the property has to be set. Note that one can use a typed literal (e.g., date'2019-01-02') in the partition spec.

-    **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )`
+    **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )`

 * **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) **

diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
index 48d089d..3231b66 100644
--- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
+++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md
@@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier
     [ ROW FORMAT row_format ]
     [ STORED AS file_format ]
     [ LOCATION path ]
-    [ TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... ) ]
+    [ TBLPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ]
     [ AS select_statement ]
 ```
[spark] 07/09: Remove unnecessary change
yamamuro pushed a commit to branch pull/31899
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 42cd52e297b141a8b837a8315ca4c84a5ffc3def
Author: Niklas Riekenbrauck
AuthorDate: Sat Mar 27 15:27:52 2021 +0100

    Remove unnecessary change
---
 docs/sql-ref-syntax-ddl-alter-database.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-ddl-alter-database.md b/docs/sql-ref-syntax-ddl-alter-database.md
index 2de9675..6ac6863 100644
--- a/docs/sql-ref-syntax-ddl-alter-database.md
+++ b/docs/sql-ref-syntax-ddl-alter-database.md
@@ -31,7 +31,7 @@ for a database and may be used for auditing purposes.

 ```sql
 ALTER { DATABASE | SCHEMA } database_name
-    SET DBPROPERTIES ( ( property_name [=] property_value [ , ... ] | ( property_name property_value [ , ... ] )
+    SET DBPROPERTIES ( property_name [=] property_value [ , ... ] )
 ```

 ### Parameters
[spark] 04/09: Update to eaasier KV syntax
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 1d157ed7209b355294b8e07e672ba8b5916e93f5 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:17:26 2021 +0100 Update to eaasier KV syntax --- docs/sql-ref-syntax-ddl-alter-database.md | 2 +- docs/sql-ref-syntax-ddl-alter-table.md | 8 docs/sql-ref-syntax-ddl-alter-view.md | 2 +- docs/sql-ref-syntax-ddl-create-database.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 7 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-database.md b/docs/sql-ref-syntax-ddl-alter-database.md index fbc454e..2de9675 100644 --- a/docs/sql-ref-syntax-ddl-alter-database.md +++ b/docs/sql-ref-syntax-ddl-alter-database.md @@ -31,7 +31,7 @@ for a database and may be used for auditing purposes. ```sql ALTER { DATABASE | SCHEMA } database_name -SET DBPROPERTIES ( property_name = property_value [ , ... ] ) +SET DBPROPERTIES ( ( property_name [=] property_value [ , ... ] | ( property_name property_value [ , ... ] ) ``` ### Parameters diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 2d42eb4..912de0f 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -169,7 +169,7 @@ this overrides the old value with the new one. ```sql -- Set Table Properties -ALTER TABLE table_identifier SET TBLPROPERTIES ( key1 = val1, key2 = val2, ... ) +ALTER TABLE table_identifier SET TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... ) ) -- Unset Table Properties ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ) @@ -184,10 +184,10 @@ ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ```sql -- Set SERDE Properties ALTER TABLE table_identifier [ partition_spec ] -SET SERDEPROPERTIES ( key1 = val1, key2 = val2, ... ) +SET SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name -[ WITH SERDEPROPERTIES ( key1 = val1, key2 = val2, ... ) ] +[ WITH SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] ``` SET LOCATION And SET FILE FORMAT @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( key1 = val1, key2 = val2, ... )** +* **SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ** Specifies the SERDE properties to be set. diff --git a/docs/sql-ref-syntax-ddl-alter-view.md b/docs/sql-ref-syntax-ddl-alter-view.md index d69f246..25280c4 100644 --- a/docs/sql-ref-syntax-ddl-alter-view.md +++ b/docs/sql-ref-syntax-ddl-alter-view.md @@ -49,7 +49,7 @@ the properties. Syntax ```sql -ALTER VIEW view_identifier SET TBLPROPERTIES ( property_key = property_val [ , ... ] ) +ALTER VIEW view_identifier SET TBLPROPERTIES ( property_key [=] property_val [ , ... ] ) ``` Parameters diff --git a/docs/sql-ref-syntax-ddl-create-database.md b/docs/sql-ref-syntax-ddl-create-database.md index 9d8bf47..7db410e 100644 --- a/docs/sql-ref-syntax-ddl-create-database.md +++ b/docs/sql-ref-syntax-ddl-create-database.md @@ -29,7 +29,7 @@ Creates a database with the specified name. If database with the same name alrea CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name [ COMMENT database_comment ] [ LOCATION database_directory ] -[ WITH DBPROPERTIES ( property_name = property_value [ , ... ] ) ] +[ WITH DBPROPERTIES ( property_name [=] property_value [ , ... ] ) ] ``` ### Parameters @@ -50,7 +50,7 @@ CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name Specifies the description for the database. -* **WITH DBPROPERTIES ( property_name=property_value [ , ... ] )** +* **WITH DBPROPERTIES ( property_name [=] property_value [ , ... ] )** Specifies the properties for the database in key-value pairs. diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index 9926bc6..7d8e692 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS ( ( key1=val1, key2=val2, ... ) | ( key1 val1,
[spark] 01/09: Update docs to reflect alternative key value notation
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit c685abe33681fcbf0bfa6aa86ba229f19e4d451f Author: Niklas Riekenbrauck AuthorDate: Fri Mar 19 14:53:53 2021 +0100 Update docs to reflect alternative key value notation --- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index ba0516a..82d3a09 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS ( key1=val1, key2=val2, ... ) ] +[ OPTIONS [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ PARTITIONED BY ( col_name1, col_name2, ... ) ] [ CLUSTERED BY ( col_name3, col_name4, ... ) [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] INTO num_buckets BUCKETS ] [ LOCATION path ] [ COMMENT table_comment ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ AS select_statement ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
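The alternative key-value notation documented in the commit above can be sketched as follows (table, column, and option names are invented for illustration, not taken from the commit):

```sql
-- Hypothetical example: OPTIONS pairs written without `=`, and
-- TBLPROPERTIES pairs written with `=` -- both notations are covered
-- by the documented grammar.
CREATE TABLE IF NOT EXISTS student (id INT, name STRING)
USING CSV
OPTIONS (header 'true', delimiter ',')
TBLPROPERTIES ('created.by.user' = 'jane');
```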
[spark] 03/09: Fix alternatives with subrule grammar
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 8245a55dd1092fe9ef3fbcacb5cf07d1888ac23a Author: Niklas Riekenbrauck AuthorDate: Sat Mar 20 13:00:08 2021 +0100 Fix alternatives with subrule grammar --- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index 82d3a09..9926bc6 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ OPTIONS ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ PARTITIONED BY ( col_name1, col_name2, ... ) ] [ CLUSTERED BY ( col_name3, col_name4, ... ) [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] INTO num_buckets BUCKETS ] [ LOCATION path ] [ COMMENT table_comment ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index 63880d5..2e05e64 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-like.md b/docs/sql-ref-syntax-ddl-create-table-like.md index a374296a..772b299 100644 --- a/docs/sql-ref-syntax-ddl-create-table-like.md +++ b/docs/sql-ref-syntax-ddl-create-table-like.md @@ -30,7 +30,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE source_table_identifier USING data_source [ ROW FORMAT row_format ] [ STORED AS file_format ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ LOCATION path ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 02/09: Update docs other create table docs
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 2ff970350427835a7b7f7f9d0ec7bc8f1049f7fd Author: Niklas Riekenbrauck AuthorDate: Fri Mar 19 15:17:48 2021 +0100 Update docs other create table docs --- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index b2f5957..63880d5 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-like.md b/docs/sql-ref-syntax-ddl-create-table-like.md index cfb959c..a374296a 100644 --- a/docs/sql-ref-syntax-ddl-create-table-like.md +++ b/docs/sql-ref-syntax-ddl-create-table-like.md @@ -30,7 +30,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE source_table_identifier USING data_source [ ROW FORMAT row_format ] [ STORED AS file_format ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ LOCATION path ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 05/09: Commit missing doc updates
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pull/31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 83ec2ee71751142220464ea54ffc6e47ccc35ad4 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:19:27 2021 +0100 Commit missing doc updates --- docs/sql-ref-syntax-ddl-alter-table.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 912de0f..866b596 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -184,10 +184,10 @@ ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ```sql -- Set SERDE Properties ALTER TABLE table_identifier [ partition_spec ] -SET SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) +SET SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name -[ WITH SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] +[ WITH SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ] ``` SET LOCATION And SET FILE FORMAT @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ** +* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ** Specifies the SERDE properties to be set. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
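The `[=]` notation restored above covers statements like the following sketch (the table name is invented; `OpenCSVSerde` is the stock Hive serde class, used here only for illustration):

```sql
-- Hypothetical example: SERDEPROPERTIES pairs may mix the
-- `key = value` and `key value` spellings.
ALTER TABLE test_tab
SET SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ('separatorChar' = ',', 'quoteChar' '"');
```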
[spark] 09/09: Fix
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 0c4e71e00129bcffe933a8cceffab7cf51cf33ce Author: Takeshi Yamamuro AuthorDate: Thu May 6 10:14:25 2021 +0900 Fix --- docs/sql-ref-syntax-hive-format.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-hive-format.md b/docs/sql-ref-syntax-hive-format.md index 8092e58..01b8d3f 100644 --- a/docs/sql-ref-syntax-hive-format.md +++ b/docs/sql-ref-syntax-hive-format.md @@ -30,7 +30,7 @@ There are two ways to define a row format in `row_format` of `CREATE TABLE` and ```sql row_format: -SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ] +SERDE serde_class [ WITH SERDEPROPERTIES (k1 [=] v1, k2 [=] v2, ... ) ] | DELIMITED [ FIELDS TERMINATED BY fields_terminated_char [ ESCAPED BY escaped_char ] ] [ COLLECTION ITEMS TERMINATED BY collection_items_terminated_char ] [ MAP KEYS TERMINATED BY map_key_terminated_char ] - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
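As a sketch of the `row_format` rule fixed above (table and property names are invented; `LazySimpleSerDe` is the stock Hive serde class, used only for illustration), the `=` in each `SERDEPROPERTIES` pair may be written or omitted:

```sql
-- Hypothetical example: one pair uses `=`, the other omits it;
-- both are accepted under the documented rule.
CREATE TABLE person (name STRING, age INT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ('field.delim' = ',', 'serialization.format' ',');
```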
[spark] 08/09: remove space
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 2ebb2aac7a0c87c929e72bc7c8c080096c55a8f1 Author: Niklas Riekenbrauck AuthorDate: Tue Mar 30 13:20:32 2021 +0200 remove space --- docs/sql-ref-syntax-ddl-alter-table.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 915ccf8..ae40fe4 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ** +* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... )** Specifies the SERDE properties to be set. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 05/09: Commit missing doc updates
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 83ec2ee71751142220464ea54ffc6e47ccc35ad4 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:19:27 2021 +0100 Commit missing doc updates --- docs/sql-ref-syntax-ddl-alter-table.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 912de0f..866b596 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -184,10 +184,10 @@ ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ```sql -- Set SERDE Properties ALTER TABLE table_identifier [ partition_spec ] -SET SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) +SET SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name -[ WITH SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] +[ WITH SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ] ``` SET LOCATION And SET FILE FORMAT @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ** +* **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ** Specifies the SERDE properties to be set. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 06/09: Some more fixes
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit fff449bd54f2204d7cfc7a5fcf5c8877aa37a992 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:22:11 2021 +0100 Some more fixes --- docs/sql-ref-syntax-ddl-alter-table.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 866b596..915ccf8 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -169,7 +169,7 @@ this overrides the old value with the new one. ```sql -- Set Table Properties -ALTER TABLE table_identifier SET TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... ) ) +ALTER TABLE table_identifier SET TBLPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) -- Unset Table Properties ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ) @@ -219,7 +219,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' Specifies the partition on which the property has to be set. Note that one can use a typed literal (e.g., date'2019-01-02') in the partition spec. -**Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` +**Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` * **SERDEPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ** diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index 48d089d..3231b66 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... ) ] +[ TBLPROPERTIES ( key1 [=] val1, key2 [=] val2, ... ) ] [ AS select_statement ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 03/09: Fix alternatives with subrule grammar
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 8245a55dd1092fe9ef3fbcacb5cf07d1888ac23a Author: Niklas Riekenbrauck AuthorDate: Sat Mar 20 13:00:08 2021 +0100 Fix alternatives with subrule grammar --- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index 82d3a09..9926bc6 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ OPTIONS ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ PARTITIONED BY ( col_name1, col_name2, ... ) ] [ CLUSTERED BY ( col_name3, col_name4, ... ) [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] INTO num_buckets BUCKETS ] [ LOCATION path ] [ COMMENT table_comment ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) ) ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index 63880d5..2e05e64 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-like.md b/docs/sql-ref-syntax-ddl-create-table-like.md index a374296a..772b299 100644 --- a/docs/sql-ref-syntax-ddl-create-table-like.md +++ b/docs/sql-ref-syntax-ddl-create-table-like.md @@ -30,7 +30,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE source_table_identifier USING data_source [ ROW FORMAT row_format ] [ STORED AS file_format ] -[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] +[ TBLPROPERTIES ( ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] [ LOCATION path ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 02/09: Update docs other create table docs
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 2ff970350427835a7b7f7f9d0ec7bc8f1049f7fd Author: Niklas Riekenbrauck AuthorDate: Fri Mar 19 15:17:48 2021 +0100 Update docs other create table docs --- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md index b2f5957..63880d5 100644 --- a/docs/sql-ref-syntax-ddl-create-table-hiveformat.md +++ b/docs/sql-ref-syntax-ddl-create-table-hiveformat.md @@ -37,7 +37,7 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier [ ROW FORMAT row_format ] [ STORED AS file_format ] [ LOCATION path ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ AS select_statement ] ``` diff --git a/docs/sql-ref-syntax-ddl-create-table-like.md b/docs/sql-ref-syntax-ddl-create-table-like.md index cfb959c..a374296a 100644 --- a/docs/sql-ref-syntax-ddl-create-table-like.md +++ b/docs/sql-ref-syntax-ddl-create-table-like.md @@ -30,7 +30,7 @@ CREATE TABLE [IF NOT EXISTS] table_identifier LIKE source_table_identifier USING data_source [ ROW FORMAT row_format ] [ STORED AS file_format ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ LOCATION path ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch pr31899 created (now 0c4e71e)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git. at 0c4e71e Fix This branch includes the following new commits: new c685abe Update docs to reflect alternative key value notation new 2ff9703 Update docs other create table docs new 8245a55 Fix alternatives with subrule grammar new 1d157ed Update to eaasier KV syntax new 83ec2ee Commit missing doc updates new fff449b Some more fixes new 42cd52e Remove unnecessary change new 2ebb2aa remove space new 0c4e71e Fix The 9 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 07/09: Remove unnecessary change
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 42cd52e297b141a8b837a8315ca4c84a5ffc3def Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:27:52 2021 +0100 Remove unnecessary change --- docs/sql-ref-syntax-ddl-alter-database.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-ref-syntax-ddl-alter-database.md b/docs/sql-ref-syntax-ddl-alter-database.md index 2de9675..6ac6863 100644 --- a/docs/sql-ref-syntax-ddl-alter-database.md +++ b/docs/sql-ref-syntax-ddl-alter-database.md @@ -31,7 +31,7 @@ for a database and may be used for auditing purposes. ```sql ALTER { DATABASE | SCHEMA } database_name -SET DBPROPERTIES ( ( property_name [=] property_value [ , ... ] | ( property_name property_value [ , ... ] ) +SET DBPROPERTIES ( property_name [=] property_value [ , ... ] ) ``` ### Parameters - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 01/09: Update docs to reflect alternative key value notation
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit c685abe33681fcbf0bfa6aa86ba229f19e4d451f Author: Niklas Riekenbrauck AuthorDate: Fri Mar 19 14:53:53 2021 +0100 Update docs to reflect alternative key value notation --- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index ba0516a..82d3a09 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS ( key1=val1, key2=val2, ... ) ] +[ OPTIONS [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ PARTITIONED BY ( col_name1, col_name2, ... ) ] [ CLUSTERED BY ( col_name3, col_name4, ... ) [ SORTED BY ( col_name [ ASC | DESC ], ... ) ] INTO num_buckets BUCKETS ] [ LOCATION path ] [ COMMENT table_comment ] -[ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ] +[ TBLPROPERTIES [ ( key1=val1, key2=val2, ... ) | ( key1 val1, key2 val2, ... ) ] ] [ AS select_statement ] ``` - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] 04/09: Update to eaasier KV syntax
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch pr31899 in repository https://gitbox.apache.org/repos/asf/spark.git commit 1d157ed7209b355294b8e07e672ba8b5916e93f5 Author: Niklas Riekenbrauck AuthorDate: Sat Mar 27 15:17:26 2021 +0100 Update to eaasier KV syntax --- docs/sql-ref-syntax-ddl-alter-database.md | 2 +- docs/sql-ref-syntax-ddl-alter-table.md | 8 docs/sql-ref-syntax-ddl-alter-view.md | 2 +- docs/sql-ref-syntax-ddl-create-database.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-datasource.md | 4 ++-- docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 2 +- docs/sql-ref-syntax-ddl-create-table-like.md | 2 +- 7 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/sql-ref-syntax-ddl-alter-database.md b/docs/sql-ref-syntax-ddl-alter-database.md index fbc454e..2de9675 100644 --- a/docs/sql-ref-syntax-ddl-alter-database.md +++ b/docs/sql-ref-syntax-ddl-alter-database.md @@ -31,7 +31,7 @@ for a database and may be used for auditing purposes. ```sql ALTER { DATABASE | SCHEMA } database_name -SET DBPROPERTIES ( property_name = property_value [ , ... ] ) +SET DBPROPERTIES ( ( property_name [=] property_value [ , ... ] | ( property_name property_value [ , ... ] ) ``` ### Parameters diff --git a/docs/sql-ref-syntax-ddl-alter-table.md b/docs/sql-ref-syntax-ddl-alter-table.md index 2d42eb4..912de0f 100644 --- a/docs/sql-ref-syntax-ddl-alter-table.md +++ b/docs/sql-ref-syntax-ddl-alter-table.md @@ -169,7 +169,7 @@ this overrides the old value with the new one. ```sql -- Set Table Properties -ALTER TABLE table_identifier SET TBLPROPERTIES ( key1 = val1, key2 = val2, ... ) +ALTER TABLE table_identifier SET TBLPROPERTIES ( ( key1 [=] val1, key2 [=] val2, ... ) ) -- Unset Table Properties ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ) @@ -184,10 +184,10 @@ ALTER TABLE table_identifier UNSET TBLPROPERTIES [ IF EXISTS ] ( key1, key2, ... ```sql -- Set SERDE Properties ALTER TABLE table_identifier [ partition_spec ] -SET SERDEPROPERTIES ( key1 = val1, key2 = val2, ... ) +SET SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name -[ WITH SERDEPROPERTIES ( key1 = val1, key2 = val2, ... ) ] +[ WITH SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ] ``` SET LOCATION And SET FILE FORMAT @@ -221,7 +221,7 @@ ALTER TABLE table_identifier [ partition_spec ] SET LOCATION 'new_location' **Syntax:** `PARTITION ( partition_col_name = partition_col_val [ , ... ] )` -* **SERDEPROPERTIES ( key1 = val1, key2 = val2, ... )** +* **SERDEPROPERTIES ( ( key1 = val1, key2 = val2, ... ) | ( key1 val1, key2 val2, ... ) ) ** Specifies the SERDE properties to be set. diff --git a/docs/sql-ref-syntax-ddl-alter-view.md b/docs/sql-ref-syntax-ddl-alter-view.md index d69f246..25280c4 100644 --- a/docs/sql-ref-syntax-ddl-alter-view.md +++ b/docs/sql-ref-syntax-ddl-alter-view.md @@ -49,7 +49,7 @@ the properties. Syntax ```sql -ALTER VIEW view_identifier SET TBLPROPERTIES ( property_key = property_val [ , ... ] ) +ALTER VIEW view_identifier SET TBLPROPERTIES ( property_key [=] property_val [ , ... ] ) ``` Parameters diff --git a/docs/sql-ref-syntax-ddl-create-database.md b/docs/sql-ref-syntax-ddl-create-database.md index 9d8bf47..7db410e 100644 --- a/docs/sql-ref-syntax-ddl-create-database.md +++ b/docs/sql-ref-syntax-ddl-create-database.md @@ -29,7 +29,7 @@ Creates a database with the specified name. If database with the same name alrea CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name [ COMMENT database_comment ] [ LOCATION database_directory ] -[ WITH DBPROPERTIES ( property_name = property_value [ , ... ] ) ] +[ WITH DBPROPERTIES ( property_name [=] property_value [ , ... ] ) ] ``` ### Parameters @@ -50,7 +50,7 @@ CREATE { DATABASE | SCHEMA } [ IF NOT EXISTS ] database_name Specifies the description for the database. -* **WITH DBPROPERTIES ( property_name=property_value [ , ... ] )** +* **WITH DBPROPERTIES ( property_name [=] property_value [ , ... ] )** Specifies the properties for the database in key-value pairs. diff --git a/docs/sql-ref-syntax-ddl-create-table-datasource.md b/docs/sql-ref-syntax-ddl-create-table-datasource.md index 9926bc6..7d8e692 100644 --- a/docs/sql-ref-syntax-ddl-create-table-datasource.md +++ b/docs/sql-ref-syntax-ddl-create-table-datasource.md @@ -29,14 +29,14 @@ The `CREATE TABLE` statement defines a new table using a Data Source. CREATE TABLE [ IF NOT EXISTS ] table_identifier [ ( col_name1 col_type1 [ COMMENT col_comment1 ], ... ) ] USING data_source -[ OPTIONS ( ( key1=val1, key2=val2, ... ) | ( key1 val1,
[spark] branch branch-3.0 updated: [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 8ef4023  [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions

8ef4023 is described below

commit 8ef4023683dee537a40d376d93c329a802a929bd
Author: dsolow
AuthorDate: Wed May 5 12:46:13 2021 +0900

    [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions

    ### What changes were proposed in this pull request?

    To fix lambda variable name issues in nested DataFrame functions, this PR modifies code to use a global counter for `LambdaVariables` names created by higher order functions.

    This is the rework of #31887. Closes #31887.

    ### Why are the changes needed?

    This moves away from the current hard-coded variable names, which break on nested function calls. There is currently a bug where nested transforms in particular fail (the inner variable shadows the outer variable).

    For this query:
    ```
    val df = Seq(
      (Seq(1, 2, 3), Seq("a", "b", "c"))
    ).toDF("numbers", "letters")

    df.select(
      f.flatten(
        f.transform(
          $"numbers",
          (number: Column) =>
            f.transform(
              $"letters",
              (letter: Column) =>
                f.struct(
                  number.as("number"),
                  letter.as("letter")
                )
            )
        )
      ).as("zipped")
    ).show(10, false)
    ```
    This is the current (incorrect) output:
    ```
    +------------------------------------------------------------------------+
    |zipped                                                                  |
    +------------------------------------------------------------------------+
    |[{a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}]|
    +------------------------------------------------------------------------+
    ```
    And this is the correct output after fix:
    ```
    +------------------------------------------------------------------------+
    |zipped                                                                  |
    +------------------------------------------------------------------------+
    |[{1, a}, {1, b}, {1, c}, {2, a}, {2, b}, {2, c}, {3, a}, {3, b}, {3, c}]|
    +------------------------------------------------------------------------+
    ```

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Added the new test in `DataFrameFunctionsSuite`.

    Closes #32424 from maropu/pr31887.

    Lead-authored-by: dsolow
    Co-authored-by: Takeshi Yamamuro
    Co-authored-by: dmsolow
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit f550e03b96638de93381734c4eada2ace02d9a4f)
    Signed-off-by: Takeshi Yamamuro
---
 .../expressions/higherOrderFunctions.scala         | 12 ++-
 .../scala/org/apache/spark/sql/functions.scala     | 12 +--
 .../apache/spark/sql/DataFrameFunctionsSuite.scala | 23 ++
 3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
index e5cf8c0..a530ce5 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
@@ -18,7 +18,7 @@ package org.apache.spark.sql.catalyst.expressions

 import java.util.Comparator
-import java.util.concurrent.atomic.AtomicReference
+import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}

 import scala.collection.mutable

@@ -52,6 +52,16 @@ case class UnresolvedNamedLambdaVariable(nameParts: Seq[String])
   override def sql: String = name
 }

+object UnresolvedNamedLambdaVariable {
+
+  // Counter to ensure lambda variable names are unique
+  private val nextVarNameId = new AtomicInteger(0)
+
+  def freshVarName(name: String): String = {
+    s"${name}_${nextVarNameId.getAndIncrement()}"
+  }
+}
+
 /**
  * A named lambda variable.
  */

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
index bb77c7e..f6d6200 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -3489,22 +3489,22 @@ object functions {
 }

 private def createLambda(f: Column
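The core of the patch above is replacing hard-coded lambda variable names with names drawn from a process-wide atomic counter, so nested higher-order functions can no longer shadow each other. A minimal Java stand-in for that idea (the class and method names here are illustrative, not Spark's actual API):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the fix's core idea: a global atomic counter suffixes every
// lambda variable name, so an inner transform()'s variable can no longer
// shadow an outer transform()'s variable. Illustrative names only.
final class FreshNames {
    private static final AtomicInteger NEXT_VAR_NAME_ID = new AtomicInteger(0);

    static String freshVarName(String name) {
        return name + "_" + NEXT_VAR_NAME_ID.getAndIncrement();
    }

    public static void main(String[] args) {
        // Nested higher-order functions each get a distinct variable name:
        String outer = freshVarName("x");
        String inner = freshVarName("x");
        System.out.println(outer + " / " + inner); // distinct, e.g. x_0 / x_1
    }
}
```

Because the counter is atomic, concurrent query compilations also produce distinct names without any locking.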
[spark] branch branch-3.1 updated: [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new 6df4ec0  [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions

6df4ec0 is described below

commit 6df4ec09a17077c2a0b114a7bf5736711ba268e4
Author: dsolow
AuthorDate: Wed May 5 12:46:13 2021 +0900

    [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions

    ### What changes were proposed in this pull request?

    To fix lambda variable name issues in nested DataFrame functions, this PR modifies code to use a global counter for `LambdaVariables` names created by higher order functions.

    This is the rework of #31887. Closes #31887.

    ### Why are the changes needed?

    This moves away from the current hard-coded variable names, which break on nested function calls. There is currently a bug where nested transforms in particular fail (the inner variable shadows the outer variable).

    For this query:
    ```
    val df = Seq(
      (Seq(1, 2, 3), Seq("a", "b", "c"))
    ).toDF("numbers", "letters")

    df.select(
      f.flatten(
        f.transform(
          $"numbers",
          (number: Column) =>
            f.transform(
              $"letters",
              (letter: Column) =>
                f.struct(
                  number.as("number"),
                  letter.as("letter")
                )
            )
        )
      ).as("zipped")
    ).show(10, false)
    ```
    This is the current (incorrect) output:
    ```
    +------------------------------------------------------------------------+
    |zipped                                                                  |
    +------------------------------------------------------------------------+
    |[{a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}]|
    +------------------------------------------------------------------------+
    ```
    And this is the correct output after fix:
    ```
    +------------------------------------------------------------------------+
    |zipped                                                                  |
    +------------------------------------------------------------------------+
    |[{1, a}, {1, b}, {1, c}, {2, a}, {2, b}, {2, c}, {3, a}, {3, b}, {3, c}]|
    +------------------------------------------------------------------------+
    ```

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Added the new test in `DataFrameFunctionsSuite`.

    Closes #32424 from maropu/pr31887.

    Lead-authored-by: dsolow
    Co-authored-by: Takeshi Yamamuro
    Co-authored-by: dmsolow
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit f550e03b96638de93381734c4eada2ace02d9a4f)
    Signed-off-by: Takeshi Yamamuro
---
 .../expressions/higherOrderFunctions.scala         | 12 ++-
 .../scala/org/apache/spark/sql/functions.scala     | 12 +--
 .../apache/spark/sql/DataFrameFunctionsSuite.scala | 23 ++
 3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
index ba447ea..a4e069d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/higherOrderFunctions.scala
@@ -18,7 +18,7 @@ package org.apache.spark.sql.catalyst.expressions

 import java.util.Comparator
-import java.util.concurrent.atomic.AtomicReference
+import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}

 import scala.collection.mutable

@@ -52,6 +52,16 @@ case class UnresolvedNamedLambdaVariable(nameParts: Seq[String])
   override def sql: String = name
 }

+object UnresolvedNamedLambdaVariable {
+
+  // Counter to ensure lambda variable names are unique
+  private val nextVarNameId = new AtomicInteger(0)
+
+  def freshVarName(name: String): String = {
+    s"${name}_${nextVarNameId.getAndIncrement()}"
+  }
+}
+
 /**
  * A named lambda variable.
  */

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
index e6b41cd..6bc49b6 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala
@@ -3644,22 +3644,22 @@ object functions {
 }

 private def createLambda(f: Column
[spark] branch master updated (7fd3f8f -> f550e03)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 7fd3f8f  [SPARK-35294][SQL] Add tree traversal pruning in rules with dedicated files under optimizer
 add f550e03  [SPARK-34794][SQL] Fix lambda variable name issues in nested DataFrame functions

No new revisions were added by this update.

Summary of changes:
 .../expressions/higherOrderFunctions.scala         | 12 ++-
 .../scala/org/apache/spark/sql/functions.scala     | 12 +--
 .../apache/spark/sql/DataFrameFunctionsSuite.scala | 23 ++
 3 files changed, 40 insertions(+), 7 deletions(-)

- To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (caa46ce -> cd689c9)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from caa46ce  [SPARK-35112][SQL] Support Cast string to day-second interval
 add cd689c9  [SPARK-35192][SQL][TESTS] Port minimal TPC-DS datagen code from databricks/spark-sql-perf

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_and_test.yml               |  31 +-
 .../scala/org/apache/spark/sql/GenTPCDSData.scala  | 445 +
 .../scala/org/apache/spark/sql/TPCDSBase.scala     | 537 +
 .../sql/{TPCDSBase.scala => TPCDSSchema.scala}     |  92 +---
 4 files changed, 466 insertions(+), 639 deletions(-)
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/GenTPCDSData.scala
 copy sql/core/src/test/scala/org/apache/spark/sql/{TPCDSBase.scala => TPCDSSchema.scala} (83%)
[spark] branch master updated (86d3bb5 -> 403e479)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 86d3bb5  [SPARK-34981][SQL] Implement V2 function resolution and evaluation
 add 403e479  [SPARK-35244][SQL][FOLLOWUP] Add null check for the exception cause

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/expressions/objects/objects.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
[spark] branch branch-3.0 updated (a556bc8 -> c6659e6)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

from a556bc8  [SPARK-33976][SQL][DOCS][3.0] Add a SQL doc page for a TRANSFORM clause
 add c6659e6  [SPARK-35159][SQL][DOCS][3.0] Extract hive format doc

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 52 +--
 docs/sql-ref-syntax-hive-format.md                 | 73 ++
 docs/sql-ref-syntax-qry-select-transform.md        | 48 +-
 3 files changed, 77 insertions(+), 96 deletions(-)
 create mode 100644 docs/sql-ref-syntax-hive-format.md
[spark] branch branch-3.1 updated (361e684 -> db8204e)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 361e684  [SPARK-33976][SQL][DOCS][3.1] Add a SQL doc page for a TRANSFORM clause
 add db8204e  [SPARK-35159][SQL][DOCS][3.1] Extract hive format doc

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-ddl-create-table-hiveformat.md | 52 +--
 docs/sql-ref-syntax-hive-format.md                 | 73 ++
 docs/sql-ref-syntax-qry-select-transform.md        | 48 +-
 3 files changed, 77 insertions(+), 96 deletions(-)
 create mode 100644 docs/sql-ref-syntax-hive-format.md
[spark] branch master updated (26a5e33 -> 8b62c29)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 26a5e33  [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page
 add 8b62c29  [SPARK-35214][SQL] OptimizeSkewedJoin support ShuffledHashJoinExec

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/internal/SQLConf.scala    |   4 +-
 .../execution/adaptive/OptimizeSkewedJoin.scala    | 189 -
 .../execution/exchange/EnsureRequirements.scala    |   9 +-
 .../sql/execution/joins/ShuffledHashJoinExec.scala |   3 +-
 .../spark/sql/execution/joins/ShuffledJoin.scala   |  18 +-
 .../sql/execution/joins/SortMergeJoinExec.scala    |  17 --
 .../adaptive/AdaptiveQueryExecSuite.scala          | 130 +++---
 7 files changed, 204 insertions(+), 166 deletions(-)
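SPARK-35214 extends skewed-join handling beyond sort-merge join. The underlying idea — splitting an oversized shuffle partition into roughly target-sized slices that downstream join tasks can process in parallel — can be sketched as below. This is a heavily simplified illustration, not Spark's actual `OptimizeSkewedJoin` logic, which works from map-output statistics on both join sides:

```java
// Hedged sketch of the partition-splitting idea behind skewed-join
// optimization: a partition far larger than the target size is divided
// into near-equal slices. Illustrative only.
final class SkewSplit {
    /** Slice sizes (in bytes) for one skewed partition. */
    static long[] splitSizes(long partitionBytes, long targetBytes) {
        // Number of slices, rounding to the nearest multiple of the target.
        int n = (int) Math.max(1, Math.round((double) partitionBytes / targetBytes));
        long[] slices = new long[n];
        long base = partitionBytes / n;
        long rem = partitionBytes % n;
        for (int i = 0; i < n; i++) {
            slices[i] = base + (i < rem ? 1 : 0); // spread the remainder evenly
        }
        return slices;
    }
}
```

For example, a 1000-byte partition with a 300-byte target splits into three slices whose sizes sum back to 1000.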
[spark] branch branch-3.0 updated (6e83789b -> a556bc8)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 6e83789b  [SPARK-35244][SQL] Invoke should throw the original exception
 add a556bc8   [SPARK-33976][SQL][DOCS][3.0] Add a SQL doc page for a TRANSFORM clause

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml                    |   2 +
 docs/sql-ref-syntax-qry-select-transform.md | 235
 docs/sql-ref-syntax-qry-select.md           |   7 +-
 docs/sql-ref-syntax-qry.md                  |   1 +
 docs/sql-ref-syntax.md                      |   1 +
 5 files changed, 245 insertions(+), 1 deletion(-)
 create mode 100644 docs/sql-ref-syntax-qry-select-transform.md
[spark] branch branch-3.1 updated (e58055b -> 361e684)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.

from e58055b  [SPARK-35244][SQL] Invoke should throw the original exception
 add 361e684  [SPARK-33976][SQL][DOCS][3.1] Add a SQL doc page for a TRANSFORM clause

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml                    |   2 +
 docs/sql-ref-syntax-qry-select-transform.md | 235
 docs/sql-ref-syntax-qry-select.md           |   7 +-
 docs/sql-ref-syntax-qry.md                  |   1 +
 docs/sql-ref-syntax.md                      |   1 +
 5 files changed, 245 insertions(+), 1 deletion(-)
 create mode 100644 docs/sql-ref-syntax-qry-select-transform.md
[spark] branch master updated: [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 26a5e33  [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page

26a5e33 is described below

commit 26a5e339a61ab06fb2949166db705f1b575addd3
Author: Angerszh
AuthorDate: Wed Apr 28 16:47:02 2021 +0900

    [SPARK-33976][SQL][DOCS][FOLLOWUP] Fix syntax error in select doc page

    ### What changes were proposed in this pull request?

    Add doc about `TRANSFORM` and related function.

    ### Why are the changes needed?

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Not needed

    Closes #32257 from AngersZh/SPARK-33976-followup.

    Authored-by: Angerszh
    Signed-off-by: Takeshi Yamamuro
---
 docs/sql-ref-syntax-qry-select.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-qry-select.md b/docs/sql-ref-syntax-qry-select.md
index 62a7f5f..500eda1 100644
--- a/docs/sql-ref-syntax-qry-select.md
+++ b/docs/sql-ref-syntax-qry-select.md
@@ -41,7 +41,7 @@ select_statement [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select_stat

 While `select_statement` is defined as
 ```sql
-SELECT [ hints , ... ] [ ALL | DISTINCT ] { [[ named_expression | regex_column_names ] [ , ... ] | TRANSFORM (...)) ] }
+SELECT [ hints , ... ] [ ALL | DISTINCT ] { [ [ named_expression | regex_column_names ] [ , ... ] | TRANSFORM (...) ] }
 FROM { from_item [ , ... ] }
 [ PIVOT clause ]
 [ LATERAL VIEW clause ] [ ... ]
[spark] branch master updated (9af338c -> e503b9c)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 9af338c  [SPARK-35078][SQL] Add tree traversal pruning in expression rules
 add e503b9c  [SPARK-35201][SQL] Format empty grouping set exception in CUBE/ROLLUP

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala     | 6 ++
 .../main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala | 3 +++
 2 files changed, 5 insertions(+), 4 deletions(-)
[spark] branch branch-3.1 updated (034ba76 -> 5f48abe)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 034ba76  [SPARK-35080][SQL] Only allow a subset of correlated equality predicates when a subquery is aggregated
 add 5f48abe  [SPARK-34639][SQL][3.1] RelationalGroupedDataset.alias should not create UnresolvedAlias

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/sql/RelationalGroupedDataset.scala | 6 +-
 sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala   | 3 +++
 2 files changed, 4 insertions(+), 5 deletions(-)
[spark] branch master updated (978cd0b -> fd08c93)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 978cd0b  [SPARK-35092][UI] the auto-generated rdd's name in the storage tab should be truncated if it is too long
 add fd08c93  [SPARK-35109][SQL] Fix minor exception messages of HashedRelation and HashJoin

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/errors/QueryExecutionErrors.scala | 18 ++
 .../spark/sql/execution/joins/HashedRelation.scala     |  6 ++
 2 files changed, 8 insertions(+), 16 deletions(-)
[spark] branch master updated (12abfe7 -> 074f770)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 12abfe7  [SPARK-34716][SQL] Support ANSI SQL intervals by the aggregate function `sum`
 add 074f770  [SPARK-35115][SQL][TESTS] Check ANSI intervals in `MutableProjectionSuite`

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/expressions/MutableProjectionSuite.scala | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)
[spark] branch master updated (26f312e -> caf33be)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 26f312e  [SPARK-35037][SQL] Recognize sign before the interval string in literals
 add caf33be  [SPARK-33411][SQL] Cardinality estimation of union, sort and range operator

No new revisions were added by this update.

Summary of changes:
 .../plans/logical/LogicalPlanVisitor.scala         |   3 +
 .../plans/logical/basicLogicalOperators.scala      |  22 ++-
 .../statsEstimation/BasicStatsPlanVisitor.scala    |  12 +-
 .../SizeInBytesOnlyStatsPlanVisitor.scala          |   2 +
 .../logical/statsEstimation/UnionEstimation.scala  | 120 +
 .../BasicStatsEstimationSuite.scala                | 136 +--
 .../statsEstimation/UnionEstimationSuite.scala     | 194 +
 .../spark/sql/StatisticsCollectionSuite.scala      |   4 +-
 8 files changed, 473 insertions(+), 20 deletions(-)
 create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/UnionEstimation.scala
 create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/UnionEstimationSuite.scala
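The heart of union cardinality estimation is straightforward: a `UNION ALL` outputs the sum of its children's row counts, and the estimate is only usable when every child has a known estimate. A reduced sketch of that rule (in Java as a stand-in for the Scala in `UnionEstimation`; the real rule also merges size and column statistics):

```java
import java.math.BigInteger;
import java.util.List;
import java.util.Optional;

// Sketch of union row-count estimation: sum the children's row counts,
// but propagate "unknown" if any child lacks an estimate.
// Illustrative only - not Spark's UnionEstimation code.
final class UnionStats {
    static Optional<BigInteger> unionRowCount(List<Optional<BigInteger>> childRowCounts) {
        BigInteger total = BigInteger.ZERO;
        for (Optional<BigInteger> c : childRowCounts) {
            if (c.isEmpty()) {
                return Optional.empty(); // one unknown child -> no estimate
            }
            total = total.add(c.get());
        }
        return Optional.of(total);
    }
}
```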
[spark] branch master updated (9c1f807 -> 278203d)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 9c1f807  [SPARK-35031][PYTHON] Port Koalas operations on different frames tests into PySpark
 add 278203d  [SPARK-28227][SQL] Support projection, aggregate/window functions, and lateral view in the TRANSFORM clause

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/parser/SqlBase.g4    |   6 +-
 .../spark/sql/catalyst/analysis/Analyzer.scala     |   9 +-
 .../spark/sql/catalyst/parser/AstBuilder.scala     |  79 --
 .../sql/catalyst/parser/PlanParserSuite.scala      |  18 +-
 .../test/resources/sql-tests/inputs/transform.sql  | 132 +
 .../resources/sql-tests/results/transform.sql.out  | 316 -
 .../spark/sql/execution/SparkSqlParserSuite.scala  | 164 +--
 .../sql/execution/command/DDLParserSuite.scala     |  14 +-
 8 files changed, 662 insertions(+), 76 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-34922][SQL][3.0] Use a relative cost comparison function in the CBO
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new b9ee41f  [SPARK-34922][SQL][3.0] Use a relative cost comparison function in the CBO

b9ee41f is described below

commit b9ee41fa9957631ca0f859ee928358c108fbd9a9
Author: Tanel Kiis
AuthorDate: Thu Apr 8 11:03:59 2021 +0900

    [SPARK-34922][SQL][3.0] Use a relative cost comparison function in the CBO

    ### What changes were proposed in this pull request?

    Changed the cost comparison function of the CBO to use the ratios of row counts and sizes in bytes.

    ### Why are the changes needed?

    In #30965 we changed the CBO cost comparison function so it would be "symmetric": `A.betterThan(B)` now implies that `!B.betterThan(A)`. With that we caused a performance regression in some queries - TPCDS q19, for example.

    The original cost comparison function used the ratios `relativeRows = A.rowCount / B.rowCount` and `relativeSize = A.size / B.size`. The changed function compared "absolute" cost values `costA = w*A.rowCount + (1-w)*A.size` and `costB = w*B.rowCount + (1-w)*B.size`.

    Given the input from wzhfy we decided to go back to the relative values, because otherwise one (size) may overwhelm the other (rowCount). But this time we avoid adding up the ratios.

    Originally `A.betterThan(B) => w*relativeRows + (1-w)*relativeSize < 1` was used. Besides being "non-symmetric", this can also let one ratio overwhelm the other. For `w=0.5`, if `A`'s size (bytes) is at least 2x larger than `B`'s, then no matter how many times more rows the `B` plan has, `B` will always be considered to be better - `0.5*2 + 0.5*0.01 > 1`.

    When working with ratios, it is better to multiply them. The proposed cost comparison function is: `A.betterThan(B) => relativeRows^w * relativeSize^(1-w) < 1`.

    ### Does this PR introduce _any_ user-facing change?

    Comparison of the changed TPCDS v1.4 query execution times at sf=10:

    |      | absolute | multiplicative |         | additive |         |
    | ---- | -------- | -------------- | ------- | -------- | ------- |
    | q12  | 145      | 137            | -5.52%  | 141      | -2.76%  |
    | q13  | 264      | 271            | 2.65%   | 271      | 2.65%   |
    | q17  | 4521     | 4243           | -6.15%  | 4348     | -3.83%  |
    | q18  | 758      | 466            | -38.52% | 480      | -36.68% |
    | q19  | 38503    | 2167           | -94.37% | 2176     | -94.35% |
    | q20  | 119      | 120            | 0.84%   | 126      | 5.88%   |
    | q24a | 16429    | 16838          | 2.49%   | 17103    | 4.10%   |
    | q24b | 16592    | 16999          | 2.45%   | 17268    | 4.07%   |
    | q25  | 3558     | 3556           | -0.06%  | 3675     | 3.29%   |
    | q33  | 362      | 361            | -0.28%  | 380      | 4.97%   |
    | q52  | 1020     | 1032           | 1.18%   | 1052     | 3.14%   |
    | q55  | 927      | 938            | 1.19%   | 961      | 3.67%   |
    | q72  | 24169    | 13377          | -44.65% | 24306    | 0.57%   |
    | q81  | 1285     | 1185           | -7.78%  | 1168     | -9.11%  |
    | q91  | 324      | 336            | 3.70%   | 337      | 4.01%   |
    | q98  | 126      | 129            | 2.38%   | 131      | 3.97%   |

    All times are in ms, the change is compared to the situation in the master branch (absolute).

    The proposed cost function (multiplicative) significantly improves the performance on q18, q19 and q72. The original cost function (additive) has similar improvements at q18 and q19. All other changes are within the error bars and I would ignore them - perhaps q81 has also improved.

    ### How was this patch tested?

    PlanStabilitySuite

    Closes #32076 from tanelk/SPARK-34922_cbo_better_cost_function_3.0.

    Lead-authored-by: Tanel Kiis
    Co-authored-by: tanel.k...@gmail.com
    Signed-off-by: Takeshi Yamamuro
---
 .../catalyst/optimizer/CostBasedJoinReorder.scala  | 28 ++
 .../org/apache/spark/sql/internal/SQLConf.scala    |  6 +++--
 .../sql/catalyst/optimizer/JoinReorderSuite.scala  |  3 ---
 .../optimizer/StarJoinCostBasedReorderSuite.scala  |  9 +++
 4 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
index 93c608dc..ed7d92e 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/CostBasedJoinReorder.scala
@@ -343,12 +343,30 @@ object JoinReorderDP extends PredicateHelper with Logging {
     }
   }

+/**
+ * To identify the plan with smaller computational cost,
+ * we use the weighted geometric mean of ratio of rows and the ratio of sizes in bytes.
+ *
+ * There are other ways to combine these values as a cost comparison function.
+ * Some of these, that we have experimented with, but have gotten worse result,
+ * than with the current one:
+ * 1) Weighted ar
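The commit message's formula can be made concrete in a few lines. The sketch below (class and field names are illustrative, not Spark's actual classes) shows the multiplicative comparison and why it is both symmetric and resistant to one ratio overwhelming the other:

```java
// Sketch of the relative ("multiplicative") cost comparison:
//   A.betterThan(B)  <=>  (rowsA/rowsB)^w * (sizeA/sizeB)^(1-w) < 1
// Multiplying the ratios keeps either one from single-handedly dominating,
// and the relation is symmetric: betterThan(a, b) implies !betterThan(b, a).
// Names are illustrative, not Spark's actual classes.
final class RelativeCost {
    final double rowCount;
    final double sizeBytes;

    RelativeCost(double rowCount, double sizeBytes) {
        this.rowCount = rowCount;
        this.sizeBytes = sizeBytes;
    }

    static boolean betterThan(RelativeCost a, RelativeCost b, double w) {
        double relativeRows = a.rowCount / b.rowCount;
        double relativeSize = a.sizeBytes / b.sizeBytes;
        return Math.pow(relativeRows, w) * Math.pow(relativeSize, 1.0 - w) < 1.0;
    }
}
```

With w = 0.5 this is a comparison of geometric means of the two ratios; the commit's problematic additive case (A twice the size of B, but B with 100x the rows) now correctly prefers A, since sqrt(0.01 * 2) < 1.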
[spark] branch branch-3.1 updated (f6b5c6f -> 84d96e8)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.

from f6b5c6f  [SPARK-34970][SQL][SERCURITY][3.1] Redact map-type options in the output of explain()
 add 84d96e8  [SPARK-34922][SQL][3.1] Use a relative cost comparison function in the CBO

No new revisions were added by this update.

Summary of changes:
 .../catalyst/optimizer/CostBasedJoinReorder.scala  |  28 +-
 .../org/apache/spark/sql/internal/SQLConf.scala    |   6 +-
 .../optimizer/joinReorder/JoinReorderSuite.scala   |   3 -
 .../StarJoinCostBasedReorderSuite.scala            |   9 +-
 .../approved-plans-modified/q73.sf100/explain.txt  |   8 +-
 .../approved-plans-v1_4/q12.sf100/explain.txt      | 174 ++---
 .../approved-plans-v1_4/q12.sf100/simplified.txt   |  52 +-
 .../approved-plans-v1_4/q13.sf100/explain.txt      | 138 ++--
 .../approved-plans-v1_4/q13.sf100/simplified.txt   |  34 +-
 .../approved-plans-v1_4/q18.sf100/explain.txt      | 303
 .../approved-plans-v1_4/q18.sf100/simplified.txt   |  50 +-
 .../approved-plans-v1_4/q19.sf100/explain.txt      | 368 -
 .../approved-plans-v1_4/q19.sf100/simplified.txt   | 116 +--
 .../approved-plans-v1_4/q20.sf100/explain.txt      | 174 ++---
 .../approved-plans-v1_4/q20.sf100/simplified.txt   |  52 +-
 .../approved-plans-v1_4/q24a.sf100/explain.txt     | 832 +++--
 .../approved-plans-v1_4/q24a.sf100/simplified.txt  |  34 +-
 .../approved-plans-v1_4/q24b.sf100/explain.txt     | 832 +++--
 .../approved-plans-v1_4/q24b.sf100/simplified.txt  |  34 +-
 .../approved-plans-v1_4/q25.sf100/explain.txt      | 186 ++---
 .../approved-plans-v1_4/q25.sf100/simplified.txt   | 130 ++--
 .../approved-plans-v1_4/q33.sf100/explain.txt      | 395 +-
 .../approved-plans-v1_4/q33.sf100/simplified.txt   |  58 +-
 .../approved-plans-v1_4/q52.sf100/explain.txt      | 138 ++--
 .../approved-plans-v1_4/q52.sf100/simplified.txt   |  26 +-
 .../approved-plans-v1_4/q55.sf100/explain.txt      | 134 ++--
 .../approved-plans-v1_4/q55.sf100/simplified.txt   |  26 +-
 .../approved-plans-v1_4/q72.sf100/explain.txt      | 260 +++
 .../approved-plans-v1_4/q72.sf100/simplified.txt   | 150 ++--
 .../approved-plans-v1_4/q81.sf100/explain.txt      | 570 +++---
 .../approved-plans-v1_4/q81.sf100/simplified.txt   | 142 ++--
 .../approved-plans-v1_4/q91.sf100/explain.txt      | 304
 .../approved-plans-v1_4/q91.sf100/simplified.txt   |  62 +-
 .../approved-plans-v1_4/q98.sf100/explain.txt      | 182 ++---
 .../approved-plans-v1_4/q98.sf100/simplified.txt   |  52 +-
 .../approved-plans-v2_7/q12.sf100/explain.txt      | 174 ++---
 .../approved-plans-v2_7/q12.sf100/simplified.txt   |  52 +-
 .../approved-plans-v2_7/q18a.sf100/explain.txt     | 737 +-
 .../approved-plans-v2_7/q18a.sf100/simplified.txt  |  54 +-
 .../approved-plans-v2_7/q20.sf100/explain.txt      | 174 ++---
 .../approved-plans-v2_7/q20.sf100/simplified.txt   |  52 +-
 .../approved-plans-v2_7/q72.sf100/explain.txt      | 260 +++
 .../approved-plans-v2_7/q72.sf100/simplified.txt   | 150 ++--
 .../approved-plans-v2_7/q98.sf100/explain.txt      | 178 ++---
 .../approved-plans-v2_7/q98.sf100/simplified.txt   |  52 +-
 45 files changed, 4024 insertions(+), 3921 deletions(-)
[spark] branch master updated (390d5bd -> 7c8dc5e)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 390d5bd  [SPARK-34968][TEST][PYTHON] Add the `-fr` argument to xargs rm
 add 7c8dc5e  [SPARK-34922][SQL] Use a relative cost comparison function in the CBO

No new revisions were added by this update.

Summary of changes:
 .../catalyst/optimizer/CostBasedJoinReorder.scala  |  28 +-
 .../org/apache/spark/sql/internal/SQLConf.scala    |   6 +-
 .../optimizer/joinReorder/JoinReorderSuite.scala   |   3 -
 .../StarJoinCostBasedReorderSuite.scala            |   9 +-
 .../approved-plans-modified/q73.sf100/explain.txt  |  86 +--
 .../q73.sf100/simplified.txt                       |  20 +-
 .../approved-plans-v1_4/q12.sf100/explain.txt      | 178 +++
 .../approved-plans-v1_4/q12.sf100/simplified.txt   |  52 +-
 .../approved-plans-v1_4/q13.sf100/explain.txt      | 134 ++---
 .../approved-plans-v1_4/q13.sf100/simplified.txt   |  38 +-
 .../approved-plans-v1_4/q18.sf100/explain.txt      | 152 +++---
 .../approved-plans-v1_4/q18.sf100/simplified.txt   |  50 +-
 .../approved-plans-v1_4/q19.sf100/explain.txt      | 376 ++---
 .../approved-plans-v1_4/q19.sf100/simplified.txt   | 118 ++---
 .../approved-plans-v1_4/q20.sf100/explain.txt      | 178 +++
 .../approved-plans-v1_4/q20.sf100/simplified.txt   |  52 +-
 .../approved-plans-v1_4/q24a.sf100/explain.txt     | 116 ++--
 .../approved-plans-v1_4/q24a.sf100/simplified.txt  |  34 +-
 .../approved-plans-v1_4/q24b.sf100/explain.txt     | 116 ++--
 .../approved-plans-v1_4/q24b.sf100/simplified.txt  |  34 +-
 .../approved-plans-v1_4/q25.sf100/explain.txt      | 192 +++
 .../approved-plans-v1_4/q25.sf100/simplified.txt   | 138 ++---
 .../approved-plans-v1_4/q33.sf100/explain.txt      | 264 +-
 .../approved-plans-v1_4/q33.sf100/simplified.txt   |  58 +-
 .../approved-plans-v1_4/q52.sf100/explain.txt      | 146 +++---
 .../approved-plans-v1_4/q52.sf100/simplified.txt   |  30 +-
 .../approved-plans-v1_4/q55.sf100/explain.txt      | 142 ++---
 .../approved-plans-v1_4/q55.sf100/simplified.txt   |  30 +-
 .../approved-plans-v1_4/q72.sf100/explain.txt      | 326 ++--
 .../approved-plans-v1_4/q72.sf100/simplified.txt   | 154 +++---
 .../approved-plans-v1_4/q81.sf100/explain.txt      | 582 ++---
 .../approved-plans-v1_4/q81.sf100/simplified.txt   | 146 +++---
 .../approved-plans-v1_4/q91.sf100/explain.txt      | 312 +--
 .../approved-plans-v1_4/q91.sf100/simplified.txt   |  66 +--
 .../approved-plans-v1_4/q98.sf100/explain.txt      | 186 +++
 .../approved-plans-v1_4/q98.sf100/simplified.txt   |  52 +-
 .../approved-plans-v2_7/q12.sf100/explain.txt      | 178 +++
 .../approved-plans-v2_7/q12.sf100/simplified.txt   |  52 +-
 .../approved-plans-v2_7/q18a.sf100/explain.txt     | 172 +++---
 .../approved-plans-v2_7/q18a.sf100/simplified.txt  |  54 +-
 .../approved-plans-v2_7/q20.sf100/explain.txt      | 178 +++
 .../approved-plans-v2_7/q20.sf100/simplified.txt   |  52 +-
 .../approved-plans-v2_7/q72.sf100/explain.txt      | 326 ++--
 .../approved-plans-v2_7/q72.sf100/simplified.txt   | 154 +++---
 .../approved-plans-v2_7/q98.sf100/explain.txt      | 182 +++
 .../approved-plans-v2_7/q98.sf100/simplified.txt   |  52 +-
 46 files changed, 3011 insertions(+), 2993 deletions(-)
[spark] branch master updated (39d5677 -> 7cfface)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 39d5677 [SPARK-34932][SQL] deprecate GROUP BY ... GROUPING SETS (...) and promote GROUP BY GROUPING SETS (...) add 7cfface [SPARK-34935][SQL] CREATE TABLE LIKE should respect the reserved table properties No new revisions were added by this update. Summary of changes: docs/sql-migration-guide.md | 2 ++ .../scala/org/apache/spark/sql/execution/SparkSqlParser.scala | 3 ++- .../org/apache/spark/sql/execution/SparkSqlParserSuite.scala | 8 3 files changed, 12 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (a9ca197 -> 39d5677)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a9ca197 [SPARK-34949][CORE] Prevent BlockManager reregister when Executor is shutting down add 39d5677 [SPARK-34932][SQL] deprecate GROUP BY ... GROUPING SETS (...) and promote GROUP BY GROUPING SETS (...) No new revisions were added by this update. Summary of changes: docs/sql-ref-syntax-qry-select-groupby.md | 34 +++ .../spark/sql/catalyst/analysis/Analyzer.scala | 36 +++ .../spark/sql/catalyst/expressions/grouping.scala | 46 --- .../spark/sql/catalyst/parser/AstBuilder.scala | 13 +++--- .../analysis/ResolveGroupingAnalyticsSuite.scala | 51 +- .../sql/catalyst/parser/PlanParserSuite.scala | 2 +- 6 files changed, 72 insertions(+), 110 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (3951e33 -> 90f2d4d)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 3951e33 [SPARK-34881][SQL] New SQL Function: TRY_CAST add 90f2d4d [SPARK-34882][SQL] Replace if with filter clause in RewriteDistinctAggregates No new revisions were added by this update. Summary of changes: .../optimizer/RewriteDistinctAggregates.scala | 47 ++ .../org/apache/spark/sql/DataFrameSuite.scala | 29 - 2 files changed, 49 insertions(+), 27 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (b2bfe98 -> fcef237)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b2bfe98 [SPARK-34845][CORE] ProcfsMetricsGetter shouldn't return partial procfs metrics add fcef237 [SPARK-34622][SQL] Push down limit through Project with Join No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/optimizer/Optimizer.scala | 33 +- .../catalyst/optimizer/LimitPushdownSuite.scala| 9 ++ 2 files changed, 29 insertions(+), 13 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-34833][SQL][FOLLOWUP] Handle outer references in all the places
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new f3c1298 [SPARK-34833][SQL][FOLLOWUP] Handle outer references in all the places f3c1298 is described below commit f3c129827986ba06c8a9ab00bd687e8d025103d1 Author: Wenchen Fan AuthorDate: Fri Mar 26 09:10:03 2021 +0900 [SPARK-34833][SQL][FOLLOWUP] Handle outer references in all the places ### What changes were proposed in this pull request? This is a follow-up of https://github.com/apache/spark/pull/31940 . This PR generalizes the matching of attributes and outer references, so that outer references are handled everywhere. Note that correlated subqueries currently have many limitations in Spark, so the newly covered cases cannot yet occur in practice; this PR is therefore a code refactor. ### Why are the changes needed? Code cleanup. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing tests. Closes #31959 from cloud-fan/follow. 
Authored-by: Wenchen Fan Signed-off-by: Takeshi Yamamuro (cherry picked from commit 658e95c345d5aa2a98b8d2a854e003a5c77ed581) Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/analysis/Analyzer.scala | 67 +- 1 file changed, 41 insertions(+), 26 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index d490845..600a5af 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -3919,6 +3919,14 @@ object UpdateOuterReferences extends Rule[LogicalPlan] { */ object ApplyCharTypePadding extends Rule[LogicalPlan] { + object AttrOrOuterRef { +def unapply(e: Expression): Option[Attribute] = e match { + case a: Attribute => Some(a) + case OuterReference(a: Attribute) => Some(a) + case _ => None +} + } + override def apply(plan: LogicalPlan): LogicalPlan = { plan.resolveOperatorsUp { case operator => operator.transformExpressionsUp { @@ -3926,27 +3934,17 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] { // String literal is treated as char type when it's compared to a char type column. // We should pad the shorter one to the longer length. 
-case b @ BinaryComparison(attr: Attribute, lit) if lit.foldable => - padAttrLitCmp(attr, attr.metadata, lit).map { newChildren => -b.withNewChildren(newChildren) - }.getOrElse(b) - -case b @ BinaryComparison(lit, attr: Attribute) if lit.foldable => - padAttrLitCmp(attr, attr.metadata, lit).map { newChildren => -b.withNewChildren(newChildren.reverse) - }.getOrElse(b) - -case b @ BinaryComparison(or @ OuterReference(attr: Attribute), lit) if lit.foldable => - padAttrLitCmp(or, attr.metadata, lit).map { newChildren => +case b @ BinaryComparison(e @ AttrOrOuterRef(attr), lit) if lit.foldable => + padAttrLitCmp(e, attr.metadata, lit).map { newChildren => b.withNewChildren(newChildren) }.getOrElse(b) -case b @ BinaryComparison(lit, or @ OuterReference(attr: Attribute)) if lit.foldable => - padAttrLitCmp(or, attr.metadata, lit).map { newChildren => +case b @ BinaryComparison(lit, e @ AttrOrOuterRef(attr)) if lit.foldable => + padAttrLitCmp(e, attr.metadata, lit).map { newChildren => b.withNewChildren(newChildren.reverse) }.getOrElse(b) -case i @ In(attr: Attribute, list) +case i @ In(e @ AttrOrOuterRef(attr), list) if attr.dataType == StringType && list.forall(_.foldable) => CharVarcharUtils.getRawType(attr.metadata).flatMap { case CharType(length) => @@ -3955,7 +3953,7 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] { val literalCharLengths = literalChars.map(_.numChars()) val targetLen = (length +: literalCharLengths).max Some(i.copy( -value = addPadding(attr, length, targetLen), +value = addPadding(e, length, targetLen), list = list.zip(literalCharLengths).map { case (lit, charLength) => addPadding(lit, charLength, targetLen) } ++ nulls.map(Literal.create(_, StringType @@ -3963,19 +3961,36 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] { }.getOrElse(i) // For char type colum
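The key idea in this follow-up is the `AttrOrOuterRef` extractor, which lets a single pattern-match clause cover both a bare attribute and an attribute wrapped in an `OuterReference` — which is why the four near-duplicate `BinaryComparison` cases in the old code collapse into two. A minimal, self-contained sketch of the pattern, using toy stand-ins for Spark's Catalyst expression classes (not the real API):

```scala
// Toy stand-ins for Spark's expression hierarchy; not the real Catalyst API.
sealed trait Expression
case class Attribute(name: String) extends Expression
case class OuterReference(attr: Attribute) extends Expression
case class Literal(value: Any) extends Expression

// Same shape as the extractor in the diff: unwraps an OuterReference if
// present, so one case clause handles both forms.
object AttrOrOuterRef {
  def unapply(e: Expression): Option[Attribute] = e match {
    case a: Attribute      => Some(a)
    case OuterReference(a) => Some(a)
    case _                 => None
  }
}

def describe(e: Expression): String = e match {
  case AttrOrOuterRef(a) => s"attribute ${a.name}"
  case _                 => "something else"
}

// Both forms resolve to the same attribute:
// describe(Attribute("c"))                 => "attribute c"
// describe(OuterReference(Attribute("c"))) => "attribute c"
```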
[spark] branch master updated (6d88212 -> 658e95c)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d88212 [SPARK-34840][SHUFFLE] Fixes cases of corruption in merged shuffle … add 658e95c [SPARK-34833][SQL][FOLLOWUP] Handle outer references in all the places No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 67 +- 1 file changed, 41 insertions(+), 26 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-34833][SQL] Apply right-padding correctly for correlated subqueries
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.1 by this push: new 5ecf306 [SPARK-34833][SQL] Apply right-padding correctly for correlated subqueries 5ecf306 is described below commit 5ecf306245d17053e25b68c844828878a66b593a Author: Takeshi Yamamuro AuthorDate: Thu Mar 25 08:31:57 2021 +0900 [SPARK-34833][SQL] Apply right-padding correctly for correlated subqueries ### What changes were proposed in this pull request? This PR intends to fix a bug where right-padding is not applied for char types inside correlated subqueries. For example, the query below returns nothing in master, but the correct result is `c`. ``` scala> sql(s"CREATE TABLE t1(v VARCHAR(3), c CHAR(5)) USING parquet") scala> sql(s"CREATE TABLE t2(v VARCHAR(5), c CHAR(7)) USING parquet") scala> sql("INSERT INTO t1 VALUES ('c', 'b')") scala> sql("INSERT INTO t2 VALUES ('a', 'b')") scala> val df = sql(""" |SELECT v FROM t1 |WHERE 'a' IN (SELECT v FROM t2 WHERE t2.c = t1.c )""".stripMargin) scala> df.show() +---+ | v| +---+ +---+ ``` This is because `ApplyCharTypePadding` does not handle the case above to apply right-padding into `'abc'`. This PR modifies the code in `ApplyCharTypePadding` to handle it correctly. 
``` // Before this PR: scala> df.explain(true) == Analyzed Logical Plan == v: string Project [v#13] +- Filter a IN (list#12 [c#14]) : +- Project [v#15] : +- Filter (c#16 = outer(c#14)) :+- SubqueryAlias spark_catalog.default.t2 : +- Relation default.t2[v#15,c#16] parquet +- SubqueryAlias spark_catalog.default.t1 +- Relation default.t1[v#13,c#14] parquet scala> df.show() +---+ | v| +---+ +---+ // After this PR: scala> df.explain(true) == Analyzed Logical Plan == v: string Project [v#43] +- Filter a IN (list#42 [c#44]) : +- Project [v#45] : +- Filter (c#46 = rpad(outer(c#44), 7, )) :+- SubqueryAlias spark_catalog.default.t2 : +- Relation default.t2[v#45,c#46] parquet +- SubqueryAlias spark_catalog.default.t1 +- Relation default.t1[v#43,c#44] parquet scala> df.show() +---+ | v| +---+ | c| +---+ ``` This fix is related to TPCDS q17; the query returns nothing because of this bug: https://github.com/apache/spark/pull/31886/files#r599333799 ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Unit tests added. Closes #31940 from maropu/FixCharPadding. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 150769bcedb6e4a97596e0f04d686482cd09e92a) Signed-off-by: Takeshi Yamamuro --- .../spark/sql/catalyst/analysis/Analyzer.scala | 45 ++--- .../apache/spark/sql/CharVarcharTestSuite.scala| 57 -- 2 files changed, 79 insertions(+), 23 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index f4cdeab..d490845 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -3921,16 +3921,28 @@ object ApplyCharTypePadding extends Rule[LogicalPlan] { override def apply(plan: LogicalPlan): LogicalPlan = { plan.resolveOperatorsUp { - case operator if operator.resolved => operator.transformExpressionsUp { + case operator => operator.transformExpressionsUp { +case e if !e.childrenResolved => e + // String literal is treated as char type when it's compared to a char type column. // We should pad the shorter one to the longer length. case b @ BinaryComparison(attr: Attribute, lit) if lit.foldable => - padAttrLitCmp(attr, lit).map { newChildren => + padAttrLitCmp(attr, attr.metadata, lit).map { newChildren => b.withNewChildren(newChildren) }.getOrElse(b) case b @ BinaryComparison(lit, attr: Attribute) if lit.foldable => - padAttrLitCmp(attr, lit).map { newChildren => + padAttrLitCmp(attr, attr.metadata, lit).map { newChildren => +b.withNewChildren(newChildren.reverse)
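The mechanics of the fix can be sketched outside Spark: CHAR values are stored padded to their own declared length, so comparing columns with different declared lengths requires an explicit `rpad` on the shorter side. The helper below is illustrative only, not Spark's implementation:

```scala
// Illustrative right-padding helper; not Spark's implementation.
def rpad(s: String, len: Int, pad: Char = ' '): String =
  if (s.length >= len) s else s + pad.toString * (len - s.length)

// t1.c is CHAR(5) and t2.c is CHAR(7): each stored value is padded to its
// own declared length, so raw equality never holds.
val t1c = rpad("b", 5) // "b    "
val t2c = rpad("b", 7) // "b      "
assert(t1c != t2c)

// Padding the outer reference to the longer declared length -- this is the
// rpad(outer(c#44), 7, ...) that now appears in the analyzed plan -- fixes it.
assert(rpad(t1c, 7) == t2c)
```

In the plan output above, the pad character is a space, which is why the third argument of `rpad` prints as blank.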
[spark] branch master updated (88cf86f -> 150769b)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 88cf86f [SPARK-34797][ML] Refactor Logistic Aggregator - support virtual centering add 150769b [SPARK-34833][SQL] Apply right-padding correctly for correlated subqueries No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 45 ++--- .../apache/spark/sql/CharVarcharTestSuite.scala| 57 -- 2 files changed, 79 insertions(+), 23 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-34853][SQL] Remove duplicated definition of output partitioning/ordering for limit operator
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 35c70e4 [SPARK-34853][SQL] Remove duplicated definition of output partitioning/ordering for limit operator 35c70e4 is described below commit 35c70e417d8c6e3958e0da8a4bec731f9e394a28 Author: Cheng Su AuthorDate: Wed Mar 24 23:06:35 2021 +0900 [SPARK-34853][SQL] Remove duplicated definition of output partitioning/ordering for limit operator ### What changes were proposed in this pull request? Both local limit and global limit define the output partitioning and output ordering in the same way, which is duplicated (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala#L159-L175 ). We can move the output partitioning and ordering into their parent trait - `BaseLimitExec`. This is doable as `BaseLimitExec` has no other child classes. This is a minor code refactoring. ### Why are the changes needed? Clean up the code a little bit. Better readability. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pure refactoring. Rely on existing unit tests. Closes #31950 from c21/limit-cleanup. 
Authored-by: Cheng Su Signed-off-by: Takeshi Yamamuro --- .../main/scala/org/apache/spark/sql/execution/limit.scala | 15 +-- 1 file changed, 5 insertions(+), 10 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala index d8f67fb..e5a2995 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala @@ -113,6 +113,10 @@ object BaseLimitExec { trait BaseLimitExec extends LimitExec with CodegenSupport { override def output: Seq[Attribute] = child.output + override def outputPartitioning: Partitioning = child.outputPartitioning + + override def outputOrdering: Seq[SortOrder] = child.outputOrdering + protected override def doExecute(): RDD[InternalRow] = child.execute().mapPartitions { iter => iter.take(limit) } @@ -156,12 +160,7 @@ trait BaseLimitExec extends LimitExec with CodegenSupport { /** * Take the first `limit` elements of each child partition, but do not collect or shuffle them. */ -case class LocalLimitExec(limit: Int, child: SparkPlan) extends BaseLimitExec { - - override def outputOrdering: Seq[SortOrder] = child.outputOrdering - - override def outputPartitioning: Partitioning = child.outputPartitioning -} +case class LocalLimitExec(limit: Int, child: SparkPlan) extends BaseLimitExec /** * Take the first `limit` elements of the child's single output partition. 
@@ -169,10 +168,6 @@ case class LocalLimitExec(limit: Int, child: SparkPlan) extends BaseLimitExec { case class GlobalLimitExec(limit: Int, child: SparkPlan) extends BaseLimitExec { override def requiredChildDistribution: List[Distribution] = AllTuples :: Nil - - override def outputPartitioning: Partitioning = child.outputPartitioning - - override def outputOrdering: Seq[SortOrder] = child.outputOrdering } /** - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
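The refactoring pattern here is plain Scala: when every concrete subclass overrides a member identically, the override can be lifted into the shared parent trait, and the subclasses shrink to bare case classes. A toy sketch with simplified types (not Spark's physical-plan classes):

```scala
// Simplified stand-in for a child physical plan; not Spark's SparkPlan.
case class Plan(partitioning: String, ordering: Seq[String])

trait BaseLimit {
  def child: Plan
  // Previously duplicated in both concrete classes below; defined once here.
  def outputPartitioning: String = child.partitioning
  def outputOrdering: Seq[String] = child.ordering
}

// As in the diff, the concrete classes no longer need bodies of their own.
case class LocalLimit(limit: Int, child: Plan) extends BaseLimit
case class GlobalLimit(limit: Int, child: Plan) extends BaseLimit
```

`GlobalLimitExec` still overrides `requiredChildDistribution` separately, since that requirement genuinely differs between the two operators.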
[spark] branch branch-3.1 updated (da013d0 -> 250c820)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from da013d0 [MINOR][DOCS][ML] Doc 'mode' as a supported Imputer strategy in Pyspark add 250c820 [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce() No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/sql/execution/limit.scala | 12 .../scala/org/apache/spark/sql/SQLQuerySuite.scala| 19 +++ 2 files changed, 27 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-34776][SQL][3.0][2.4] Window class should override producedAttributes
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 59e4ae4 [SPARK-34776][SQL][3.0][2.4] Window class should override producedAttributes 59e4ae4 is described below commit 59e4ae4149ff93bd64c8b3210c27dc2fbebe2a96 Author: Liang-Chi Hsieh AuthorDate: Sat Mar 20 11:26:01 2021 +0900 [SPARK-34776][SQL][3.0][2.4] Window class should override producedAttributes ### What changes were proposed in this pull request? This patch proposes to override `producedAttributes` of the `Window` class. ### Why are the changes needed? This is a backport of #31897 to branch-3.0/2.4. Unlike the original PR, nested column pruning does not allow pushing through `Window` in branch-3.0/2.4 yet. But `Window` doesn't override `producedAttributes`; this is wrong and could cause potential issues, so the `Window`-related change is backported. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests. Closes #31904 from viirya/SPARK-34776-3.0. 
Authored-by: Liang-Chi Hsieh Signed-off-by: Takeshi Yamamuro (cherry picked from commit 828cf76bced1b70769b0453f3e9ba95faaa84e39) Signed-off-by: Takeshi Yamamuro --- .../apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala index a0086c1..2fe9cd4 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala @@ -621,6 +621,8 @@ case class Window( override def output: Seq[Attribute] = child.output ++ windowExpressions.map(_.toAttribute) + override def producedAttributes: AttributeSet = windowOutputSet + def windowOutputSet: AttributeSet = AttributeSet(windowExpressions.map(_.toAttribute)) } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
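What `producedAttributes` means can be sketched with plain sets — a simplification of Spark's `AttributeSet`, where the real method also feeds `missingInput` checks:

```scala
// Plain-Set sketch of the fix; Spark uses AttributeSet, not Set[String].
case class ToyWindow(childOutput: Set[String], windowExprs: Set[String]) {
  // Output = the child's columns plus the window columns this node computes.
  def output: Set[String] = childOutput ++ windowExprs
  // The fix: window columns are *produced* by this node itself. Without the
  // override this defaults to empty, as if they had to come from the child.
  def producedAttributes: Set[String] = windowExprs
  // Attributes that must actually be supplied by the child:
  def references: Set[String] = output -- producedAttributes
}

// ToyWindow(Set("a", "b"), Set("rank")).references == Set("a", "b")
```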
[spark] branch branch-3.0 updated (25d7219 -> 828cf76)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 25d7219 [SPARK-34719][SQL][3.0] Correctly resolve the view query with duplicated column names add 828cf76 [SPARK-34776][SQL][3.0][2.4] Window class should override producedAttributes No new revisions were added by this update. Summary of changes: .../apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala | 2 ++ 1 file changed, 2 insertions(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (620cae0 -> 2ff0032)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 620cae0 [SPARK-33122][SQL] Remove redundant aggregates in the Optimzier add 2ff0032 [SPARK-34796][SQL] Initialize counter variable for LIMIT code-gen in doProduce() No new revisions were added by this update. Summary of changes: .../scala/org/apache/spark/sql/execution/limit.scala | 12 .../scala/org/apache/spark/sql/SQLQuerySuite.scala| 19 +++ 2 files changed, 27 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (7a8a600 -> 620cae0)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 7a8a600 [SPARK-34776][SQL] Nested column pruning should not prune Window produced attributes add 620cae0 [SPARK-33122][SQL] Remove redundant aggregates in the Optimzier No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 50 --- .../analysis/PullOutNondeterministic.scala | 74 ++ .../spark/sql/catalyst/optimizer/Optimizer.scala | 45 ++ .../plans/logical/basicLogicalOperators.scala | 2 +- .../optimizer/RemoveRedundantAggregatesSuite.scala | 163 + .../execution/RemoveRedundantProjectsSuite.scala | 2 +- 6 files changed, 284 insertions(+), 52 deletions(-) create mode 100644 sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/PullOutNondeterministic.scala create mode 100644 sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/RemoveRedundantAggregatesSuite.scala - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-34719][SQL][3.0] Correctly resolve the view query with duplicated column names
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 25d7219 [SPARK-34719][SQL][3.0] Correctly resolve the view query with duplicated column names 25d7219 is described below commit 25d72191de7c842aa2acd4b7307ba8e6585dd182 Author: Wenchen Fan AuthorDate: Sat Mar 20 11:09:50 2021 +0900 [SPARK-34719][SQL][3.0] Correctly resolve the view query with duplicated column names backport https://github.com/apache/spark/pull/31811 to 3.0 ### What changes were proposed in this pull request? For permanent views (and the new SQL temp view in Spark 3.1), we store the view SQL text and re-parse/analyze the view SQL text when reading the view. In the case of `SELECT * FROM ...`, we want to avoid view schema change (e.g. the referenced table changes its schema) and will record the view query output column names when creating the view, so that when reading the view we can add a `SELECT recorded_column_names FROM ...` to retain the original view query schema. In Spark 3.1 and before, the final SELECT is added after the analysis phase: https://github.com/apache/spark/blob/branch-3.1/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala#L67 If the view query has duplicated output column names, we always pick the first column when reading a view. 
A simple repro: ``` scala> sql("create view c(x, y) as select 1 a, 2 a") res0: org.apache.spark.sql.DataFrame = [] scala> sql("select * from c").show +---+---+ | x| y| +---+---+ | 1| 1| +---+---+ ``` In the master branch, we will fail at the view reading time due to https://github.com/apache/spark/commit/b891862fb6b740b103d5a09530626ee4e0e8f6e3 , which adds the final SELECT during analysis, so that the query fails with `Reference 'a' is ambiguous` This PR proposes to resolve the view query output column names from the matching attributes by ordinal. For example, `create view c(x, y) as select 1 a, 2 a`, the view query output column names are `[a, a]`. When we reading the view, there are 2 matching attributes (e.g.`[a#1, a#2]`) and we can simply match them by ordinal. A negative example is ``` create table t(a int) create view v as select *, 1 as col from t replace table t(a int, col int) ``` When reading the view, the view query output column names are `[a, col]`, and there are two matching attributes of `col`, and we should fail the query. See the tests for details. ### Why are the changes needed? bug fix ### Does this PR introduce _any_ user-facing change? yes ### How was this patch tested? new test Closes #31894 from cloud-fan/backport. 
Authored-by: Wenchen Fan Signed-off-by: Takeshi Yamamuro --- .../apache/spark/sql/catalyst/analysis/view.scala | 44 ++--- .../apache/spark/sql/execution/SQLViewSuite.scala | 45 +- 2 files changed, 82 insertions(+), 7 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala index 6560164..013a303 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/view.scala @@ -17,7 +17,10 @@ package org.apache.spark.sql.catalyst.analysis -import org.apache.spark.sql.catalyst.expressions.Alias +import java.util.Locale + +import org.apache.spark.sql.AnalysisException +import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute} import org.apache.spark.sql.catalyst.plans.logical.{LogicalPlan, Project, View} import org.apache.spark.sql.catalyst.rules.Rule import org.apache.spark.sql.internal.SQLConf @@ -60,15 +63,44 @@ object EliminateView extends Rule[LogicalPlan] with CastSupport { // The child has the different output attributes with the View operator. Adds a Project over // the child of the view. case v @ View(desc, output, child) if child.resolved && !v.sameOutput(child) => + // Use the stored view query output column names to find the matching attributes. The column + // names may have duplication, e.g. `CREATE VIEW v(x, y) AS SELECT 1 col, 2 col`. We need to + // make sure the that matching attributes have the same number of duplications, and pick the + // corresponding attribute by ordinal. val resolver = conf.resolver val queryColumnNames = desc.viewQueryColumnNames val queryOutput = if (queryColumnNames.nonEmpty) { -// Find the attribute that has the expected at
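The by-ordinal matching described above can be sketched as follows. The helper is hypothetical — Spark's real code works on `Attribute`s and the session's configured resolver — but it captures both rules: pick the i-th duplicate by ordinal, and fail when the duplication counts disagree (the negative `replace table` example above):

```scala
// Attributes modeled as (name, exprId) pairs; hypothetical helper, not
// Spark's actual implementation.
def resolveByOrdinal(recorded: Seq[String],
                     attrs: Seq[(String, Int)]): Seq[(String, Int)] =
  recorded.zipWithIndex.map { case (name, i) =>
    // How many earlier recorded columns share this name fixes the ordinal.
    val ordinal  = recorded.take(i).count(_.equalsIgnoreCase(name))
    val expected = recorded.count(_.equalsIgnoreCase(name))
    val matches  = attrs.filter { case (n, _) => n.equalsIgnoreCase(name) }
    // Duplication counts must agree, otherwise the view is ambiguous.
    require(matches.length == expected,
      s"cannot resolve view column '$name' unambiguously")
    matches(ordinal)
  }

// CREATE VIEW c(x, y) AS SELECT 1 a, 2 a: recorded = ["a", "a"] and the
// matching attributes [("a", 1), ("a", 2)] are picked by ordinal, instead
// of the first match being taken twice.
```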
[spark] branch branch-3.1 updated (1b70aad -> c2629a7)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.1 in repository https://gitbox.apache.org/repos/asf/spark.git. from 1b70aad [SPARK-34747][SQL][DOCS] Add virtual operators to the built-in function document add c2629a7 [SPARK-34719][SQL][3.1] Correctly resolve the view query with duplicated column names No new revisions were added by this update. Summary of changes: .../apache/spark/sql/catalyst/analysis/view.scala | 44 +--- .../spark/sql/execution/SQLViewTestSuite.scala | 48 ++ 2 files changed, 86 insertions(+), 6 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-34781][SQL] Eliminate LEFT SEMI/ANTI joins to its left child side in AQE
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 8207e2f [SPARK-34781][SQL] Eliminate LEFT SEMI/ANTI joins to its left child side in AQE 8207e2f is described below commit 8207e2f65cc2ce2d87ee60ee05a2c1ee896cf93e Author: Cheng Su AuthorDate: Fri Mar 19 09:41:52 2021 +0900 [SPARK-34781][SQL] Eliminate LEFT SEMI/ANTI joins to its left child side in AQE ### What changes were proposed in this pull request? In `EliminateJoinToEmptyRelation.scala`, we can extend it to cover more cases for LEFT SEMI and LEFT ANTI joins: * The join is a left semi join, its right side is non-empty, and the join condition is empty: eliminate the join to its left side. * The join is a left anti join and its right side is empty: eliminate the join to its left side. Since we now eliminate joins to their left side here, the rule is renamed to `EliminateUnnecessaryJoin`. In addition, we also change to use `checkRowCount()` to check the runtime row count, instead of using `EmptyHashedRelation`, so this can cover `BroadcastNestedLoopJoin` as well. (`BroadcastNestedLoopJoin`'s broadcast side is `Array[InternalRow]`, not `HashedRelation`). ### Why are the changes needed? Cover more join cases, and improve query performance for affected queries. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Added unit tests in `AdaptiveQueryExecSuite.scala`. Closes #31873 from c21/aqe-join. 
Authored-by: Cheng Su Signed-off-by: Takeshi Yamamuro --- .../sql/execution/adaptive/AQEOptimizer.scala | 2 +- .../adaptive/EliminateJoinToEmptyRelation.scala| 71 - .../adaptive/EliminateUnnecessaryJoin.scala| 91 ++ .../spark/sql/DynamicPartitionPruningSuite.scala | 2 +- .../adaptive/AdaptiveQueryExecSuite.scala | 51 5 files changed, 127 insertions(+), 90 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala index 04b8ade..901637d 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AQEOptimizer.scala @@ -29,7 +29,7 @@ class AQEOptimizer(conf: SQLConf) extends RuleExecutor[LogicalPlan] { private val defaultBatches = Seq( Batch("Demote BroadcastHashJoin", Once, DemoteBroadcastHashJoin), -Batch("Eliminate Join to Empty Relation", Once, EliminateJoinToEmptyRelation) +Batch("Eliminate Unnecessary Join", Once, EliminateUnnecessaryJoin) ) final override protected def batches: Seq[Batch] = { diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/EliminateJoinToEmptyRelation.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/EliminateJoinToEmptyRelation.scala deleted file mode 100644 index d6df522..000 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/EliminateJoinToEmptyRelation.scala +++ /dev/null @@ -1,71 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. 
You may obtain a copy of the License at - * - *http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -package org.apache.spark.sql.execution.adaptive - -import org.apache.spark.sql.catalyst.planning.ExtractSingleColumnNullAwareAntiJoin -import org.apache.spark.sql.catalyst.plans.{Inner, LeftAnti, LeftSemi} -import org.apache.spark.sql.catalyst.plans.logical.{Join, LocalRelation, LogicalPlan} -import org.apache.spark.sql.catalyst.rules.Rule -import org.apache.spark.sql.execution.joins.{EmptyHashedRelation, HashedRelation, HashedRelationWithAllNullKeys} - -/** - * This optimization rule detects and converts a Join to an empty [[LocalRelation]]: - * 1. Join is single column NULL-aware anti join (NAAJ), and broadcasted [[HashedRelation]] - *is [[HashedRelationWithAllNullKeys]]. - * - * 2. Join is in
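The two new elimination cases listed in the commit message reduce to a small decision over the join type, the run-time row count of the right side, and whether a join condition is present. The sketch below is plain Scala with illustrative names (`shouldEliminateToLeftChild` is not Spark's actual API); it only models the run-time check the PR describes, not the Catalyst rule itself.

```scala
// Hypothetical model of the new decision logic; names are illustrative.
sealed trait SketchJoinType
case object LeftSemi extends SketchJoinType
case object LeftAnti extends SketchJoinType

// rightRowCount: run-time row count of the (broadcast) right side as seen by
// AQE; hasCondition: whether the join carries a predicate.
def shouldEliminateToLeftChild(
    joinType: SketchJoinType,
    rightRowCount: Long,
    hasCondition: Boolean): Boolean = joinType match {
  // LEFT SEMI: every left row survives when the right side is non-empty and
  // there is no condition left to evaluate.
  case LeftSemi => rightRowCount > 0 && !hasCondition
  // LEFT ANTI: every left row survives when the right side is empty,
  // regardless of the condition.
  case LeftAnti => rightRowCount == 0
}

assert(shouldEliminateToLeftChild(LeftSemi, rightRowCount = 10L, hasCondition = false))
assert(!shouldEliminateToLeftChild(LeftSemi, rightRowCount = 10L, hasCondition = true))
assert(shouldEliminateToLeftChild(LeftAnti, rightRowCount = 0L, hasCondition = true))
assert(!shouldEliminateToLeftChild(LeftAnti, rightRowCount = 5L, hasCondition = false))
```

Note that a LEFT SEMI join with an *empty* right side is not handled here: that case collapses to an empty relation, which is what the pre-existing part of the rule already did.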
[spark] branch branch-3.1 updated: [SPARK-34749][SQL][3.1] Simplify ResolveCreateNamedStruct
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new 448b8d0  [SPARK-34749][SQL][3.1] Simplify ResolveCreateNamedStruct
448b8d0 is described below

commit 448b8d07df41040058c21e6102406e1656727599
Author: Wenchen Fan
AuthorDate: Thu Mar 18 07:44:11 2021 +0900

    [SPARK-34749][SQL][3.1] Simplify ResolveCreateNamedStruct

    backports https://github.com/apache/spark/pull/31843

    ### What changes were proposed in this pull request?

    This is a follow-up of https://github.com/apache/spark/pull/31808 and simplifies its fix to one line (excluding comments).

    ### Why are the changes needed?

    code simplification

    ### Does this PR introduce _any_ user-facing change?

    no

    ### How was this patch tested?

    N/A

    Closes #31867 from cloud-fan/backport.

    Authored-by: Wenchen Fan
    Signed-off-by: Takeshi Yamamuro
---
 .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala   |  2 --
 .../spark/sql/catalyst/expressions/complexTypeCreator.scala | 10 +-
 .../sql/catalyst/expressions/complexTypeExtractors.scala    | 11 +--
 .../spark/sql/catalyst/parser/ExpressionParserSuite.scala   |  2 +-
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
index f98f33b..f4cdeab 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
@@ -3840,8 +3840,6 @@ object ResolveCreateNamedStruct extends Rule[LogicalPlan] {
       val children = e.children.grouped(2).flatMap {
         case Seq(NamePlaceholder, e: NamedExpression) if e.resolved =>
           Seq(Literal(e.name), e)
-        case Seq(NamePlaceholder, e: ExtractValue) if e.resolved && e.name.isDefined =>
-          Seq(Literal(e.name.get), e)
         case kv => kv
       }
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
index cb59fbd..1779d41 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
@@ -20,7 +20,7 @@ package org.apache.spark.sql.catalyst.expressions
 import scala.collection.mutable.ArrayBuffer

 import org.apache.spark.sql.catalyst.InternalRow
-import org.apache.spark.sql.catalyst.analysis.{Resolver, TypeCheckResult, TypeCoercion, UnresolvedExtractValue}
+import org.apache.spark.sql.catalyst.analysis.{Resolver, TypeCheckResult, TypeCoercion, UnresolvedAttribute, UnresolvedExtractValue}
 import org.apache.spark.sql.catalyst.analysis.FunctionRegistry.{FUNC_ALIAS, FunctionBuilder}
 import org.apache.spark.sql.catalyst.expressions.codegen._
 import org.apache.spark.sql.catalyst.expressions.codegen.Block._
@@ -336,6 +336,14 @@ object CreateStruct {
    */
   def apply(children: Seq[Expression]): CreateNamedStruct = {
     CreateNamedStruct(children.zipWithIndex.flatMap {
+      // For multi-part column name like `struct(a.b.c)`, it may be resolved into:
+      //   1. Attribute if `a.b.c` is simply a qualified column name.
+      //   2. GetStructField if `a.b` refers to a struct-type column.
+      //   3. GetArrayStructFields if `a.b` refers to a array-of-struct-type column.
+      //   4. GetMapValue if `a.b` refers to a map-type column.
+      // We should always use the last part of the column name (`c` in the above example) as the
+      // alias name inside CreateNamedStruct.
+      case (u: UnresolvedAttribute, _) => Seq(Literal(u.nameParts.last), u)
       case (e: NamedExpression, _) if e.resolved => Seq(Literal(e.name), e)
       case (e: NamedExpression, _) => Seq(NamePlaceholder, e)
       case (e, index) => Seq(Literal(s"col${index + 1}"), e)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
index 9b80140..ef247ef 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
@@ -94,10 +94,7 @@ object ExtractValue {
     }
   }

-trait ExtractValue extends Expression {
-  // The name that is used to extract the value.
-  def name: Option[String]
-}
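The naming behaviour described in the new comment — `struct(a.b.c)` produces a field named `c` (the last name part), and unnamed expressions fall back to `col1`, `col2`, … — can be illustrated with a tiny stand-alone function. `fieldName` below is a hypothetical helper over (optional name parts, position) pairs, not Spark's actual code.

```scala
// Sketch of the field-naming cases in CreateStruct.apply shown in the diff
// above, reduced to plain Scala. Illustrative only.
def fieldName(nameParts: Option[Seq[String]], index: Int): String =
  nameParts match {
    // struct(a.b.c): use the last part of the multi-part name as the alias.
    case Some(parts) if parts.nonEmpty => parts.last
    // Unnamed expression, e.g. struct(1 + 1): fall back to col1, col2, ...
    case _ => s"col${index + 1}"
  }

assert(fieldName(Some(Seq("a", "b", "c")), 0) == "c")
assert(fieldName(Some(Seq("col")), 0) == "col")
assert(fieldName(None, 0) == "col1")
assert(fieldName(None, 2) == "col3")
```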
[spark] branch master updated (bf4570b -> 9f7b0a0)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from bf4570b  [SPARK-34749][SQL] Simplify ResolveCreateNamedStruct
add  9f7b0a0  [SPARK-34758][SQL] Simplify Analyzer.resolveLiteralFunction

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala | 29 ++
 1 file changed, 7 insertions(+), 22 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (48637a9 -> bf4570b)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 48637a9  [SPARK-34766][SQL] Do not capture maven config for views
add  bf4570b  [SPARK-34749][SQL] Simplify ResolveCreateNamedStruct

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala |  2 --
 .../sql/catalyst/expressions/complexTypeCreator.scala     | 10 +-
 .../sql/catalyst/expressions/complexTypeExtractors.scala  | 14 +-
 .../spark/sql/catalyst/parser/ExpressionParserSuite.scala |  2 +-
 4 files changed, 11 insertions(+), 17 deletions(-)
[spark] branch master updated: [SPARK-34665][SQL][DOCS] Revise the type coercion section of ANSI Compliance
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new ee756fd  [SPARK-34665][SQL][DOCS] Revise the type coercion section of ANSI Compliance
ee756fd is described below

commit ee756fd69528f90f63ffd45edc821c6b69a8a35e
Author: Gengliang Wang
AuthorDate: Tue Mar 9 13:19:14 2021 +0900

    [SPARK-34665][SQL][DOCS] Revise the type coercion section of ANSI Compliance

    ### What changes were proposed in this pull request?

    1. Fix the table of valid type coercion combinations. Binary type should be allowed casting to String type and disallowed casting to Numeric types.
    2. Summarize all the `CAST`s that can cause runtime exceptions.

    ### Why are the changes needed?

    Fix a mistake in the docs.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    Run `jekyll serve` and preview:
    ![image](https://user-images.githubusercontent.com/1097932/110334374-8fab5a80-7fd7-11eb-86e7-c519cfa41b99.png)

    Closes #31781 from gengliangwang/reviseAnsiDoc2.

    Authored-by: Gengliang Wang
    Signed-off-by: Takeshi Yamamuro
---
 docs/sql-ref-ansi-compliance.md | 22 +-
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index 99e230b..4b3ff46 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -72,16 +72,23 @@ The type conversion of Spark ANSI mode follows the syntax rules of section 6.13

 | Source\Target | Numeric | String | Date | Timestamp | Interval | Boolean | Binary | Array | Map | Struct |
 |---|-||--|---|--|-||---|-||
-| Numeric | Y | Y | N| N | N| Y | N | N | N | N |
-| String| Y | Y | Y| Y | Y| Y | Y | N | N | N |
+| Numeric | **Y** | Y | N| N | N| Y | N | N | N | N |
+| String| **Y** | Y | **Y** | **Y** | **Y** | **Y** | Y | N | N | N |
 | Date | N | Y | Y| Y | N| N | N | N | N | N |
 | Timestamp | N | Y | Y| Y | N| N | N | N | N | N |
 | Interval | N | Y | N| N | Y| N | N | N | N | N |
 | Boolean | Y | Y | N| N | N| Y | N | N | N | N |
-| Binary| Y | N | N| N | N| N | Y | N | N | N |
-| Array | N | N | N| N | N| N | N | Y | N | N |
-| Map | N | N | N| N | N| N | N | N | Y | N |
-| Struct| N | N | N| N | N| N | N | N | N | Y |
+| Binary| N | Y | N| N | N| N | Y | N | N | N |
+| Array | N | N | N| N | N| N | N | **Y** | N | N |
+| Map | N | N | N| N | N| N | N | N | **Y** | N |
+| Struct| N | N | N| N | N| N | N | N | N | **Y** |
+
+In the table above, all the `CAST`s that can cause runtime exceptions are marked as red **Y**:
+* CAST(Numeric AS Numeric): raise an overflow exception if the value is out of the target data type's range.
+* CAST(String AS (Numeric/Date/Timestamp/Interval/Boolean)): raise a runtime exception if the value can't be parsed as the target data type.
+* CAST(Array AS Array): raise an exception if there is any on the conversion of the elements.
+* CAST(Map AS Map): raise an exception if there is any on the conversion of the keys and the values.
+* CAST(Struct AS Struct): raise an exception if there is any on the conversion of the struct fields.

 Currently, the ANSI mode affects explicit casting and assignment casting only.
 In future releases, the behaviour of type coercion might change along with the other two type conversion rules.
@@ -163,9 +170,6 @@ The behavior of some SQL functions can be different under ANSI mode (`spark.sql.

 The behavior of some SQL operators can be different under ANSI mode (`spark.sql.ansi.enabled=true`).
   - `array_col[index]`: This operator throws `ArrayIndexOutOfBoundsException` if using invalid indices.
   - `map_col[key]`: This operator throws `NoSuchElementException` if key does not exist in map.
-  - `CAST(string_col AS TIMESTAMP)`: This operator should fail with an except
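Two of the failing-cast behaviours listed in the revised docs can be modelled in a few lines of plain Scala. This is a rough sketch of the *semantics* (overflow raises instead of wrapping, an unparseable string raises instead of yielding null), not Spark's actual ANSI cast implementation; the function names are illustrative.

```scala
import scala.util.Try

// ANSI-style CAST(Long AS Int): out-of-range values raise instead of wrapping.
def ansiCastLongToInt(v: Long): Int =
  Math.toIntExact(v) // throws ArithmeticException when v is outside Int range

// ANSI-style CAST(String AS Int): unparseable input raises instead of
// silently producing null, as the pre-ANSI behaviour would.
def ansiCastStringToInt(s: String): Int = s.trim.toInt

assert(ansiCastLongToInt(42L) == 42)
assert(Try(ansiCastLongToInt(Long.MaxValue)).isFailure) // overflow -> exception
assert(ansiCastStringToInt(" 7 ") == 7)
assert(Try(ansiCastStringToInt("abc")).isFailure)       // can't be parsed -> exception
```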
[spark] branch master updated (f72b906 -> 1a97224)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from f72b906  [SPARK-34643][R][DOCS] Use CRAN URL in canonical form
add  1a97224  [SPARK-34595][SQL] DPP support RLIKE

No new revisions were added by this update.

Summary of changes:
 .../dynamicpruning/PartitionPruning.scala        |  2 +-
 .../spark/sql/DynamicPartitionPruningSuite.scala | 26 ++
 2 files changed, 27 insertions(+), 1 deletion(-)
[spark] branch master updated (9ac5ee2e -> dbce74d)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 9ac5ee2e  [SPARK-32924][WEBUI] Make duration column in master UI sorted in the correct order
add  dbce74d   [SPARK-34607][SQL] Add `Utils.isMemberClass` to fix a malformed class name error on jdk8u

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/util/Utils.scala   | 28 +
 .../spark/sql/catalyst/encoders/OuterScopes.scala  |  2 +-
 .../sql/catalyst/expressions/objects/objects.scala |  2 +-
 .../catalyst/encoders/ExpressionEncoderSuite.scala | 70 ++
 4 files changed, 100 insertions(+), 2 deletions(-)
[spark] branch master updated (499f620 -> 56edb81)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 499f620  [MINOR][SQL][DOCS] Fix some wrong default values in SQL tuning guide's AQE section
add  56edb81  [SPARK-33474][SQL] Support TypeConstructed partition spec value

No new revisions were added by this update.

Summary of changes:
 docs/sql-migration-guide.md                        |  2 +
 docs/sql-ref-syntax-ddl-alter-table.md             |  8 ++--
 docs/sql-ref-syntax-dml-insert-into.md             | 15 ++-
 docs/sql-ref-syntax-dml-insert-overwrite-table.md  | 25 ++-
 .../spark/sql/catalyst/parser/AstBuilder.scala     | 14 +--
 .../spark/sql/catalyst/parser/DDLParserSuite.scala | 30 --
 .../org/apache/spark/sql/SQLInsertTestSuite.scala  | 48 ++
 .../command/AlterTableAddPartitionSuiteBase.scala  |  8
 .../command/AlterTableDropPartitionSuiteBase.scala | 10 +
 .../AlterTableRenamePartitionSuiteBase.scala       | 11 +
 10 files changed, 158 insertions(+), 13 deletions(-)
[spark] branch master updated (499cc79 -> b13a4b8)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 499cc79  [SPARK-34503][DOCS][FOLLOWUP] Document available codecs for event log compression
add  b13a4b8  [SPARK-34573][SQL] Avoid global locking in SQLConf object for sqlConfEntries map

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/internal/SQLConf.scala | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)
[spark] branch master updated (d574308 -> 1afe284)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from d574308  [SPARK-34579][SQL][TEST] Fix wrong UT in SQLQuerySuite
add  1afe284  [SPARK-34570][SQL] Remove dead code from constructors of [Hive]SessionStateBuilder

No new revisions were added by this update.

Summary of changes:
 .../src/main/scala/org/apache/spark/sql/SparkSession.scala    | 11 ---
 .../apache/spark/sql/internal/BaseSessionStateBuilder.scala   |  3 +--
 .../scala/org/apache/spark/sql/internal/SessionState.scala    |  7 +++
 .../test/scala/org/apache/spark/sql/test/TestSQLContext.scala |  9 -
 .../org/apache/spark/sql/hive/HiveSessionStateBuilder.scala   |  7 +++
 .../test/scala/org/apache/spark/sql/hive/test/TestHive.scala  |  9 -
 6 files changed, 19 insertions(+), 27 deletions(-)
[spark] branch master updated: [SPARK-34506][CORE] ADD JAR with ivy coordinates should be compatible with Hive transitive behavior
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 0216051  [SPARK-34506][CORE] ADD JAR with ivy coordinates should be compatible with Hive transitive behavior
0216051 is described below

commit 0216051acadedcc7e9bcd840aa78776159b200d1
Author: Shardul Mahadik
AuthorDate: Mon Mar 1 09:10:20 2021 +0900

    [SPARK-34506][CORE] ADD JAR with ivy coordinates should be compatible with Hive transitive behavior

    ### What changes were proposed in this pull request?

    SPARK-33084 added the ability to use ivy coordinates with `SparkContext.addJar`. PR #29966 claims to mimic Hive behavior, although I found a few cases where it doesn't:

    1) The default value of the `transitive` parameter is false, both when the parameter is not specified in the coordinate and when its value is invalid. The Hive behavior is that transitive is [true if not specified](https://github.com/apache/hive/blob/cb2ac3dcc6af276c6f64ee00f034f082fe75222b/ql/src/java/org/apache/hadoop/hive/ql/util/DependencyResolver.java#L169) in the coordinate and [false for invalid values](https://github.com/apache/hive/blob/cb2ac3dcc6af276c6f64ee00f034f082fe752 [...]

    2) The value of the `transitive` parameter is regarded as case-sensitive [based on the understanding](https://github.com/apache/spark/pull/29966#discussion_r547752259) that Hive behavior is case-sensitive. However, this is not correct: Hive [treats the parameter value case-insensitively](https://github.com/apache/hive/blob/cb2ac3dcc6af276c6f64ee00f034f082fe75222b/ql/src/java/org/apache/hadoop/hive/ql/util/DependencyResolver.java#L122).

    I propose that we be compatible with Hive for these behaviors.

    ### Why are the changes needed?

    To make `ADD JAR` with ivy coordinates compatible with Hive's transitive behavior.

    ### Does this PR introduce _any_ user-facing change?

    The user-facing changes here are within master, as the feature introduced in SPARK-33084 has not been released yet.

    1. Previously, an ivy coordinate without the `transitive` parameter specified did not resolve transitive dependencies; now it does.
    2. Previously, a `transitive` parameter value was treated case-sensitively; e.g. `transitive=TRUE` would be treated as false because it did not exactly match `true`. Now it is treated case-insensitively.

    ### How was this patch tested?

    Modified existing unit tests to test the new behavior.
    Added a new unit test to cover usage of `exclude` with unspecified `transitive`.

    Closes #31623 from shardulm94/spark-34506.

    Authored-by: Shardul Mahadik
    Signed-off-by: Takeshi Yamamuro
---
 .../org/apache/spark/util/DependencyUtils.scala    | 14 +
 .../scala/org/apache/spark/SparkContextSuite.scala | 33 +++---
 docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md   |  2 +-
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala |  8 +++---
 .../spark/sql/hive/execution/HiveQuerySuite.scala  |  7 -
 5 files changed, 42 insertions(+), 22 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/util/DependencyUtils.scala b/core/src/main/scala/org/apache/spark/util/DependencyUtils.scala
index 60e866a..f7135edd 100644
--- a/core/src/main/scala/org/apache/spark/util/DependencyUtils.scala
+++ b/core/src/main/scala/org/apache/spark/util/DependencyUtils.scala
@@ -59,8 +59,9 @@ private[spark] object DependencyUtils extends Logging {
    * @param uri Ivy URI need to be downloaded.
    * @return Tuple value of parameter `transitive` and `exclude` value.
    *
-   * 1. transitive: whether to download dependency jar of Ivy URI, default value is false
-   *    and this parameter value is case-sensitive. Invalid value will be treat as false.
+   * 1. transitive: whether to download dependency jar of Ivy URI, default value is true
+   *    and this parameter value is case-insensitive. This mimics Hive's behaviour for
+   *    parsing the transitive parameter. Invalid value will be treat as false.
    *    Example: Input:  exclude=org.mortbay.jetty:jetty&transitive=true
    *    Output:  true
    *
@@ -72,7 +73,7 @@ private[spark] object DependencyUtils extends Logging {
   private def parseQueryParams(uri: URI): (Boolean, String) = {
     val uriQuery = uri.getQuery
     if (uriQuery == null) {
-      (false, "")
+      (true, "")
     } else {
       val mapTokens = uriQuery.split("&").map(_.split("="))
       if (mapTokens.exists(isInvalidQueryString)) {
@@ -81,14 +82,15 @@ private[spark] object DependencyUtils extends Logging {
       }
       val groupedParams = mapTokens.map(kv => (kv(0), kv(1))).groupBy(_._1)
-      // Parse transitive paramet
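The documented semantics of the `transitive` query parameter — default `true` when absent (the Hive-compatible behaviour this change introduces), case-insensitive parsing, and invalid values treated as `false` — can be sketched stand-alone. The function below mirrors the behaviour described above, not the exact `DependencyUtils.parseQueryParams` code.

```scala
// Stand-alone model of the transitive-parameter behaviour; illustrative only.
// `query` is the raw query string of an ivy:// URI, e.g.
// "exclude=org.mortbay.jetty:jetty&transitive=true", or None when absent.
def parseTransitive(query: Option[String]): Boolean =
  query match {
    case None => true // no query string: Hive-compatible default is transitive
    case Some(q) =>
      q.split("&").map(_.split("=", 2)).collectFirst {
        // case-insensitive compare; anything other than "true" counts as false
        case Array("transitive", value) => value.equalsIgnoreCase("true")
      }.getOrElse(true) // parameter not present in the query string
  }

assert(parseTransitive(None))                                    // default is true
assert(parseTransitive(Some("transitive=TRUE")))                 // case-insensitive
assert(!parseTransitive(Some("transitive=foo")))                 // invalid -> false
assert(parseTransitive(Some("exclude=org.mortbay.jetty:jetty"))) // unspecified -> true
```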
[spark] branch master updated (5a48eb8 -> d07fc30)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 5a48eb8  [SPARK-34415][ML] Python example
add  d07fc30  [SPARK-33687][SQL] Support analyze all tables in a specific database

No new revisions were added by this update.

Summary of changes:
 docs/_data/menu-sql.yaml                           |   2 +
 docs/sql-ref-syntax-aux-analyze-table.md           |   6 +-
 docs/sql-ref-syntax-aux-analyze-tables.md          | 110 +
 docs/sql-ref-syntax-aux-analyze.md                 |   1 +
 docs/sql-ref-syntax.md                             |   1 +
 .../apache/spark/sql/catalyst/parser/SqlBase.g4    |   2 +
 .../spark/sql/catalyst/analysis/Analyzer.scala     |   2 +
 .../spark/sql/catalyst/parser/AstBuilder.scala     |  19
 .../sql/catalyst/plans/logical/v2Commands.scala    |   9 ++
 .../spark/sql/catalyst/parser/DDLParserSuite.scala |   9 ++
 .../catalyst/analysis/ResolveSessionCatalog.scala  |   3 +
 .../execution/command/AnalyzeTableCommand.scala    |  36 +--
 .../{cache.scala => AnalyzeTablesCommand.scala}    |  22 -
 .../spark/sql/execution/command/CommandUtils.scala |  37 ++-
 .../spark/sql/StatisticsCollectionSuite.scala      |  37 +++
 15 files changed, 257 insertions(+), 39 deletions(-)
 create mode 100644 docs/sql-ref-syntax-aux-analyze-tables.md
 copy sql/core/src/main/scala/org/apache/spark/sql/execution/command/{cache.scala => AnalyzeTablesCommand.scala} (60%)
[spark] branch master updated (56e664c -> 05069ff)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 56e664c  [SPARK-34392][SQL] Support ZoneOffset +h:mm in DateTimeUtils.getZoneId
add  05069ff  [SPARK-34353][SQL] CollectLimitExec avoid shuffle if input rdd has 0/1 partition

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/limit.scala     | 76 +-
 .../sql/execution/TakeOrderedAndProjectSuite.scala | 56 +---
 2 files changed, 78 insertions(+), 54 deletions(-)
[spark] branch master updated: [SPARK-33971][SQL] Eliminate distinct from more aggregates
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 67ec4f7  [SPARK-33971][SQL] Eliminate distinct from more aggregates
67ec4f7 is described below

commit 67ec4f7f67dc494c2619b7faf1b1145f2200b65c
Author: tanel.k...@gmail.com
AuthorDate: Fri Feb 26 21:59:02 2021 +0900

    [SPARK-33971][SQL] Eliminate distinct from more aggregates

    ### What changes were proposed in this pull request?

    Add more aggregate expressions to the `EliminateDistinct` rule.

    ### Why are the changes needed?

    Distinct aggregation can add a significant overhead. It's better to remove distinct whenever possible.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    UT

    Closes #30999 from tanelk/SPARK-33971_eliminate_distinct.

    Authored-by: tanel.k...@gmail.com
    Signed-off-by: Takeshi Yamamuro
---
 .../spark/sql/catalyst/optimizer/Optimizer.scala | 16 ++---
 .../optimizer/EliminateDistinctSuite.scala       | 41 +++---
 2 files changed, 32 insertions(+), 25 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
index 717770f..cb24180 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
@@ -352,11 +352,17 @@ abstract class Optimizer(catalogManager: CatalogManager)
  */
 object EliminateDistinct extends Rule[LogicalPlan] {
   override def apply(plan: LogicalPlan): LogicalPlan = plan transformExpressions {
-    case ae: AggregateExpression if ae.isDistinct =>
-      ae.aggregateFunction match {
-        case _: Max | _: Min => ae.copy(isDistinct = false)
-        case _ => ae
-      }
+    case ae: AggregateExpression if ae.isDistinct &&
+      isDuplicateAgnostic(ae.aggregateFunction) =>
+      ae.copy(isDistinct = false)
+  }
+
+  private def isDuplicateAgnostic(af: AggregateFunction): Boolean = af match {
+    case _: Max => true
+    case _: Min => true
+    case _: BitAndAgg => true
+    case _: BitOrAgg => true
+    case _: CollectSet => true
+    case _ => false
   }
 }

diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/EliminateDistinctSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/EliminateDistinctSuite.scala
index 51c7519..0848d56 100644
--- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/EliminateDistinctSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/EliminateDistinctSuite.scala
@@ -18,6 +18,8 @@ package org.apache.spark.sql.catalyst.optimizer

 import org.apache.spark.sql.catalyst.dsl.expressions._
 import org.apache.spark.sql.catalyst.dsl.plans._
+import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.catalyst.expressions.aggregate._
 import org.apache.spark.sql.catalyst.plans.PlanTest
 import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan}
 import org.apache.spark.sql.catalyst.rules.RuleExecutor
@@ -32,25 +34,24 @@ class EliminateDistinctSuite extends PlanTest {

   val testRelation = LocalRelation('a.int)

-  test("Eliminate Distinct in Max") {
-    val query = testRelation
-      .select(maxDistinct('a).as('result))
-      .analyze
-    val answer = testRelation
-      .select(max('a).as('result))
-      .analyze
-    assert(query != answer)
-    comparePlans(Optimize.execute(query), answer)
-  }
-
-  test("Eliminate Distinct in Min") {
-    val query = testRelation
-      .select(minDistinct('a).as('result))
-      .analyze
-    val answer = testRelation
-      .select(min('a).as('result))
-      .analyze
-    assert(query != answer)
-    comparePlans(Optimize.execute(query), answer)
+  Seq(
+    Max(_),
+    Min(_),
+    BitAndAgg(_),
+    BitOrAgg(_),
+    CollectSet(_: Expression)
+  ).foreach {
+    aggBuilder =>
+      val agg = aggBuilder('a)
+      test(s"Eliminate Distinct in ${agg.prettyName}") {
+        val query = testRelation
+          .select(agg.toAggregateExpression(isDistinct = true).as('result))
+          .analyze
+        val answer = testRelation
+          .select(agg.toAggregateExpression(isDistinct = false).as('result))
+          .analyze
+        assert(query != answer)
+        comparePlans(Optimize.execute(query), answer)
+      }
   }
 }
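The safety argument behind the rule is that a "duplicate-agnostic" aggregate returns the same result whether or not its input is de-duplicated, so the `DISTINCT` modifier can be dropped without changing the answer. This is easy to check on plain Scala collections (standing in for Catalyst aggregates):

```scala
val xs = Seq(3, 1, 3, 7, 1)

// max/min are duplicate-agnostic: de-duplicating the input changes nothing.
assert(xs.max == xs.distinct.max)
assert(xs.min == xs.distinct.min)
// So are bitwise AND/OR (idempotent: x | x == x, x & x == x) ...
assert(xs.reduce(_ | _) == xs.distinct.reduce(_ | _))
assert(xs.reduce(_ & _) == xs.distinct.reduce(_ & _))
// ... and collect_set, which de-duplicates by definition.
assert(xs.toSet == xs.distinct.toSet)

// sum is NOT duplicate-agnostic, which is why it is absent from the rule:
// SUM(DISTINCT a) and SUM(a) genuinely differ on inputs with duplicates.
assert(xs.sum != xs.distinct.sum)
```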
[spark] branch master updated (e6753c9 -> 0a37a95)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from e6753c9  [SPARK-33995][SQL] Expose make_interval as a Scala function
add  0a37a95  [SPARK-31816][SQL][DOCS] Added high level description about JDBC connection providers for users/developers

No new revisions were added by this update.

Summary of changes:
 docs/sql-data-sources-jdbc.md                      | 14
 ...rg.apache.spark.sql.jdbc.JdbcConnectionProvider |  1 +
 .../sql/jdbc/ExampleJdbcConnectionProvider.scala   | 19 +++--
 .../main/scala/org/apache/spark/sql/jdbc/README.md | 84 ++
 4 files changed, 108 insertions(+), 10 deletions(-)
 create mode 100644 examples/src/main/resources/META-INF/services/org.apache.spark.sql.jdbc.JdbcConnectionProvider
 copy sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/IntentionallyFaultyConnectionProvider.scala => examples/src/main/scala/org/apache/spark/examples/sql/jdbc/ExampleJdbcConnectionProvider.scala (70%)
 create mode 100644 sql/core/src/main/scala/org/apache/spark/sql/jdbc/README.md
[spark] branch master updated (3e12e9d -> f79305a)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 3e12e9d  [SPARK-34238][SQL][FOLLOW_UP] SHOW PARTITIONS Keep consistence with other `SHOW` command
add  f79305a  [SPARK-34311][SQL] PostgresDialect can't treat arrays of some types

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/jdbc/PostgresIntegrationSuite.scala | 45 +-
 .../apache/spark/sql/jdbc/PostgresDialect.scala   |  5 ++-
 2 files changed, 47 insertions(+), 3 deletions(-)
[spark] branch master updated (76baaf7 -> 1f4135c)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 76baaf7  [SPARK-32985][SQL] Decouple bucket scan and bucket filter pruning for data source v1
add  1f4135c  [SPARK-34350][SQL][TESTS] replace withTimeZone defined in OracleIntegrationSuite with DateTimeTestUtils.withDefaultTimeZone

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/jdbc/OracleIntegrationSuite.scala | 22 +++---
 1 file changed, 3 insertions(+), 19 deletions(-)
[spark] branch master updated (361d702 -> 55399eb)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 361d702  [SPARK-34359][SQL] Add a legacy config to restore the output schema of SHOW DATABASES
add  55399eb  [SPARK-34343][SQL][TESTS] Add missing test for some non-array types in PostgreSQL

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/jdbc/PostgresIntegrationSuite.scala | 75 --
 1 file changed, 69 insertions(+), 6 deletions(-)
[spark] branch master updated (5acc5b8 -> 66f3480)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 5acc5b8  [SPARK-34323][BUILD] Upgrade zstd-jni to 1.4.8-3
add  66f3480  [SPARK-34318][SQL] Dataset.colRegex should work with column names and qualifiers which contain newlines

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/catalyst/parser/ParserUtils.scala | 4 ++--
 .../src/test/scala/org/apache/spark/sql/DataFrameSuite.scala     | 9 +
 2 files changed, 11 insertions(+), 2 deletions(-)