[spark] branch branch-3.0 updated: [SPARK-37860][UI] Fix taskindex in the stage page task event timeline

2022-01-10 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new 755d11d  [SPARK-37860][UI] Fix taskindex in the stage page task event timeline
755d11d is described below

commit 755d11d0d1479f5441c6ead2cc6142bab45d6e16
Author: stczwd 
AuthorDate: Tue Jan 11 15:23:12 2022 +0900

[SPARK-37860][UI] Fix taskindex in the stage page task event timeline

### What changes were proposed in this pull request?
This reverts commit 450b415028c3b00f3a002126cd11318d3932e28f.

### Why are the changes needed?
In #32888, shahidki31 changed taskInfo.index to taskInfo.taskId. However, we generally use `index.attempt` or `taskId` to distinguish tasks within a stage, not `taskId.attempt`.
Thus #32888 was an incorrect fix, and we should revert it.
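
As a hedged illustration of that convention (a minimal sketch, not the actual StagePage code; the method name is an assumption):

```scala
// Sketch only: a task on the stage timeline is labeled by the stage-local
// partition `index` plus its `attempt` number. `taskId` is unique across the
// whole application, so "taskId (attempt n)" would mislabel retried tasks.
def tooltipTitle(index: Int, attempt: Int): String =
  s"Task $index (attempt $attempt)"  // e.g. "Task 3 (attempt 1)"
```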

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
Existing test suites.

Closes #35160 from stczwd/SPARK-37860.

Authored-by: stczwd 
Signed-off-by: Kousuke Saruta 
(cherry picked from commit 3d2fde5242c8989688c176b8ed5eb0bff5e1f17f)
Signed-off-by: Kousuke Saruta 
---
 core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala b/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
index e9eb62e..ccaa70b 100644
--- a/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
+++ b/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
@@ -352,7 +352,7 @@ private[ui] class StagePage(parent: StagesTab, store: AppStatusStore) extends WebUIPage("stage") {
          |'content': '
-         |data-title="${s"Task " + taskId + " (attempt " + attempt + ")"}
+         |data-title="${s"Task " + index + " (attempt " + attempt + ")"}
          |Status: ${taskInfo.status}
          |Launch Time: ${UIUtils.formatDate(new Date(launchTime))}
          |${

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-37860][UI] Fix taskindex in the stage page task event timeline

2022-01-10 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 830d5b6  [SPARK-37860][UI] Fix taskindex in the stage page task event timeline
830d5b6 is described below

commit 830d5b650ce9ac00f2a64bbf3e7fe9d31b02e51d
Author: stczwd 
AuthorDate: Tue Jan 11 15:23:12 2022 +0900

[SPARK-37860][UI] Fix taskindex in the stage page task event timeline

### What changes were proposed in this pull request?
This reverts commit 450b415028c3b00f3a002126cd11318d3932e28f.

### Why are the changes needed?
In #32888, shahidki31 changed taskInfo.index to taskInfo.taskId. However, we generally use `index.attempt` or `taskId` to distinguish tasks within a stage, not `taskId.attempt`.
Thus #32888 was an incorrect fix, and we should revert it.

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
Existing test suites.

Closes #35160 from stczwd/SPARK-37860.

Authored-by: stczwd 
Signed-off-by: Kousuke Saruta 
(cherry picked from commit 3d2fde5242c8989688c176b8ed5eb0bff5e1f17f)
Signed-off-by: Kousuke Saruta 
---
 core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala b/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
index 459e09a..47ba951 100644
--- a/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
+++ b/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
@@ -355,7 +355,7 @@ private[ui] class StagePage(parent: StagesTab, store: AppStatusStore) extends WebUIPage("stage") {
          |'content': '
-         |data-title="${s"Task " + taskId + " (attempt " + attempt + ")"}
+         |data-title="${s"Task " + index + " (attempt " + attempt + ")"}
          |Status: ${taskInfo.status}
          |Launch Time: ${UIUtils.formatDate(new Date(launchTime))}
          |${

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.2 updated: [SPARK-37860][UI] Fix taskindex in the stage page task event timeline

2022-01-10 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new db1023c  [SPARK-37860][UI] Fix taskindex in the stage page task event timeline
db1023c is described below

commit db1023c728c5e0bdcd4ef457cf5f7ba4f13cb79d
Author: stczwd 
AuthorDate: Tue Jan 11 15:23:12 2022 +0900

[SPARK-37860][UI] Fix taskindex in the stage page task event timeline

### What changes were proposed in this pull request?
This reverts commit 450b415028c3b00f3a002126cd11318d3932e28f.

### Why are the changes needed?
In #32888, shahidki31 changed taskInfo.index to taskInfo.taskId. However, we generally use `index.attempt` or `taskId` to distinguish tasks within a stage, not `taskId.attempt`.
Thus #32888 was an incorrect fix, and we should revert it.

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
Existing test suites.

Closes #35160 from stczwd/SPARK-37860.

Authored-by: stczwd 
Signed-off-by: Kousuke Saruta 
(cherry picked from commit 3d2fde5242c8989688c176b8ed5eb0bff5e1f17f)
Signed-off-by: Kousuke Saruta 
---
 core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala b/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
index 81dfe83..777a6b0 100644
--- a/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
+++ b/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala
@@ -355,7 +355,7 @@ private[ui] class StagePage(parent: StagesTab, store: AppStatusStore) extends WebUIPage("stage") {
          |'content': '
-         |data-title="${s"Task " + taskId + " (attempt " + attempt + ")"}
+         |data-title="${s"Task " + index + " (attempt " + attempt + ")"}
          |Status: ${taskInfo.status}
          |Launch Time: ${UIUtils.formatDate(new Date(launchTime))}
          |${

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (7463564 -> 3d2fde5)

2022-01-10 Thread sarutak
This is an automated email from the ASF dual-hosted git repository.

sarutak pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 7463564  [SPARK-37847][CORE][SHUFFLE] PushBlockStreamCallback#isStale should check null to avoid NPE
 add 3d2fde5  [SPARK-37860][UI] Fix taskindex in the stage page task event timeline

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-37847][CORE][SHUFFLE] PushBlockStreamCallback#isStale should check null to avoid NPE

2022-01-10 Thread mridulm80
This is an automated email from the ASF dual-hosted git repository.

mridulm80 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 7463564  [SPARK-37847][CORE][SHUFFLE] PushBlockStreamCallback#isStale should check null to avoid NPE
7463564 is described below

commit 74635649b230fd76199efe3da4bd3e4112894c4a
Author: Cheng Pan 
AuthorDate: Mon Jan 10 23:04:39 2022 -0600

[SPARK-37847][CORE][SHUFFLE] PushBlockStreamCallback#isStale should check null to avoid NPE

### What changes were proposed in this pull request?

Check `null` in `isStale` to avoid NPE.

### Why are the changes needed?

There is a chance that a late push-shuffle block request invokes `PushBlockStreamCallback#onData` after the merged partition has been finalized, which causes an NPE.

```
2022-01-07 21:06:14,464 INFO shuffle.RemoteBlockPushResolver: shuffle partition application_1640143179334_0149_-1 102 6922, chunk_size=1, meta_length=138, data_length=112632
2022-01-07 21:06:14,615 ERROR shuffle.RemoteBlockPushResolver: Encountered issue when merging shufflePush_102_0_279_6922
java.lang.NullPointerException
    at org.apache.spark.network.shuffle.RemoteBlockPushResolver$AppShuffleMergePartitionsInfo.access$200(RemoteBlockPushResolver.java:1017)
    at org.apache.spark.network.shuffle.RemoteBlockPushResolver$PushBlockStreamCallback.isStale(RemoteBlockPushResolver.java:806)
    at org.apache.spark.network.shuffle.RemoteBlockPushResolver$PushBlockStreamCallback.onData(RemoteBlockPushResolver.java:840)
    at org.apache.spark.network.server.TransportRequestHandler$3.onData(TransportRequestHandler.java:209)
    at org.apache.spark.network.client.StreamInterceptor.handle(StreamInterceptor.java:79)
    at org.apache.spark.network.util.TransportFrameDecoder.feedInterceptor(TransportFrameDecoder.java:263)
    at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:87)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at org.sparkproject.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at org.sparkproject.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at org.sparkproject.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at org.sparkproject.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
    at org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
    at org.sparkproject.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
    at org.sparkproject.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
    at org.sparkproject.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
    at org.sparkproject.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at org.sparkproject.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)
```

`isTooLate` checks for null but `isStale` does not, so `isTooLate` should be checked first to avoid the NPE:
```java
private boolean isTooLate(
    AppShuffleMergePartitionsInfo appShuffleMergePartitionsInfo,
    int reduceId) {
  return null == appShuffleMergePartitionsInfo ||
      INDETERMINATE_SHUFFLE_FINALIZED == appShuffleMergePartitionsInfo.shuffleMergePartitions ||
      !appShuffleMergePartitionsInfo.shuffleMergePartitions.containsKey(reduceId);
}
```
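
For contrast, a minimal sketch of what a null-safe `isStale` could look like (an illustrative assumption, not the actual patch, which simply checks `isTooLate` first; the `attemptId` comparison is assumed):

```java
// Hypothetical sketch: guard against the finalized/removed-partition case by
// checking for null before dereferencing, mirroring isTooLate above.
private boolean isStale(
    AppShuffleMergePartitionsInfo appShuffleMergePartitionsInfo,
    int attemptId) {
  return null == appShuffleMergePartitionsInfo ||
      appShuffleMergePartitionsInfo.attemptId != attemptId;
}
```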

### Does this PR introduce _any_ user-facing change?
Bug fix, to avoid an NPE in the YARN external shuffle service.

### How was this patch tested?
I don't think it's easy to write a unit test for this issue based on 
current code, since it's a minor change, use exsiting ut to ensue the change 
doesn't break the current functionalities.

Closes #35146 from pan3793/SPARK-37847.

Authored-by: Cheng Pan 
Signed-off-by: Mridul Muralidharan 

[spark] branch master updated (f9fbf3b -> 9c12c37)

2022-01-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from f9fbf3b  [SPARK-37851][SQL][TESTS] Mark org.apache.spark.sql.hive.execution as slow tests
 add 9c12c37  [SPARK-37853][CORE] Clean up deprecation compilation warning related to log4j2

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/util/logging/DriverLogger.scala  | 19 +++
 .../test/scala/org/apache/spark/SparkFunSuite.scala   |  6 +++---
 2 files changed, 18 insertions(+), 7 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (bd24b48 -> f9fbf3b)

2022-01-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from bd24b48  [SPARK-37852][PYTHON][INFRA] Enable flake's E741 rule in PySpark
 add f9fbf3b  [SPARK-37851][SQL][TESTS] Mark org.apache.spark.sql.hive.execution as slow tests

No new revisions were added by this update.

Summary of changes:
 .../src/test/scala/org/apache/spark/sql/PlanStabilitySuite.scala  | 8 
 sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala  | 2 ++
 .../src/test/scala/org/apache/spark/sql/TPCDSQueryTestSuite.scala | 2 ++
 .../apache/spark/sql/hive/execution/AggregationQuerySuite.scala   | 1 +
 .../apache/spark/sql/hive/execution/BigDataBenchmarkSuite.scala   | 3 +++
 .../org/apache/spark/sql/hive/execution/HiveCommandSuite.scala| 2 ++
 .../org/apache/spark/sql/hive/execution/HiveExplainSuite.scala| 2 ++
 .../scala/org/apache/spark/sql/hive/execution/HivePlanTest.scala  | 2 ++
 .../org/apache/spark/sql/hive/execution/HiveResolutionSuite.scala | 2 ++
 .../org/apache/spark/sql/hive/execution/HiveSQLViewSuite.scala| 2 ++
 .../spark/sql/hive/execution/HiveScriptTransformationSuite.scala  | 2 ++
 .../apache/spark/sql/hive/execution/HiveSerDeReadWriteSuite.scala | 2 ++
 .../org/apache/spark/sql/hive/execution/HiveSerDeSuite.scala  | 2 ++
 .../org/apache/spark/sql/hive/execution/HiveTableScanSuite.scala  | 2 ++
 .../apache/spark/sql/hive/execution/HiveTypeCoercionSuite.scala   | 2 ++
 .../scala/org/apache/spark/sql/hive/execution/HiveUDAFSuite.scala | 2 ++
 .../scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala  | 2 ++
 .../spark/sql/hive/execution/ObjectHashAggregateSuite.scala   | 2 ++
 .../spark/sql/hive/execution/PruneHiveTablePartitionsSuite.scala  | 2 ++
 .../scala/org/apache/spark/sql/hive/execution/PruningSuite.scala  | 2 ++
 .../org/apache/spark/sql/hive/execution/SQLMetricsSuite.scala | 2 ++
 .../scala/org/apache/spark/sql/hive/execution/UDAQuerySuite.scala | 3 +++
 .../org/apache/spark/sql/hive/execution/WindowQuerySuite.scala| 2 ++
 23 files changed, 53 insertions(+)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] 01/01: Preparing development version 3.1.4-SNAPSHOT

2022-01-10 Thread holden
This is an automated email from the ASF dual-hosted git repository.

holden pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

commit 1a275396da176d9b0436066f0173ec24bc128ab9
Author: Holden Karau 
AuthorDate: Mon Jan 10 21:23:44 2022 +

Preparing development version 3.1.4-SNAPSHOT
---
 R/pkg/DESCRIPTION  | 2 +-
 assembly/pom.xml   | 2 +-
 common/kvstore/pom.xml | 2 +-
 common/network-common/pom.xml  | 2 +-
 common/network-shuffle/pom.xml | 2 +-
 common/network-yarn/pom.xml| 2 +-
 common/sketch/pom.xml  | 2 +-
 common/tags/pom.xml| 2 +-
 common/unsafe/pom.xml  | 2 +-
 core/pom.xml   | 2 +-
 docs/_config.yml   | 6 +++---
 examples/pom.xml   | 2 +-
 external/avro/pom.xml  | 2 +-
 external/docker-integration-tests/pom.xml  | 2 +-
 external/kafka-0-10-assembly/pom.xml   | 2 +-
 external/kafka-0-10-sql/pom.xml| 2 +-
 external/kafka-0-10-token-provider/pom.xml | 2 +-
 external/kafka-0-10/pom.xml| 2 +-
 external/kinesis-asl-assembly/pom.xml  | 2 +-
 external/kinesis-asl/pom.xml   | 2 +-
 external/spark-ganglia-lgpl/pom.xml| 2 +-
 graphx/pom.xml | 2 +-
 hadoop-cloud/pom.xml   | 2 +-
 launcher/pom.xml   | 2 +-
 mllib-local/pom.xml| 2 +-
 mllib/pom.xml  | 2 +-
 pom.xml| 2 +-
 python/pyspark/version.py  | 2 +-
 repl/pom.xml   | 2 +-
 resource-managers/kubernetes/core/pom.xml  | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml| 2 +-
 resource-managers/yarn/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive-thriftserver/pom.xml  | 2 +-
 sql/hive/pom.xml   | 2 +-
 streaming/pom.xml  | 2 +-
 tools/pom.xml  | 2 +-
 39 files changed, 41 insertions(+), 41 deletions(-)

diff --git a/R/pkg/DESCRIPTION b/R/pkg/DESCRIPTION
index d42841c..db689ea 100644
--- a/R/pkg/DESCRIPTION
+++ b/R/pkg/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: SparkR
 Type: Package
-Version: 3.1.3
+Version: 3.1.4
 Title: R Front End for 'Apache Spark'
 Description: Provides an R Front end for 'Apache Spark' <https://spark.apache.org>.
 Authors@R: c(person("Shivaram", "Venkataraman", role = "aut",
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 95309ee..8e2cded 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3</version>
+    <version>3.1.4-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index 80f6630..baf20e6 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3</version>
+    <version>3.1.4-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index ead7a4f..1a9c791 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3</version>
+    <version>3.1.4-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml
index 232abd7..8a188a3 100644
--- a/common/network-shuffle/pom.xml
+++ b/common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3</version>
+    <version>3.1.4-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml
index 79068a6..d53f21d 100644
--- a/common/network-yarn/pom.xml
+++ b/common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3</version>
+    <version>3.1.4-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml
index 2030f85..53b7312 100644
--- a/common/sketch/pom.xml
+++ b/common/sketch/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3</version>
+    <version>3.1.4-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/tags/pom.xml b/common/tags/pom.xml
index 798c

[spark] branch branch-3.1 updated (94a69ff -> 1a27539)

2022-01-10 Thread holden
This is an automated email from the ASF dual-hosted git repository.

holden pushed a change to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 94a69ff  Revert "[SPARK-37779][SQL] Make ColumnarToRowExec plan canonicalizable after (de)serialization"
 add df89eb2  Preparing Spark release v3.1.3-rc1
 new 1a27539  Preparing development version 3.1.4-SNAPSHOT

The 1 revision listed above as "new" is entirely new to this repository and will be described in a separate email. The revisions listed as "add" were already present in the repository and have only been added to this reference.


Summary of changes:
 R/pkg/DESCRIPTION  | 2 +-
 assembly/pom.xml   | 2 +-
 common/kvstore/pom.xml | 2 +-
 common/network-common/pom.xml  | 2 +-
 common/network-shuffle/pom.xml | 2 +-
 common/network-yarn/pom.xml| 2 +-
 common/sketch/pom.xml  | 2 +-
 common/tags/pom.xml| 2 +-
 common/unsafe/pom.xml  | 2 +-
 core/pom.xml   | 2 +-
 docs/_config.yml   | 6 +++---
 examples/pom.xml   | 2 +-
 external/avro/pom.xml  | 2 +-
 external/docker-integration-tests/pom.xml  | 2 +-
 external/kafka-0-10-assembly/pom.xml   | 2 +-
 external/kafka-0-10-sql/pom.xml| 2 +-
 external/kafka-0-10-token-provider/pom.xml | 2 +-
 external/kafka-0-10/pom.xml| 2 +-
 external/kinesis-asl-assembly/pom.xml  | 2 +-
 external/kinesis-asl/pom.xml   | 2 +-
 external/spark-ganglia-lgpl/pom.xml| 2 +-
 graphx/pom.xml | 2 +-
 hadoop-cloud/pom.xml   | 2 +-
 launcher/pom.xml   | 2 +-
 mllib-local/pom.xml| 2 +-
 mllib/pom.xml  | 2 +-
 pom.xml| 2 +-
 python/pyspark/version.py  | 2 +-
 repl/pom.xml   | 2 +-
 resource-managers/kubernetes/core/pom.xml  | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml| 2 +-
 resource-managers/yarn/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive-thriftserver/pom.xml  | 2 +-
 sql/hive/pom.xml   | 2 +-
 streaming/pom.xml  | 2 +-
 tools/pom.xml  | 2 +-
 39 files changed, 41 insertions(+), 41 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] tag v3.1.3-rc1 created (now df89eb2)

2022-01-10 Thread holden
This is an automated email from the ASF dual-hosted git repository.

holden pushed a change to tag v3.1.3-rc1
in repository https://gitbox.apache.org/repos/asf/spark.git.


  at df89eb2  (commit)
This tag includes the following new commits:

 new df89eb2  Preparing Spark release v3.1.3-rc1

The 1 revision listed above as "new" is entirely new to this repository and will be described in a separate email. The revisions listed as "add" were already present in the repository and have only been added to this reference.


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] 01/01: Preparing Spark release v3.1.3-rc1

2022-01-10 Thread holden
This is an automated email from the ASF dual-hosted git repository.

holden pushed a commit to tag v3.1.3-rc1
in repository https://gitbox.apache.org/repos/asf/spark.git

commit df89eb2d748c99665256d3ef297fef774db1014a
Author: Holden Karau 
AuthorDate: Mon Jan 10 21:23:40 2022 +

Preparing Spark release v3.1.3-rc1
---
 assembly/pom.xml   | 2 +-
 common/kvstore/pom.xml | 2 +-
 common/network-common/pom.xml  | 2 +-
 common/network-shuffle/pom.xml | 2 +-
 common/network-yarn/pom.xml| 2 +-
 common/sketch/pom.xml  | 2 +-
 common/tags/pom.xml| 2 +-
 common/unsafe/pom.xml  | 2 +-
 core/pom.xml   | 2 +-
 docs/_config.yml   | 2 +-
 examples/pom.xml   | 2 +-
 external/avro/pom.xml  | 2 +-
 external/docker-integration-tests/pom.xml  | 2 +-
 external/kafka-0-10-assembly/pom.xml   | 2 +-
 external/kafka-0-10-sql/pom.xml| 2 +-
 external/kafka-0-10-token-provider/pom.xml | 2 +-
 external/kafka-0-10/pom.xml| 2 +-
 external/kinesis-asl-assembly/pom.xml  | 2 +-
 external/kinesis-asl/pom.xml   | 2 +-
 external/spark-ganglia-lgpl/pom.xml| 2 +-
 graphx/pom.xml | 2 +-
 hadoop-cloud/pom.xml   | 2 +-
 launcher/pom.xml   | 2 +-
 mllib-local/pom.xml| 2 +-
 mllib/pom.xml  | 2 +-
 pom.xml| 2 +-
 python/pyspark/version.py  | 2 +-
 repl/pom.xml   | 2 +-
 resource-managers/kubernetes/core/pom.xml  | 2 +-
 resource-managers/kubernetes/integration-tests/pom.xml | 2 +-
 resource-managers/mesos/pom.xml| 2 +-
 resource-managers/yarn/pom.xml | 2 +-
 sql/catalyst/pom.xml   | 2 +-
 sql/core/pom.xml   | 2 +-
 sql/hive-thriftserver/pom.xml  | 2 +-
 sql/hive/pom.xml   | 2 +-
 streaming/pom.xml  | 2 +-
 tools/pom.xml  | 2 +-
 38 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/assembly/pom.xml b/assembly/pom.xml
index bf75946..95309ee 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3-SNAPSHOT</version>
+    <version>3.1.3</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/common/kvstore/pom.xml b/common/kvstore/pom.xml
index 777477b..80f6630 100644
--- a/common/kvstore/pom.xml
+++ b/common/kvstore/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3-SNAPSHOT</version>
+    <version>3.1.3</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index 2233100..ead7a4f 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3-SNAPSHOT</version>
+    <version>3.1.3</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/network-shuffle/pom.xml b/common/network-shuffle/pom.xml
index aa5b2d4..232abd7 100644
--- a/common/network-shuffle/pom.xml
+++ b/common/network-shuffle/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3-SNAPSHOT</version>
+    <version>3.1.3</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml
index d3cc674..79068a6 100644
--- a/common/network-yarn/pom.xml
+++ b/common/network-yarn/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3-SNAPSHOT</version>
+    <version>3.1.3</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/sketch/pom.xml b/common/sketch/pom.xml
index c54d485..2030f85 100644
--- a/common/sketch/pom.xml
+++ b/common/sketch/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3-SNAPSHOT</version>
+    <version>3.1.3</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/tags/pom.xml b/common/tags/pom.xml
index a02b306..798c88e 100644
--- a/common/tags/pom.xml
+++ b/common/tags/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3-SNAPSHOT</version>
+    <version>3.1.3</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/common/unsafe/pom.xml b/common/unsafe/pom.xml
index 64ee6ba..bd829c0 100644
--- a/common/unsafe/pom.xml
+++ b/common/unsafe/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.12</artifactId>
-    <version>3.1.3-SNAPSHOT</version>
+    <version>3.1.3</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/c

[spark] branch master updated: [SPARK-37852][PYTHON][INFRA] Enable flake's E741 rule in PySpark

2022-01-10 Thread zero323
This is an automated email from the ASF dual-hosted git repository.

zero323 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new bd24b48  [SPARK-37852][PYTHON][INFRA] Enable flake's E741 rule in PySpark
bd24b48 is described below

commit bd24b4884b804fc85a083f82b864823851d5980c
Author: Hyukjin Kwon 
AuthorDate: Mon Jan 10 22:09:45 2022 +0100

[SPARK-37852][PYTHON][INFRA] Enable flake's E741 rule in PySpark

### What changes were proposed in this pull request?

This PR enables flake8's [E741](https://www.flake8rules.com/rules/E741.html) rule in the PySpark codebase to comply with PEP 8 (https://www.python.org/dev/peps/pep-0008/#names-to-avoid).
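
For context, a tiny illustration of what E741 flags (hedged example, not from the patch):

```python
# E741 flags single-character names that are easily confused with digits or
# each other in many fonts: l (ell), O (oh), I (eye).
l = [1, 2, 3]       # flake8 reports: E741 ambiguous variable name 'l'
values = [1, 2, 3]  # fine: descriptive name
```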

### Why are the changes needed?

To comply with PEP 8.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

Existing test cases should cover.

Closes #35152 from HyukjinKwon/enable-E741.

Authored-by: Hyukjin Kwon 
Signed-off-by: zero323 
---
 dev/tox.ini|  9 ++---
 examples/src/main/python/sql/arrow.py  |  4 +-
 python/pyspark/ml/linalg/__init__.py   | 24 ++--
 python/pyspark/ml/tests/test_param.py  | 16 
 python/pyspark/mllib/linalg/__init__.py| 24 ++--
 python/pyspark/mllib/tests/test_linalg.py  |  4 +-
 python/pyspark/pandas/frame.py | 26 +++--
 python/pyspark/pandas/namespace.py |  4 +-
 python/pyspark/pandas/plot/matplotlib.py   |  4 +-
 python/pyspark/pandas/series.py| 12 +++---
 python/pyspark/rddsampler.py   |  4 +-
 .../pyspark/sql/tests/test_pandas_cogrouped_map.py | 20 +-
 python/pyspark/sql/tests/test_serde.py | 12 +++---
 python/pyspark/tests/test_conf.py  |  8 ++--
 python/pyspark/tests/test_rdd.py   |  8 ++--
 python/pyspark/tests/test_shuffle.py   | 44 +++---
 16 files changed, 112 insertions(+), 111 deletions(-)

diff --git a/dev/tox.ini b/dev/tox.ini
index 4047383..df4dfce 100644
--- a/dev/tox.ini
+++ b/dev/tox.ini
@@ -22,15 +22,14 @@ ignore =
     # 1. Type hints with def are treated as redefinition (e.g., functions.log).
     # 2. Some are used for testing.
     F811,
+    # There are too many instances to fix. Ignored for now.
+    W503,
+    W504,
 
     # Below rules should be enabled in the future.
     E731,
-    E741,
-    W503,
-    W504,
 per-file-ignores =
-    # F405 is ignored as shared.py is auto-generated.
-    # E501 can be removed after SPARK-37419.
+    # F405 and E501 are ignored as shared.py is auto-generated.
     python/pyspark/ml/param/shared.py: F405 E501,
     # Examples contain some unused variables.
     examples/src/main/python/sql/datasource.py: F841,
diff --git a/examples/src/main/python/sql/arrow.py b/examples/src/main/python/sql/arrow.py
index d30082b..298830c 100644
--- a/examples/src/main/python/sql/arrow.py
+++ b/examples/src/main/python/sql/arrow.py
@@ -258,8 +258,8 @@ def cogrouped_apply_in_pandas_example(spark):
         [(2101, 1, "x"), (2101, 2, "y")],
         ("time", "id", "v2"))
 
-    def asof_join(l, r):
-        return pd.merge_asof(l, r, on="time", by="id")
+    def asof_join(left, right):
+        return pd.merge_asof(left, right, on="time", by="id")
 
     df1.groupby("id").cogroup(df2.groupby("id")).applyInPandas(
         asof_join, schema="time int, id int, v1 double, v2 string").show()
diff --git a/python/pyspark/ml/linalg/__init__.py b/python/pyspark/ml/linalg/__init__.py
index f940132..b361925 100644
--- a/python/pyspark/ml/linalg/__init__.py
+++ b/python/pyspark/ml/linalg/__init__.py
@@ -65,20 +65,20 @@ except BaseException:
     _have_scipy = False
 
 
-def _convert_to_vector(l):
-    if isinstance(l, Vector):
-        return l
-    elif type(l) in (array.array, np.array, np.ndarray, list, tuple, range):
-        return DenseVector(l)
-    elif _have_scipy and scipy.sparse.issparse(l):
-        assert l.shape[1] == 1, "Expected column vector"
+def _convert_to_vector(d):
+    if isinstance(d, Vector):
+        return d
+    elif type(d) in (array.array, np.array, np.ndarray, list, tuple, range):
+        return DenseVector(d)
+    elif _have_scipy and scipy.sparse.issparse(d):
+        assert d.shape[1] == 1, "Expected column vector"
         # Make sure the converted csc_matrix has sorted indices.
-        csc = l.tocsc()
+        csc = d.tocsc()
         if not csc.has_sorted_indices:
             csc.sort_indices()
-        return SparseVector(l.shape[0], csc.indices, csc.data)
+        return SparseVector(d.shape[0], csc.indices, csc.data)
     else:
-        raise TypeError("Cannot convert type %s into Vector" % type(l))
+        raise TypeError("Cannot convert type %s into Vector" % type(d))

[spark] branch master updated: [SPARK-37527][SQL] Compile `COVAR_POP`, `COVAR_SAMP` and `CORR` in `H2Dialect`

2022-01-10 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 2ac69e8  [SPARK-37527][SQL] Compile `COVAR_POP`, `COVAR_SAMP` and `CORR` in `H2Dialect`
2ac69e8 is described below

commit 2ac69e8cb0231edd753b9a91ac6d2b07e1d10525
Author: Jiaan Geng 
AuthorDate: Mon Jan 10 21:47:55 2022 +0800

[SPARK-37527][SQL] Compile `COVAR_POP`, `COVAR_SAMP` and `CORR` in `H2Dialect`

### What changes were proposed in this pull request?
https://github.com/apache/spark/pull/35101 translated `COVAR_POP`, `COVAR_SAMP` and `CORR`, but lower H2 versions cannot support them.

After https://github.com/apache/spark/pull/35013, we can now compile these three aggregate functions in `H2Dialect`.
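
For illustration, a query of the kind that can now be pushed down in full (a hedged sketch; the table mirrors the `JDBCV2Suite` tests below):

```sql
-- With the H2Dialect translation in place, the aggregate below is compiled
-- to H2 SQL and executed on the H2 side instead of inside Spark.
SELECT CORR(bonus, bonus) FROM h2.test.employee WHERE dept > 0 GROUP BY dept;
```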

### Why are the changes needed?
This completes the implementation in `H2Dialect`.

### Does this PR introduce _any_ user-facing change?
'Yes'. Spark can now completely push down `COVAR_POP`, `COVAR_SAMP` and `CORR` into H2.

### How was this patch tested?
Test updated.

Closes #35145 from beliefer/SPARK-37527_followup.

Authored-by: Jiaan Geng 
Signed-off-by: Wenchen Fan 
---
 .../src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala | 12 
 .../test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala   | 12 
 2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala
index 087c357..1f422e5 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/jdbc/H2Dialect.scala
@@ -47,6 +47,18 @@ private object H2Dialect extends JdbcDialect {
           assert(f.inputs().length == 1)
           val distinct = if (f.isDistinct) "DISTINCT " else ""
           Some(s"STDDEV_SAMP($distinct${f.inputs().head})")
+        case f: GeneralAggregateFunc if f.name() == "COVAR_POP" =>
+          assert(f.inputs().length == 2)
+          val distinct = if (f.isDistinct) "DISTINCT " else ""
+          Some(s"COVAR_POP($distinct${f.inputs().head}, ${f.inputs().last})")
+        case f: GeneralAggregateFunc if f.name() == "COVAR_SAMP" =>
+          assert(f.inputs().length == 2)
+          val distinct = if (f.isDistinct) "DISTINCT " else ""
+          Some(s"COVAR_SAMP($distinct${f.inputs().head}, ${f.inputs().last})")
+        case f: GeneralAggregateFunc if f.name() == "CORR" =>
+          assert(f.inputs().length == 2)
+          val distinct = if (f.isDistinct) "DISTINCT " else ""
+          Some(s"CORR($distinct${f.inputs().head}, ${f.inputs().last})")
         case _ => None
       }
     )
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala b/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala
index f99e40a..c5e1a6a 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCV2Suite.scala
@@ -745,11 +745,13 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHelper {
     val df = sql("select COVAR_POP(bonus, bonus), COVAR_SAMP(bonus, bonus)" +
       " FROM h2.test.employee where dept > 0 group by DePt")
     checkFiltersRemoved(df)
-    checkAggregateRemoved(df, false)
+    checkAggregateRemoved(df)
     df.queryExecution.optimizedPlan.collect {
       case _: DataSourceV2ScanRelation =>
         val expected_plan_fragment =
-          "PushedFilters: [IsNotNull(DEPT), GreaterThan(DEPT,0)]"
+          "PushedAggregates: [COVAR_POP(BONUS, BONUS), COVAR_SAMP(BONUS, BONUS)], " +
+            "PushedFilters: [IsNotNull(DEPT), GreaterThan(DEPT,0)], " +
+            "PushedGroupByColumns: [DEPT]"
         checkKeywordsExistsInExplain(df, expected_plan_fragment)
     }
     checkAnswer(df, Seq(Row(1d, 2d), Row(2500d, 5000d), Row(0d, null)))
@@ -759,11 +761,13 @@ class JDBCV2Suite extends QueryTest with SharedSparkSession with ExplainSuiteHelper {
     val df = sql("select CORR(bonus, bonus) FROM h2.test.employee where dept > 0" +
       " group by DePt")
     checkFiltersRemoved(df)
-    checkAggregateRemoved(df, false)
+    checkAggregateRemoved(df)
     df.queryExecution.optimizedPlan.collect {
       case _: DataSourceV2ScanRelation =>
         val expected_plan_fragment =
-          "PushedFilters: [IsNotNull(DEPT), GreaterThan(DEPT,0)]"
+          "PushedAggregates: [CORR(BONUS, BONUS)], " +
+            "PushedFilters: [IsNotNull(DEPT), GreaterThan(DEPT,0)], " +
+            "PushedGroupByColumns: [DEPT]"
         checkKeywordsExistsInExplain(df, expected_plan_fragment)
     }
     checkAnswer(df, Seq(Row(1d), Row(1d), Row(null)))

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org

[spark] branch master updated: [SPARK-35442][SQL] Support propagate empty relation through aggregate/union

2022-01-10 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new ee4c4e5  [SPARK-35442][SQL] Support propagate empty relation through aggregate/union
ee4c4e5 is described below

commit ee4c4e5162f61b989a71f8f9d153845ee5e77a88
Author: ulysses-you 
AuthorDate: Mon Jan 10 20:07:01 2022 +0800

[SPARK-35442][SQL] Support propagate empty relation through aggregate/union

### What changes were proposed in this pull request?

- Add `LogicalQueryStage(_, agg: BaseAggregateExec)` check in `AQEPropagateEmptyRelation`
- Add `LeafNode` check in `PropagateEmptyRelationBase`, so we can eliminate `LogicalQueryStage` to `LocalRelation`
- Unify the `applyFunc` and `commonApplyFunc` in `PropagateEmptyRelationBase`

### Why are the changes needed?

The Aggregate in AQE is different from the others: its `LogicalQueryStage` looks like `LogicalQueryStage(Aggregate, BaseAggregate)`. We should handle this case specially.

Logically, if the Aggregate grouping expression is not empty, we can 
eliminate it safely.
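
For intuition, the SQL-level semantics this relies on (illustrative example; the table name is hypothetical):

```sql
-- A grouped aggregate over an empty input yields zero rows, so it can safely
-- be replaced by an empty relation...
SELECT dept, COUNT(*) FROM empty_tbl GROUP BY dept;  -- 0 rows
-- ...whereas an ungrouped aggregate still yields exactly one row and must be kept.
SELECT COUNT(*) FROM empty_tbl;                      -- 1 row: 0
```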

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

add new test in `AdaptiveQueryExecSuite`
- `Support propagate empty relation through aggregate`
- `Support propagate empty relation through union`

Closes #35149 from ulysses-you/SPARK-35442-GA-SPARK.

Authored-by: ulysses-you 
Signed-off-by: Wenchen Fan 
---
 .../optimizer/PropagateEmptyRelation.scala | 84 ++
 .../adaptive/AQEPropagateEmptyRelation.scala   | 21 +-
 .../adaptive/AdaptiveQueryExecSuite.scala  | 56 ++-
 3 files changed, 110 insertions(+), 51 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala
index 6ad0793..d02f12d 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/PropagateEmptyRelation.scala
@@ -28,14 +28,17 @@ import org.apache.spark.sql.catalyst.trees.TreePattern.{LOCAL_RELATION, TRUE_OR_FALSE_LITERAL}
 /**
  * The base class of two rules in the normal and AQE Optimizer. It simplifies query plans with
  * empty or non-empty relations:
- *  1. Binary-node Logical Plans
+ *  1. Higher-node Logical Plans
+ *     - Union with all empty children.
+ *  2. Binary-node Logical Plans
  *     - Join with one or two empty children (including Intersect/Except).
  *     - Left semi Join
  *       Right side is non-empty and condition is empty. Eliminate join to its left side.
  *     - Left anti join
  *       Right side is non-empty and condition is empty. Eliminate join to an empty
  *       [[LocalRelation]].
- *  2. Unary-node Logical Plans
+ *  3. Unary-node Logical Plans
+ *     - Project/Filter/Sample with all empty children.
  *     - Limit/Repartition with all empty children.
  *     - Aggregate with all empty children and at least one grouping expression.
  *     - Generate(Explode) with all empty children. Others like Hive UDTF may return results.
@@ -59,6 +62,31 @@ abstract class PropagateEmptyRelationBase extends Rule[LogicalPlan] with CastSupport {
     plan.output.map{ a => Alias(cast(Literal(null), a.dataType), a.name)(a.exprId) }
 
   protected def commonApplyFunc: PartialFunction[LogicalPlan, LogicalPlan] = {
+    case p: Union if p.children.exists(isEmpty) =>
+      val newChildren = p.children.filterNot(isEmpty)
+      if (newChildren.isEmpty) {
+        empty(p)
+      } else {
+        val newPlan = if (newChildren.size > 1) Union(newChildren) else newChildren.head
+        val outputs = newPlan.output.zip(p.output)
+        // the original Union may produce different output attributes than the new one so we alias
+        // them if needed
+        if (outputs.forall { case (newAttr, oldAttr) => newAttr.exprId == oldAttr.exprId }) {
+          newPlan
+        } else {
+          val newOutput = outputs.map { case (newAttr, oldAttr) =>
+            if (newAttr.exprId == oldAttr.exprId) {
+              newAttr
+            } else {
+              val newExplicitMetadata =
+                if (oldAttr.metadata != newAttr.metadata) Some(oldAttr.metadata) else None
+              Alias(newAttr, oldAttr.name)(oldAttr.exprId, explicitMetadata = newExplicitMetadata)
+            }
+          }
+          Project(newOutput, newPlan)
+        }
+      }
+
     // Joins on empty LocalRelations generated from streaming sources are not eliminated
     // as stateful streaming joins need to perform other state management operations other than
     // just processing the

[spark] branch branch-3.2 updated: [SPARK-37818][DOCS] Add option in the description document for show create table

2022-01-10 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.2 by this push:
 new 385b34d  [SPARK-37818][DOCS] Add option in the description document for show create table
385b34d is described below

commit 385b34d47a29d348378574fb2bd674f942da71f4
Author: PengLei 
AuthorDate: Mon Jan 10 19:40:02 2022 +0800

[SPARK-37818][DOCS] Add option in the description document for show create table

### What changes were proposed in this pull request?

Add options in the description document for the `SHOW CREATE TABLE` command.
https://user-images.githubusercontent.com/41178002/148747443-ecd6586f-e4c4-4ae4-8ea5-969896b7d416.png
https://user-images.githubusercontent.com/41178002/148747457-873bc0c3-08fa-4d31-89e7-b0372462.png

### Why are the changes needed?

[#discussion](https://github.com/apache/spark/pull/34719#discussion_r758189709)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
SKIP_API=1 SKIP_RDOC=1 SKIP_PYTHONDOC=1 SKIP_SCALADOC=1 bundle exec jekyll build

Closes #35107 from Peng-Lei/SPARK-37818.

Authored-by: PengLei 
Signed-off-by: Gengliang Wang 
(cherry picked from commit 2f70e4f84073ac75457263a2b3f3cb835ee63d49)
Signed-off-by: Gengliang Wang 
---
 docs/sql-ref-syntax-aux-show-create-table.md | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-aux-show-create-table.md 
b/docs/sql-ref-syntax-aux-show-create-table.md
index ae8c10e..83013b0 100644
--- a/docs/sql-ref-syntax-aux-show-create-table.md
+++ b/docs/sql-ref-syntax-aux-show-create-table.md
@@ -26,7 +26,7 @@ license: |
 ### Syntax
 
 ```sql
-SHOW CREATE TABLE table_identifier
+SHOW CREATE TABLE table_identifier [ AS SERDE ]
 ```
 
 ### Parameters
@@ -37,6 +37,10 @@ SHOW CREATE TABLE table_identifier
 
 **Syntax:** `[ database_name. ] table_name`
 
+* **AS SERDE**
+
+Generates Hive DDL for a Hive SerDe table.
+
 ### Examples
 
 ```sql
@@ -55,6 +59,25 @@ SHOW CREATE TABLE test;
     'prop1' = 'value1',
     'prop2' = 'value2')
 +----------------------------------------------------+
+
+SHOW CREATE TABLE test AS SERDE;
++----------------------------------------------------+
+|createtab_stmt                                      |
++----------------------------------------------------+
+|CREATE TABLE `default`.`test`(
+  `c` INT)
+ ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
+ WITH SERDEPROPERTIES (
+   'serialization.format' = ',',
+   'field.delim' = ',')
+ STORED AS
+   INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
+   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
+ TBLPROPERTIES (
+   'prop1' = 'value1',
+   'prop2' = 'value2',
+   'transient_lastDdlTime' = '1641800515')
++----------------------------------------------------+
 ```
 
 ### Related Statements

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated: [SPARK-37818][DOCS] Add option in the description document for show create table

2022-01-10 Thread gengliang
This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 2f70e4f  [SPARK-37818][DOCS] Add option in the description document for show create table
2f70e4f is described below

commit 2f70e4f84073ac75457263a2b3f3cb835ee63d49
Author: PengLei 
AuthorDate: Mon Jan 10 19:40:02 2022 +0800

[SPARK-37818][DOCS] Add option in the description document for show create table

### What changes were proposed in this pull request?

Add options in the description document for the `SHOW CREATE TABLE` command.
https://user-images.githubusercontent.com/41178002/148747443-ecd6586f-e4c4-4ae4-8ea5-969896b7d416.png
https://user-images.githubusercontent.com/41178002/148747457-873bc0c3-08fa-4d31-89e7-b0372462.png

### Why are the changes needed?

[#discussion](https://github.com/apache/spark/pull/34719#discussion_r758189709)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
SKIP_API=1 SKIP_RDOC=1 SKIP_PYTHONDOC=1 SKIP_SCALADOC=1 bundle exec jekyll build

Closes #35107 from Peng-Lei/SPARK-37818.

Authored-by: PengLei 
Signed-off-by: Gengliang Wang 
---
 docs/sql-ref-syntax-aux-show-create-table.md | 25 -
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/docs/sql-ref-syntax-aux-show-create-table.md 
b/docs/sql-ref-syntax-aux-show-create-table.md
index ae8c10e..83013b0 100644
--- a/docs/sql-ref-syntax-aux-show-create-table.md
+++ b/docs/sql-ref-syntax-aux-show-create-table.md
@@ -26,7 +26,7 @@ license: |
 ### Syntax
 
 ```sql
-SHOW CREATE TABLE table_identifier
+SHOW CREATE TABLE table_identifier [ AS SERDE ]
 ```
 
 ### Parameters
@@ -37,6 +37,10 @@ SHOW CREATE TABLE table_identifier
 
 **Syntax:** `[ database_name. ] table_name`
 
+* **AS SERDE**
+
+Generates Hive DDL for a Hive SerDe table.
+
 ### Examples
 
 ```sql
@@ -55,6 +59,25 @@ SHOW CREATE TABLE test;
     'prop1' = 'value1',
     'prop2' = 'value2')
 +----------------------------------------------------+
+
+SHOW CREATE TABLE test AS SERDE;
++----------------------------------------------------+
+|createtab_stmt                                      |
++----------------------------------------------------+
+|CREATE TABLE `default`.`test`(
+  `c` INT)
+ ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
+ WITH SERDEPROPERTIES (
+   'serialization.format' = ',',
+   'field.delim' = ',')
+ STORED AS
+   INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
+   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
+ TBLPROPERTIES (
+   'prop1' = 'value1',
+   'prop2' = 'value2',
+   'transient_lastDdlTime' = '1641800515')
++----------------------------------------------------+
 ```
 
 ### Related Statements

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org