[spark] branch master updated (6f4a2e4 -> 3995728)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6f4a2e4  [MINOR][ML] Fix confusing error message in VectorAssembler
     add 3995728  [SPARK-30968][BUILD] Upgrade aws-java-sdk-sts to 1.11.655

No new revisions were added by this update.

Summary of changes:
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-30968][BUILD] Upgrade aws-java-sdk-sts to 1.11.655
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new c6f718b  [SPARK-30968][BUILD] Upgrade aws-java-sdk-sts to 1.11.655

c6f718b is described below

commit c6f718b5f41fe76c9c6eedcf2e9684d4d291cb4d
Author: Dongjoon Hyun
AuthorDate: Thu Feb 27 17:05:56 2020 -0800

    [SPARK-30968][BUILD] Upgrade aws-java-sdk-sts to 1.11.655

    ### What changes were proposed in this pull request?

    This PR aims to upgrade `aws-java-sdk-sts` to `1.11.655`.

    ### Why are the changes needed?

    [SPARK-29677](https://github.com/apache/spark/pull/26333) upgrades the AWS Kinesis Client to 1.12.0 for Apache Spark 2.4.5 and 3.0.0. Since AWS Kinesis Client 1.12.0 uses AWS SDK 1.11.655, `aws-java-sdk-sts` should be consistent with the Kinesis client dependency.
    - https://github.com/awslabs/amazon-kinesis-client/releases/tag/v1.12.0

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    Pass the Jenkins.

    Closes #27720 from dongjoon-hyun/SPARK-30968.

    Authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 3995728c3ce9d85b0436c0220f957b9d9133d64a)
    Signed-off-by: Dongjoon Hyun
---
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pom.xml b/pom.xml
index b3750e4..9d36faf 100644
--- a/pom.xml
+++ b/pom.xml
@@ -149,7 +149,7 @@ hadoop2
 1.12.0
-1.11.271
+1.11.655
 0.12.8
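The pom.xml hunk above lost its XML tags in extraction. What the one-line change relies on is Maven's version-property pattern: every AWS SDK artifact references one shared property, so bumping a single value updates them all. The snippet below is an illustrative sketch only; the property and artifact wiring mirror common practice, and the exact property names in Spark's pom.xml may differ.

```xml
<!-- Sketch of the version-property pattern; property names are
     illustrative, not necessarily Spark's exact ones. -->
<properties>
  <aws.kinesis.client.version>1.12.0</aws.kinesis.client.version>
  <aws.java.sdk.version>1.11.655</aws.java.sdk.version>
</properties>

<!-- Each SDK artifact points at the shared property, so the upgrade
     is a one-line change to the property value. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-sts</artifactId>
  <version>${aws.java.sdk.version}</version>
</dependency>
```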
[spark] branch branch-2.4 updated: [SPARK-30968][BUILD] Upgrade aws-java-sdk-sts to 1.11.655
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 7574d99  [SPARK-30968][BUILD] Upgrade aws-java-sdk-sts to 1.11.655

7574d99 is described below

commit 7574d99e9c8c9d3e92b1f8269ae09a7b7f0cdbd0
Author: Dongjoon Hyun
AuthorDate: Thu Feb 27 17:05:56 2020 -0800

    [SPARK-30968][BUILD] Upgrade aws-java-sdk-sts to 1.11.655

    ### What changes were proposed in this pull request?

    This PR aims to upgrade `aws-java-sdk-sts` to `1.11.655`.

    ### Why are the changes needed?

    [SPARK-29677](https://github.com/apache/spark/pull/26333) upgrades the AWS Kinesis Client to 1.12.0 for Apache Spark 2.4.5 and 3.0.0. Since AWS Kinesis Client 1.12.0 uses AWS SDK 1.11.655, `aws-java-sdk-sts` should be consistent with the Kinesis client dependency.
    - https://github.com/awslabs/amazon-kinesis-client/releases/tag/v1.12.0

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    Pass the Jenkins.

    Closes #27720 from dongjoon-hyun/SPARK-30968.

    Authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 3995728c3ce9d85b0436c0220f957b9d9133d64a)
    Signed-off-by: Dongjoon Hyun
---
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pom.xml b/pom.xml
index 0741096..a162cd1 100644
--- a/pom.xml
+++ b/pom.xml
@@ -144,7 +144,7 @@ hadoop2
 1.12.0
-1.11.271
+1.11.655
 0.12.8
[spark] branch master updated (c0d4cc3 -> 1383bd4)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from c0d4cc3  [MINOR][SQL] Remove unnecessary MiMa excludes
     add 1383bd4  [SPARK-30970][K8S][CORE] Fix NPE while resolving k8s master url

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/util/Utils.scala     | 15 ++-
 .../src/test/scala/org/apache/spark/util/UtilsSuite.scala |  4
 2 files changed, 10 insertions(+), 9 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-30970][K8S][CORE] Fix NPE while resolving k8s master url
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new b8e9cdc  [SPARK-30970][K8S][CORE] Fix NPE while resolving k8s master url

b8e9cdc is described below

commit b8e9cdcd14dcda68dde0c646f58d10880332691e
Author: Kent Yao
AuthorDate: Fri Feb 28 00:01:20 2020 -0800

    [SPARK-30970][K8S][CORE] Fix NPE while resolving k8s master url

    ### What changes were proposed in this pull request?

    ```
    bin/spark-sql --master k8s:///https://kubernetes.docker.internal:6443 --conf spark.kubernetes.container.image=yaooqinn/spark:v2.4.4
    Exception in thread "main" java.lang.NullPointerException
        at org.apache.spark.util.Utils$.checkAndGetK8sMasterUrl(Utils.scala:2739)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:261)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:774)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    ```

    Although `k8s:///https://kubernetes.docker.internal:6443` is a wrong master URL, it should not throw an NPE.

    The `case null` will never be touched.
    https://github.com/apache/spark/blob/3f4060c340d6bac412e8819c4388ccba226efcf3/core/src/main/scala/org/apache/spark/util/Utils.scala#L2772-L2776

    ### Why are the changes needed?

    Bug fix.

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    Added a unit test case.

    Closes #27721 from yaooqinn/SPARK-30970.

    Authored-by: Kent Yao
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 1383bd459a834fb075c5b570338fab0886110df9)
    Signed-off-by: Dongjoon Hyun
---
 core/src/main/scala/org/apache/spark/util/Utils.scala     | 15 ++-
 .../src/test/scala/org/apache/spark/util/UtilsSuite.scala |  4
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 297cc5e..dde4323 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -2772,19 +2772,16 @@ private[spark] object Utils extends Logging {
     }

     val masterScheme = new URI(masterWithoutK8sPrefix).getScheme
-    val resolvedURL = masterScheme.toLowerCase(Locale.ROOT) match {
-      case "https" =>
+
+    val resolvedURL = Option(masterScheme).map(_.toLowerCase(Locale.ROOT)) match {
+      case Some("https") =>
         masterWithoutK8sPrefix
-      case "http" =>
+      case Some("http") =>
         logWarning("Kubernetes master URL uses HTTP instead of HTTPS.")
         masterWithoutK8sPrefix
-      case null =>
-        val resolvedURL = s"https://$masterWithoutK8sPrefix"
-        logInfo("No scheme specified for kubernetes master URL, so defaulting to https. Resolved " +
-          s"URL is $resolvedURL.")
-        resolvedURL
       case _ =>
-        throw new IllegalArgumentException("Invalid Kubernetes master scheme: " + masterScheme)
+        throw new IllegalArgumentException("Invalid Kubernetes master scheme: " + masterScheme +
+          " found in URL: " + masterWithoutK8sPrefix)
     }

     s"k8s://$resolvedURL"

diff --git a/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala b/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
index 8f8902e..f5e438b 100644
--- a/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
@@ -1243,6 +1243,10 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties with Logging {
     intercept[IllegalArgumentException] {
       Utils.checkAndGetK8sMasterUrl("k8s://foo://host:port")
     }
+
+    intercept[IllegalArgumentException] {
+      Utils.checkAndGetK8sMasterUrl("k8s:///https://host:port")
+    }
   }

   test("stringHalfWidth") {
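The heart of the fix is matching on `Option(masterScheme)` instead of the possibly-null scheme itself: `java.net.URI.getScheme` returns `null` for a scheme-less string such as `/https://host:port`, so the old `masterScheme.toLowerCase(...)` call blew up before the match could run. A self-contained sketch of the patched logic (`resolveK8sMasterUrl` is a stand-in for illustration, not Spark's actual `Utils.checkAndGetK8sMasterUrl`):

```scala
import java.net.URI
import java.util.Locale

// Stand-in for Utils.checkAndGetK8sMasterUrl, sketching the patched logic.
def resolveK8sMasterUrl(rawMasterURL: String): String = {
  require(rawMasterURL.startsWith("k8s://"),
    s"Kubernetes master URL must start with k8s://, got $rawMasterURL")
  val masterWithoutK8sPrefix = rawMasterURL.substring("k8s://".length)

  // Like the real method, a URL with no scheme at all defaults to https
  // before any scheme parsing happens.
  if (!masterWithoutK8sPrefix.contains("://")) {
    return s"k8s://https://$masterWithoutK8sPrefix"
  }

  // getScheme returns null for "/https://host:port"; Option(...) turns
  // that null into None, so the malformed URL falls through to a clear
  // IllegalArgumentException instead of an NPE.
  val masterScheme = new URI(masterWithoutK8sPrefix).getScheme
  val resolvedURL = Option(masterScheme).map(_.toLowerCase(Locale.ROOT)) match {
    case Some("https") => masterWithoutK8sPrefix
    case Some("http")  => masterWithoutK8sPrefix // Spark logs a warning here
    case _ => // None (null scheme) or an unsupported scheme
      throw new IllegalArgumentException(
        s"Invalid Kubernetes master scheme: $masterScheme found in URL: $masterWithoutK8sPrefix")
  }
  s"k8s://$resolvedURL"
}
```

With this shape, `k8s:///https://host:port` from the bug report raises `IllegalArgumentException` with a readable message rather than a `NullPointerException`.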
[spark] branch branch-2.4 updated (7574d99 -> ff5ba49)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 7574d99  [SPARK-30968][BUILD] Upgrade aws-java-sdk-sts to 1.11.655
     add ff5ba49  [SPARK-30970][K8S][CORE][2.4] Fix NPE while resolving k8s master url

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/util/Utils.scala     | 15 ++-
 .../src/test/scala/org/apache/spark/util/UtilsSuite.scala |  4
 2 files changed, 10 insertions(+), 9 deletions(-)
[spark] branch master updated: [SPARK-28998][SQL][FOLLOW-UP] Remove unnecessary MiMa excludes
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 961c539  [SPARK-28998][SQL][FOLLOW-UP] Remove unnecessary MiMa excludes

961c539 is described below

commit 961c539a676d2646a9315b427ad81852aa81b658
Author: Huaxin Gao
AuthorDate: Fri Feb 28 11:22:08 2020 -0800

    [SPARK-28998][SQL][FOLLOW-UP] Remove unnecessary MiMa excludes

    ### What changes were proposed in this pull request?

    Remove the cases for `MissingTypesProblem`, `InheritedNewAbstractMethodProblem`, `DirectMissingMethodProblem` and `ReversedMissingMethodProblem`.

    ### Why are the changes needed?

    After the changes, we don't have `org.apache.spark.sql.sources.v2` any more, so the only problem we can get is `MissingClassProblem`.

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    Manually tested.

    Closes #27731 from huaxingao/spark-28998-followup.

    Authored-by: Huaxin Gao
    Signed-off-by: Dongjoon Hyun
---
 project/MimaExcludes.scala | 8
 1 file changed, 8 deletions(-)

diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index ccb545d..cd55fa8 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -339,14 +339,6 @@ object MimaExcludes {
       (problem: Problem) => problem match {
         case MissingClassProblem(cls) =>
           !cls.fullName.startsWith("org.apache.spark.sql.sources.v2")
-        case MissingTypesProblem(newCls, _) =>
-          !newCls.fullName.startsWith("org.apache.spark.sql.sources.v2")
-        case InheritedNewAbstractMethodProblem(cls, _) =>
-          !cls.fullName.startsWith("org.apache.spark.sql.sources.v2")
-        case DirectMissingMethodProblem(meth) =>
-          !meth.owner.fullName.startsWith("org.apache.spark.sql.sources.v2")
-        case ReversedMissingMethodProblem(meth) =>
-          !meth.owner.fullName.startsWith("org.apache.spark.sql.sources.v2")
         case _ => true
       },
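The exclude being trimmed is a predicate over MiMa's problem hierarchy: return `true` to report a binary-compatibility problem, `false` to suppress it. A self-contained model of the resulting filter (the case classes below are simplified stand-ins for illustration, not the real `com.typesafe.tools.mima.core` types):

```scala
// Simplified stand-ins for MiMa's problem types; the real ones live in
// com.typesafe.tools.mima.core and carry far more information.
sealed trait Problem
case class MissingClassProblem(fullName: String) extends Problem
case class DirectMissingMethodProblem(ownerFullName: String) extends Problem

// After the cleanup only MissingClassProblem needs a special case:
// report (keep) every problem unless it is a missing class under the
// removed org.apache.spark.sql.sources.v2 package.
def keepProblem(problem: Problem): Boolean = problem match {
  case MissingClassProblem(name) =>
    !name.startsWith("org.apache.spark.sql.sources.v2")
  case _ => true
}
```

Since the whole `org.apache.spark.sql.sources.v2` package was deleted, missing-class reports are the only kind of problem it can still trigger, which is why the method- and type-level cases became dead code.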
[spark] branch branch-3.0 updated: [SPARK-28998][SQL][FOLLOW-UP] Remove unnecessary MiMa excludes
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 2342e28  [SPARK-28998][SQL][FOLLOW-UP] Remove unnecessary MiMa excludes

2342e28 is described below

commit 2342e280a57a108ff2327aae7157d85065016244
Author: Huaxin Gao
AuthorDate: Fri Feb 28 11:22:08 2020 -0800

    [SPARK-28998][SQL][FOLLOW-UP] Remove unnecessary MiMa excludes

    ### What changes were proposed in this pull request?

    Remove the cases for `MissingTypesProblem`, `InheritedNewAbstractMethodProblem`, `DirectMissingMethodProblem` and `ReversedMissingMethodProblem`.

    ### Why are the changes needed?

    After the changes, we don't have `org.apache.spark.sql.sources.v2` any more, so the only problem we can get is `MissingClassProblem`.

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    Manually tested.

    Closes #27731 from huaxingao/spark-28998-followup.

    Authored-by: Huaxin Gao
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 961c539a676d2646a9315b427ad81852aa81b658)
    Signed-off-by: Dongjoon Hyun
---
 project/MimaExcludes.scala | 8
 1 file changed, 8 deletions(-)

diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala
index 23f33a6..7f66577 100644
--- a/project/MimaExcludes.scala
+++ b/project/MimaExcludes.scala
@@ -335,14 +335,6 @@ object MimaExcludes {
       (problem: Problem) => problem match {
         case MissingClassProblem(cls) =>
           !cls.fullName.startsWith("org.apache.spark.sql.sources.v2")
-        case MissingTypesProblem(newCls, _) =>
-          !newCls.fullName.startsWith("org.apache.spark.sql.sources.v2")
-        case InheritedNewAbstractMethodProblem(cls, _) =>
-          !cls.fullName.startsWith("org.apache.spark.sql.sources.v2")
-        case DirectMissingMethodProblem(meth) =>
-          !meth.owner.fullName.startsWith("org.apache.spark.sql.sources.v2")
-        case ReversedMissingMethodProblem(meth) =>
-          !meth.owner.fullName.startsWith("org.apache.spark.sql.sources.v2")
         case _ => true
       },
[spark] branch branch-3.0 updated: [SPARK-30977][CORE][3.0] Make ResourceProfile and ResourceProfileBuilder private
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 4cac4a5  [SPARK-30977][CORE][3.0] Make ResourceProfile and ResourceProfileBuilder private

4cac4a5 is described below

commit 4cac4a50e0bd81edc7a6a18674a64045c7c247bb
Author: Thomas Graves
AuthorDate: Fri Feb 28 18:12:20 2020 -0800

    [SPARK-30977][CORE][3.0] Make ResourceProfile and ResourceProfileBuilder private

    ### What changes were proposed in this pull request?

    Make the ResourceProfile and ResourceProfileBuilder APIs private since the entire feature didn't make 3.0.

    ### Why are the changes needed?

    To avoid exposing them to users too early.

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    Unit tests.

    Closes #27737 from tgravescs/SPARK-30977.

    Authored-by: Thomas Graves
    Signed-off-by: Dongjoon Hyun
---
 core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala | 5
 .../scala/org/apache/spark/resource/ResourceProfileBuilder.scala    | 5
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala b/core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala
index 14019d2..f3c39d9 100644
--- a/core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala
+++ b/core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala
@@ -34,9 +34,12 @@ import org.apache.spark.internal.config.Python.PYSPARK_EXECUTOR_MEMORY
  * specify executor and task requirements for an RDD that will get applied during a
  * stage. This allows the user to change the resource requirements between stages.
  * This is meant to be immutable so user can't change it after building.
+ *
+ * This api is currently private until the rest of the pieces are in place and then it
+ * will become public.
  */
 @Evolving
-class ResourceProfile(
+private[spark] class ResourceProfile(
     val executorResources: Map[String, ExecutorResourceRequest],
     val taskResources: Map[String, TaskResourceRequest])
   extends Serializable with Logging {

diff --git a/core/src/main/scala/org/apache/spark/resource/ResourceProfileBuilder.scala b/core/src/main/scala/org/apache/spark/resource/ResourceProfileBuilder.scala
index 0d55c17..db1c77d 100644
--- a/core/src/main/scala/org/apache/spark/resource/ResourceProfileBuilder.scala
+++ b/core/src/main/scala/org/apache/spark/resource/ResourceProfileBuilder.scala
@@ -29,9 +29,12 @@ import org.apache.spark.annotation.Evolving
 * A ResourceProfile allows the user to specify executor and task requirements for an RDD
 * that will get applied during a stage. This allows the user to change the resource
 * requirements between stages.
+ *
+ * This api is currently private until the rest of the pieces are in place and then it
+ * will become public.
 */
 @Evolving
-class ResourceProfileBuilder() {
+private[spark] class ResourceProfileBuilder() {

   private val _taskResources = new ConcurrentHashMap[String, TaskResourceRequest]()
   private val _executorResources = new ConcurrentHashMap[String, ExecutorResourceRequest]()
[spark] branch branch-3.0 updated: [SPARK-30977][CORE][3.0] Make ResourceProfile and ResourceProfileBuilder private
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 4cac4a5 [SPARK-30977][CORE][3.0] Make ResourceProfile and ResourceProfileBuilder private 4cac4a5 is described below commit 4cac4a50e0bd81edc7a6a18674a64045c7c247bb Author: Thomas Graves AuthorDate: Fri Feb 28 18:12:20 2020 -0800 [SPARK-30977][CORE][3.0] Make ResourceProfile and ResourceProfileBuilder private ### What changes were proposed in this pull request? Make the ResourceProfile and ResourceProfileBuilder apis private since the entire feature didn't make 3.0. ### Why are the changes needed? to not expose to user to early. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? unit tests Closes #27737 from tgravescs/SPARK-30977. Authored-by: Thomas Graves Signed-off-by: Dongjoon Hyun --- core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala | 5 - .../scala/org/apache/spark/resource/ResourceProfileBuilder.scala | 5 - 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala b/core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala index 14019d2..f3c39d9 100644 --- a/core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala +++ b/core/src/main/scala/org/apache/spark/resource/ResourceProfile.scala @@ -34,9 +34,12 @@ import org.apache.spark.internal.config.Python.PYSPARK_EXECUTOR_MEMORY * specify executor and task requirements for an RDD that will get applied during a * stage. This allows the user to change the resource requirements between stages. * This is meant to be immutable so user can't change it after building. + * + * This api is currently private until the rest of the pieces are in place and then it + * will become public. 
*/ @Evolving -class ResourceProfile( +private[spark] class ResourceProfile( val executorResources: Map[String, ExecutorResourceRequest], val taskResources: Map[String, TaskResourceRequest]) extends Serializable with Logging { diff --git a/core/src/main/scala/org/apache/spark/resource/ResourceProfileBuilder.scala b/core/src/main/scala/org/apache/spark/resource/ResourceProfileBuilder.scala index 0d55c17..db1c77d 100644 --- a/core/src/main/scala/org/apache/spark/resource/ResourceProfileBuilder.scala +++ b/core/src/main/scala/org/apache/spark/resource/ResourceProfileBuilder.scala @@ -29,9 +29,12 @@ import org.apache.spark.annotation.Evolving * A ResourceProfile allows the user to specify executor and task requirements for an RDD * that will get applied during a stage. This allows the user to change the resource * requirements between stages. + * + * This api is currently private until the rest of the pieces are in place and then it + * will become public. */ @Evolving -class ResourceProfileBuilder() { +private[spark] class ResourceProfileBuilder() { private val _taskResources = new ConcurrentHashMap[String, TaskResourceRequest]() private val _executorResources = new ConcurrentHashMap[String, ExecutorResourceRequest]() - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
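The `private[spark]` modifier in the diff above is Scala's package-qualified access. A minimal sketch of how it hides a class from user code (the class and `demo` package names below are hypothetical stand-ins, not from the Spark codebase):

```scala
// Package-qualified access: `private[spark]` makes the class visible
// anywhere under the org.apache.spark package tree, but not outside it.
package org.apache.spark.resource {
  private[spark] class HiddenProfile {  // hypothetical stand-in for ResourceProfile
    def describe: String = "only reachable from within org.apache.spark"
  }
}

package org.apache.spark.demo {
  object InsideSpark {
    // Compiles: org.apache.spark.demo is under the org.apache.spark tree.
    def use(): String = new org.apache.spark.resource.HiddenProfile().describe
  }
}

// User code in any package outside org.apache.spark cannot write
// `new org.apache.spark.resource.HiddenProfile()` -- it fails to compile,
// which is how this commit hides the API until the feature is complete.
```

Because the restriction is enforced at compile time, no runtime check is needed; the annotation `@Evolving` can stay in place for when the API is later made public.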
[spark] branch master updated (b517f99 -> f0010c8)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b517f99 [SPARK-30969][CORE] Remove resource coordination support from Standalone add f0010c8 [SPARK-31003][TESTS] Fix incorrect uses of assume() in tests No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/catalyst/expressions/OrderingSuite.scala | 2 +- sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala | 2 +- .../scala/org/apache/spark/sql/execution/command/DDLSuite.scala | 4 ++-- .../src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala | 6 +++--- .../test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala | 2 +- .../scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala | 2 +- .../scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala| 6 +++--- .../src/test/scala/org/apache/spark/sql/hive/test/TestHive.scala| 2 +- .../apache/spark/sql/sources/BucketedReadWithHiveSupportSuite.scala | 2 +- .../spark/sql/sources/BucketedWriteWithHiveSupportSuite.scala | 2 +- 10 files changed, 15 insertions(+), 15 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31003][TESTS] Fix incorrect uses of assume() in tests
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 8cb23f0 [SPARK-31003][TESTS] Fix incorrect uses of assume() in tests 8cb23f0 is described below commit 8cb23f0cb5b20b7e49fdd16c52d6451e901d9a7a Author: Josh Rosen AuthorDate: Mon Mar 2 15:20:45 2020 -0800 [SPARK-31003][TESTS] Fix incorrect uses of assume() in tests ### What changes were proposed in this pull request? This patch fixes several incorrect uses of `assume()` in our tests. If a call to `assume(condition)` fails, then it will cause the test to be marked as skipped instead of failed: this feature allows test cases to be skipped if certain prerequisites are missing. For example, we use this to skip certain tests when running on Windows (or when Python dependencies are unavailable). In contrast, `assert(condition)` will fail the test if the condition doesn't hold. If `assume()` is accidentally substituted for `assert()`, then the resulting test will be marked as skipped in cases where it should have failed, undermining the purpose of the test. This patch fixes several such cases, replacing certain `assume()` calls with `assert()`. Credit to ahirreddy for spotting this problem. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing tests. Closes #27754 from JoshRosen/fix-assume-vs-assert. 
Lead-authored-by: Josh Rosen Co-authored-by: Josh Rosen Signed-off-by: Dongjoon Hyun (cherry picked from commit f0010c81e2ef9b8859b39917bb62b48d739a4a22) Signed-off-by: Dongjoon Hyun --- .../org/apache/spark/sql/catalyst/expressions/OrderingSuite.scala | 2 +- sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala | 2 +- .../scala/org/apache/spark/sql/execution/command/DDLSuite.scala | 4 ++-- .../src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala | 6 +++--- .../test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala | 2 +- .../scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala | 2 +- .../scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala| 6 +++--- .../src/test/scala/org/apache/spark/sql/hive/test/TestHive.scala| 2 +- .../apache/spark/sql/sources/BucketedReadWithHiveSupportSuite.scala | 2 +- .../spark/sql/sources/BucketedWriteWithHiveSupportSuite.scala | 2 +- 10 files changed, 15 insertions(+), 15 deletions(-) diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/OrderingSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/OrderingSuite.scala index 94e251d..4488902 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/OrderingSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/OrderingSuite.scala @@ -106,7 +106,7 @@ class OrderingSuite extends SparkFunSuite with ExpressionEvalHelper { StructField("a", dataType, nullable = true) :: StructField("b", dataType, nullable = true) :: Nil) val maybeDataGenerator = RandomDataGenerator.forType(rowType, nullable = false) -assume(maybeDataGenerator.isDefined) +assert(maybeDataGenerator.isDefined) val randGenerator = maybeDataGenerator.get val toCatalyst = CatalystTypeConverters.createToCatalystConverter(rowType) for (_ <- 1 to 50) { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala 
b/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala index cd2c681..8189353 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala @@ -195,7 +195,7 @@ class CachedTableSuite extends QueryTest with SQLTestUtils } test("SPARK-1669: cacheTable should be idempotent") { -assume(!spark.table("testData").logicalPlan.isInstanceOf[InMemoryRelation]) +assert(!spark.table("testData").logicalPlan.isInstanceOf[InMemoryRelation]) spark.catalog.cacheTable("testData") assertCached(spark.table("testData")) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala index 6c824c2..5a67dce 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala @@ -1033,7 +1033,7 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils { df.write.insertInto("students") spark.
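The skip-versus-fail distinction this patch relies on can be sketched in plain Scala. The exception and runner names below are hypothetical simplifications of ScalaTest's actual machinery (where a failed `assume()` throws `TestCanceledException` and a failed `assert()` throws `TestFailedException`):

```scala
// ScalaTest semantics in miniature: a failed assume() cancels (skips) the
// test, while a failed assert() fails it. Substituting one for the other
// silently turns real failures into skips.
final case class TestCanceled(msg: String) extends Exception(msg)
final case class TestFailed(msg: String) extends Exception(msg)

def assumeThat(cond: Boolean): Unit =
  if (!cond) throw TestCanceled("prerequisite missing")  // skip, don't fail
def assertThat(cond: Boolean): Unit =
  if (!cond) throw TestFailed("invariant violated")      // a real failure

def runTest(body: => Unit): String =
  try { body; "PASSED" }
  catch {
    case TestCanceled(_) => "SKIPPED"  // suite stays green
    case TestFailed(_)   => "FAILED"   // suite goes red
  }

println(runTest(assumeThat(false)))  // SKIPPED -- how the bug hid failures
println(runTest(assertThat(false)))  // FAILED  -- the intended outcome
```

This is why each hunk in the diff above is a one-word change: the condition being checked was already correct, only the reporting semantics were wrong.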
[spark] branch branch-2.4 updated (cd8f86a -> 0b71b4d)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from cd8f86a [SPARK-30813][ML] Fix Matrices.sprand comments add 0b71b4d [SPARK-31003][TESTS] Fix incorrect uses of assume() in tests No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/catalyst/expressions/OrderingSuite.scala | 2 +- sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala | 2 +- .../scala/org/apache/spark/sql/execution/command/DDLSuite.scala | 4 ++-- .../src/test/scala/org/apache/spark/sql/internal/CatalogSuite.scala | 6 +++--- .../test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala | 2 +- .../scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala | 2 +- .../src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala| 2 +- .../scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala| 6 +++--- .../apache/spark/sql/sources/BucketedReadWithHiveSupportSuite.scala | 2 +- .../spark/sql/sources/BucketedWriteWithHiveSupportSuite.scala | 2 +- 10 files changed, 15 insertions(+), 15 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (c263c15 -> 4a1d273)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from c263c15 [SPARK-31015][SQL] Star(*) expression fails when used with qualified column names for v2 tables add 4a1d273 [SPARK-30997][SQL] Fix an analysis failure in generators with aggregate functions No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/catalyst/analysis/Analyzer.scala | 14 ++ .../spark/sql/catalyst/analysis/AnalysisErrorSuite.scala | 15 +++ .../org/apache/spark/sql/GeneratorFunctionSuite.scala | 5 + 3 files changed, 34 insertions(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-30997][SQL] Fix an analysis failure in generators with aggregate functions
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 7d853ab [SPARK-30997][SQL] Fix an analysis failure in generators with aggregate functions 7d853ab is described below commit 7d853ab6eba479a7cc5d8839b4fc497bc6b6d4c8 Author: Takeshi Yamamuro AuthorDate: Tue Mar 3 12:25:12 2020 -0800 [SPARK-30997][SQL] Fix an analysis failure in generators with aggregate functions ### What changes were proposed in this pull request? We have supported generators in SQL aggregate expressions since SPARK-28782. However, the generator (explode) query with aggregate functions in the DataFrame API failed as follows; ``` // SPARK-28782: Generator support in aggregate expressions scala> spark.range(3).toDF("id").createOrReplaceTempView("t") scala> sql("select explode(array(min(id), max(id))) from t").show() +---+ |col| +---+ | 0| | 2| +---+ // A failure case handled in this PR scala> spark.range(3).select(explode(array(min($"id"), max($"id")))).show() org.apache.spark.sql.AnalysisException: The query operator `Generate` contains one or more unsupported expression types Aggregate, Window or Generate. 
Invalid expressions: [min(`id`), max(`id`)];; Project [col#46L] +- Generate explode(array(min(id#42L), max(id#42L))), false, [col#46L] +- Range (0, 3, step=1, splits=Some(4)) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:49) at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis$(CheckAnalysis.scala:48) at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:129) ``` The root cause is that `ExtractGenerator` wrongly replaces a project with aggregate functions before `GlobalAggregates` replaces it with an aggregate, as follows; ``` scala> sql("SET spark.sql.optimizer.planChangeLog.level=warn") scala> spark.range(3).select(explode(array(min($"id"), max($"id")))).show() 20/03/01 12:51:58 WARN HiveSessionStateBuilder$$anon$1: === Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences === !'Project [explode(array(min('id), max('id))) AS List()] 'Project [explode(array(min(id#72L), max(id#72L))) AS List()] +- Range (0, 3, step=1, splits=Some(4)) +- Range (0, 3, step=1, splits=Some(4)) 20/03/01 12:51:58 WARN HiveSessionStateBuilder$$anon$1: === Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator === !'Project [explode(array(min(id#72L), max(id#72L))) AS List()] Project [col#76L] !+- Range (0, 3, step=1, splits=Some(4)) +- Generate explode(array(min(id#72L), max(id#72L))), false, [col#76L] ! +- Range (0, 3, step=1, splits=Some(4)) 20/03/01 12:51:58 WARN HiveSessionStateBuilder$$anon$1: === Result of Batch Resolution === !'Project [explode(array(min('id), max('id))) AS List()] Project [col#76L] !+- Range (0, 3, step=1, splits=Some(4)) +- Generate explode(array(min(id#72L), max(id#72L))), false, [col#76L] ! +- Range (0, 3, step=1, splits=Some(4)) // the analysis failed here... ``` To avoid this case in `ExtractGenerator`, this PR adds a condition to ignore generators that contain aggregate functions. 
A correct sequence of rules is as follows; ``` 20/03/01 13:19:06 WARN HiveSessionStateBuilder$$anon$1: === Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveReferences === !'Project [explode(array(min('id), max('id))) AS List()] 'Project [explode(array(min(id#27L), max(id#27L))) AS List()] +- Range (0, 3, step=1, splits=Some(4)) +- Range (0, 3, step=1, splits=Some(4)) 20/03/01 13:19:06 WARN HiveSessionStateBuilder$$anon$1: === Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$GlobalAggregates === !'Project [explode(array(min(id#27L), max(id#27L))) AS List()] 'Aggregate [explode(array(min(id#27L), max(id#27L))) AS List()] +- Range (0, 3, step=1, splits=Some(4)) +- Range (0, 3, step=1, splits=Some(4)) 20/03/01 13:19:06 WARN HiveSessionStateBuilder$$anon$1: === Applying Rule org.apache.spark.sql.catalyst.analysis.Analyzer$ExtractGenerator === !'Aggregate [explode(array(min(id#27L), max(id#27L))) AS List()] 'Project
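For reference, the two equivalent query forms discussed above can be written as follows. This is a sketch that assumes a Spark 3.x session named `spark` with `import spark.implicits._` in scope; it will not compile without Spark on the classpath:

```scala
// Sketch: the SQL form worked before this fix (SPARK-28782); the DataFrame
// form previously hit the AnalysisException quoted above, and with this fix
// both produce the rows 0 and 2 (the min and max of the range).
import org.apache.spark.sql.functions.{array, explode, max, min}

spark.range(3).toDF("id").createOrReplaceTempView("t")

// SQL form -- already supported:
spark.sql("SELECT explode(array(min(id), max(id))) FROM t").show()

// DataFrame form -- the case repaired by SPARK-30997:
spark.range(3).select(explode(array(min($"id"), max($"id")))).show()
```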
[spark] branch master updated (ebcff67 -> 3edab6c)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from ebcff67 [SPARK-30889][SPARK-30913][CORE][DOC] Add version information to the configuration of Tests.scala and Worker add 3edab6c [MINOR][CORE] Expose the alias -c flag of --conf for spark-submit No new revisions were added by this update. Summary of changes: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala | 2 +- docs/configuration.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [MINOR][CORE] Expose the alias -c flag of --conf for spark-submit
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 104a768 [MINOR][CORE] Expose the alias -c flag of --conf for spark-submit 104a768 is described below commit 104a768e242bf5399bce642b9c6295476d9cdad8 Author: Kent Yao AuthorDate: Wed Mar 4 20:37:51 2020 -0800 [MINOR][CORE] Expose the alias -c flag of --conf for spark-submit ### What changes were proposed in this pull request? `-c` is short for `--conf`; it has been available since v1.1.0 but was hidden from users until now. ### Why are the changes needed? To expose the hidden feature. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? N/A Closes #27802 from yaooqinn/conf. Authored-by: Kent Yao Signed-off-by: Dongjoon Hyun (cherry picked from commit 3edab6cc1d70c102093e973a2cf97208db19be8c) Signed-off-by: Dongjoon Hyun --- core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala | 2 +- docs/configuration.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala index 3f7cfea..3090a3b 100644 --- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala +++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala @@ -513,7 +513,7 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S | directory of each executor. File paths of these files | in executors can be accessed via SparkFiles.get(fileName). | -| --conf PROP=VALUE Arbitrary Spark configuration property. +| --conf, -c PROP=VALUE Arbitrary Spark configuration property. | --properties-file FILE Path to a file from which to load extra properties. If not | specified, this will look for conf/spark-defaults.conf. 
| diff --git a/docs/configuration.md b/docs/configuration.md index 5e6fe93..f7b7e16 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -95,7 +95,7 @@ Then, you can supply configuration values at runtime: The Spark shell and [`spark-submit`](submitting-applications.html) tool support two ways to load configurations dynamically. The first is command line options, -such as `--master`, as shown above. `spark-submit` can accept any Spark property using the `--conf` +such as `--master`, as shown above. `spark-submit` can accept any Spark property using the `--conf/-c` flag, but uses special flags for properties that play a part in launching the Spark application. Running `./bin/spark-submit --help` will show the entire list of these options. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [MINOR][CORE] Expose the alias -c flag of --conf for spark-submit
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 1c17ede [MINOR][CORE] Expose the alias -c flag of --conf for spark-submit 1c17ede is described below commit 1c17ede75082fe56d3c5aedc14ac6246fdf3b333 Author: Kent Yao AuthorDate: Wed Mar 4 20:37:51 2020 -0800 [MINOR][CORE] Expose the alias -c flag of --conf for spark-submit ### What changes were proposed in this pull request? `-c` is short for `--conf`; it has been available since v1.1.0 but was hidden from users until now. ### Why are the changes needed? To expose the hidden feature. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? N/A Closes #27802 from yaooqinn/conf. Authored-by: Kent Yao Signed-off-by: Dongjoon Hyun (cherry picked from commit 3edab6cc1d70c102093e973a2cf97208db19be8c) Signed-off-by: Dongjoon Hyun --- core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala | 2 +- docs/configuration.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala index 974a0b7..3d489a3 100644 --- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala +++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala @@ -541,7 +541,7 @@ private[deploy] class SparkSubmitArguments(args: Seq[String], env: Map[String, S | directory of each executor. File paths of these files | in executors can be accessed via SparkFiles.get(fileName). | -| --conf PROP=VALUE Arbitrary Spark configuration property. +| --conf, -c PROP=VALUE Arbitrary Spark configuration property. | --properties-file FILE Path to a file from which to load extra properties. If not | specified, this will look for conf/spark-defaults.conf. 
| diff --git a/docs/configuration.md b/docs/configuration.md index 1582082..6bb1bda 100644 --- a/docs/configuration.md +++ b/docs/configuration.md @@ -80,7 +80,7 @@ Then, you can supply configuration values at runtime: The Spark shell and [`spark-submit`](submitting-applications.html) tool support two ways to load configurations dynamically. The first is command line options, -such as `--master`, as shown above. `spark-submit` can accept any Spark property using the `--conf` +such as `--master`, as shown above. `spark-submit` can accept any Spark property using the `--conf/-c` flag, but uses special flags for properties that play a part in launching the Spark application. Running `./bin/spark-submit --help` will show the entire list of these options. - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
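The change above makes `-c` a documented alias of `--conf`: two spellings of the same flag that both collect `PROP=VALUE` pairs. As a rough analogue (illustrative only — spark-submit's actual parsing lives in the Scala `SparkSubmitArguments` class shown in the diff), Python's `argparse` maps a short and a long option name to the same destination:

```python
import argparse

# Hypothetical parser mimicking spark-submit's --conf/-c alias: both
# spellings append to the same list of PROP=VALUE configuration pairs.
parser = argparse.ArgumentParser()
parser.add_argument("-c", "--conf", action="append", default=[],
                    metavar="PROP=VALUE",
                    help="Arbitrary Spark configuration property.")

args = parser.parse_args(["--conf", "spark.app.name=demo",
                          "-c", "spark.eventLog.enabled=true"])
assert args.conf == ["spark.app.name=demo", "spark.eventLog.enabled=true"]
```

Either spelling works on the command line; the help text only needs to advertise both, which is exactly what the one-line diff to `SparkSubmitArguments.scala` does.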
[spark] branch master updated: [SPARK-31050][TEST] Disable flaky `Roundtrip` test in KafkaDelegationTokenSuite
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 0a22f19 [SPARK-31050][TEST] Disable flaky `Roundtrip` test in KafkaDelegationTokenSuite 0a22f19 is described below commit 0a22f1966466629cb745d000a0608d521fece093 Author: yi.wu AuthorDate: Thu Mar 5 00:21:32 2020 -0800 [SPARK-31050][TEST] Disable flaky `Roundtrip` test in KafkaDelegationTokenSuite ### What changes were proposed in this pull request? Disable test `KafkaDelegationTokenSuite`. ### Why are the changes needed? `KafkaDelegationTokenSuite` is too flaky. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Pass Jenkins. Closes #27789 from Ngone51/retry_kafka. Authored-by: yi.wu Signed-off-by: Dongjoon Hyun --- .../scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala index 3064838..79239e5 100644 --- a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala +++ b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala @@ -62,7 +62,7 @@ class KafkaDelegationTokenSuite extends StreamTest with SharedSparkSession with } } - test("Roundtrip") { + ignore("Roundtrip") { val hadoopConf = new Configuration() val manager = new HadoopDelegationTokenManager(spark.sparkContext.conf, hadoopConf, null) val credentials = new Credentials() - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
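The one-line `test("Roundtrip")` to `ignore("Roundtrip")` rename is ScalaTest's idiom for keeping a flaky test in the codebase while skipping it at run time. A rough Python analogue (illustrative only; class and reason strings are hypothetical) is `unittest.skip`, which likewise reports the test as skipped rather than deleting it:

```python
import unittest

class KafkaDelegationTokenLikeSuite(unittest.TestCase):
    # Analogue of renaming test("Roundtrip") to ignore("Roundtrip"):
    # the body stays in place but is reported as skipped, not executed.
    @unittest.skip("flaky; see SPARK-31050")
    def test_roundtrip(self):
        raise AssertionError("would be flaky if executed")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(KafkaDelegationTokenLikeSuite)
result = unittest.TextTestRunner(verbosity=0).run(suite)
assert len(result.skipped) == 1 and not result.failures and not result.errors
```

Keeping the body intact makes it trivial to re-enable the test once the flakiness is resolved, and the skip shows up in test reports instead of silently disappearing.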
[spark] branch branch-3.0 updated: [SPARK-31050][TEST] Disable flaky `Roundtrip` test in KafkaDelegationTokenSuite
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 9cea92b [SPARK-31050][TEST] Disable flaky `Roundtrip` test in KafkaDelegationTokenSuite 9cea92b is described below commit 9cea92b6f2c2fc8e0effcec710e6ff6e8a7c965f Author: yi.wu AuthorDate: Thu Mar 5 00:21:32 2020 -0800 [SPARK-31050][TEST] Disable flaky `Roundtrip` test in KafkaDelegationTokenSuite ### What changes were proposed in this pull request? Disable test `KafkaDelegationTokenSuite`. ### Why are the changes needed? `KafkaDelegationTokenSuite` is too flaky. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Pass Jenkins. Closes #27789 from Ngone51/retry_kafka. Authored-by: yi.wu Signed-off-by: Dongjoon Hyun (cherry picked from commit 0a22f1966466629cb745d000a0608d521fece093) Signed-off-by: Dongjoon Hyun --- .../scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala index 3064838..79239e5 100644 --- a/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala +++ b/external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaDelegationTokenSuite.scala @@ -62,7 +62,7 @@ class KafkaDelegationTokenSuite extends StreamTest with SharedSparkSession with } } - test("Roundtrip") { + ignore("Roundtrip") { val hadoopConf = new Configuration() val manager = new HadoopDelegationTokenManager(spark.sparkContext.conf, hadoopConf, null) val credentials = new Credentials() - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, 
e-mail: commits-h...@spark.apache.org
[spark] branch master updated (d705d36 -> afb84e9)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d705d36 [SPARK-31045][SQL] Add config for AQE logging level add afb84e9 [SPARK-30886][SQL] Deprecate two-parameter TRIM/LTRIM/RTRIM functions No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/analysis/Analyzer.scala | 20 ++--- .../sql/catalyst/analysis/AnalysisSuite.scala | 52 ++ 2 files changed, 66 insertions(+), 6 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-30886][SQL] Deprecate two-parameter TRIM/LTRIM/RTRIM functions
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 1535b2b [SPARK-30886][SQL] Deprecate two-parameter TRIM/LTRIM/RTRIM functions 1535b2b is described below commit 1535b2bb51782a89b271b1ebe53273ab610b390b Author: Dongjoon Hyun AuthorDate: Thu Mar 5 20:09:39 2020 -0800 [SPARK-30886][SQL] Deprecate two-parameter TRIM/LTRIM/RTRIM functions ### What changes were proposed in this pull request? This PR aims to show a deprecation warning on two-parameter TRIM/LTRIM/RTRIM function usages based on the community decision. - https://lists.apache.org/thread.html/r48b6c2596ab06206b7b7fd4bbafd4099dccd4e2cf9801aaa9034c418%40%3Cdev.spark.apache.org%3E ### Why are the changes needed? For backward compatibility, SPARK-28093 is reverted. However, from Apache Spark 3.0.0, we should give a safe guideline to use SQL syntax instead of the esoteric function signatures. ### Does this PR introduce any user-facing change? Yes. This shows a directional warning. ### How was this patch tested? Pass the Jenkins with a newly added test case. Closes #27643 from dongjoon-hyun/SPARK-30886. 
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun (cherry picked from commit afb84e9d378003c57cd01d21cdb1a977ba25454b) Signed-off-by: Dongjoon Hyun --- .../spark/sql/catalyst/analysis/Analyzer.scala | 20 ++--- .../sql/catalyst/analysis/AnalysisSuite.scala | 52 ++ 2 files changed, 66 insertions(+), 6 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index 3cb754d..eadcd0f 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -19,6 +19,7 @@ package org.apache.spark.sql.catalyst.analysis import java.util import java.util.Locale +import java.util.concurrent.atomic.AtomicBoolean import scala.collection.mutable import scala.collection.mutable.ArrayBuffer @@ -1795,6 +1796,7 @@ class Analyzer( * Replaces [[UnresolvedFunction]]s with concrete [[Expression]]s. */ object ResolveFunctions extends Rule[LogicalPlan] { +val trimWarningEnabled = new AtomicBoolean(true) def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp { case q: LogicalPlan => q transformExpressions { @@ -1839,13 +1841,19 @@ class Analyzer( } AggregateExpression(agg, Complete, isDistinct, filter) // This function is not an aggregate function, just return the resolved one. -case other => - if (isDistinct || filter.isDefined) { -failAnalysis("DISTINCT or FILTER specified, " + - s"but ${other.prettyName} is not an aggregate function") - } else { -other +case other if (isDistinct || filter.isDefined) => + failAnalysis("DISTINCT or FILTER specified, " + +s"but ${other.prettyName} is not an aggregate function") +case e: String2TrimExpression if arguments.size == 2 => + if (trimWarningEnabled.get) { +log.warn("Two-parameter TRIM/LTRIM/RTRIM function signatures are deprecated." 
+ + " Use SQL syntax `TRIM((BOTH | LEADING | TRAILING)? trimStr FROM str)`" + + " instead.") +trimWarningEnabled.set(false) } + e +case other => + other } } } diff --git a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala index d385133..8451b9b 100644 --- a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala +++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala @@ -21,6 +21,7 @@ import java.util.{Locale, TimeZone} import scala.reflect.ClassTag +import org.apache.log4j.Level import org.scalatest.Matchers import org.apache.spark.api.python.PythonEvalType @@ -768,4 +769,55 @@ class AnalysisSuite extends AnalysisTest with Matchers { assert(message.startsWith(s"Max iterations ($maxIterations) reached for batch Resolution, " + s"please set '${SQLConf.ANALYZER_MAX_ITERATIONS.key}&
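The `AtomicBoolean` guard (`trimWarningEnabled`) in the diff above makes the deprecation warning fire at most once, even when many two-parameter TRIM calls are resolved. A minimal Python sketch of the same once-only pattern (illustrative analogue; a lock stands in for the atomic flag, and the class name is hypothetical):

```python
import threading

class DeprecationWarner:
    """Emit a deprecation message at most once, like the AtomicBoolean
    trimWarningEnabled flag guarding the TRIM/LTRIM/RTRIM warning."""
    def __init__(self):
        self._enabled = True
        self._lock = threading.Lock()  # stands in for AtomicBoolean's atomicity
        self.emitted = []

    def warn(self, message):
        with self._lock:
            if self._enabled:
                self._enabled = False  # flip the flag so later calls are no-ops
                self.emitted.append(message)

warner = DeprecationWarner()
for _ in range(3):
    warner.warn("Two-parameter TRIM/LTRIM/RTRIM function signatures are deprecated.")
assert len(warner.emitted) == 1
```

The flag avoids flooding the logs: the warning is a migration hint for the whole session, not a per-query diagnostic.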
[spark] branch master updated (587266f -> 1426ad8)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 587266f [SPARK-31010][SQL][FOLLOW-UP] Deprecate untyped scala UDF add 1426ad8 [SPARK-23817][FOLLOWUP][TEST] Add OrcV2QuerySuite No new revisions were added by this update. Summary of changes: .../spark/sql/execution/datasources/orc/OrcQuerySuite.scala | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (5375b40 -> 7c09c9f)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 5375b40 [SPARK-31010][SQL][FOLLOW-UP] Deprecate untyped scala UDF add 7c09c9f [SPARK-23817][FOLLOWUP][TEST] Add OrcV2QuerySuite No new revisions were added by this update. Summary of changes: .../spark/sql/execution/datasources/orc/OrcQuerySuite.scala | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31045][SQL][FOLLOWUP][3.0] Fix build due to divergence between master and 3.0
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 9b48f33 [SPARK-31045][SQL][FOLLOWUP][3.0] Fix build due to divergence between master and 3.0 9b48f33 is described below commit 9b48f3358d3efb523715a5f258e5ed83e28692f6 Author: Jungtaek Lim (HeartSaVioR) AuthorDate: Thu Mar 5 21:31:08 2020 -0800 [SPARK-31045][SQL][FOLLOWUP][3.0] Fix build due to divergence between master and 3.0 ### What changes were proposed in this pull request? This patch fixes the build failure in `branch-3.0` due to cherry-picking SPARK-31045 to branch-3.0, as `.version()` is not available in `branch-3.0` yet. ### Why are the changes needed? The build is failing in `branch-3.0`. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Jenkins build will verify. Closes #27826 from HeartSaVioR/SPARK-31045-branch-3.0-FOLLOWUP. Authored-by: Jungtaek Lim (HeartSaVioR) Signed-off-by: Dongjoon Hyun --- sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala | 1 - 1 file changed, 1 deletion(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala index cd465bc..fdaf0ec 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala @@ -382,7 +382,6 @@ object SQLConf { .internal() .doc("Configures the log level for adaptive execution logging of plan changes. The value " + "can be 'trace', 'debug', 'info', 'warn', or 'error'. 
The default log level is 'debug'.") -.version("3.0.0") .stringConf .transform(_.toUpperCase(Locale.ROOT)) .checkValues(Set("TRACE", "DEBUG", "INFO", "WARN", "ERROR")) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (d73ea97 -> 895ddde)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from d73ea97 [SPARK-31012][ML][PYSPARK][DOCS] Updating ML API docs for 3.0 changes add 895ddde [SPARK-31014][CORE][3.0] InMemoryStore: remove key from parentToChildrenMap when removing key from CountingRemoveIfForEach No new revisions were added by this update. Summary of changes: .../apache/spark/util/kvstore/InMemoryStore.java | 30 +++--- 1 file changed, 21 insertions(+), 9 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
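SPARK-31014's fix keeps a secondary parent-to-children index consistent when an entry is removed from the primary map. A dependency-free sketch of the invariant (hypothetical names; the real `InMemoryStore` is Java and considerably more involved):

```python
from collections import defaultdict

store = {}                              # key -> parent
parent_to_children = defaultdict(set)   # parent -> set of child keys

def put(key, parent):
    store[key] = parent
    parent_to_children[parent].add(key)

def remove(key):
    parent = store.pop(key)
    children = parent_to_children[parent]
    children.discard(key)               # the fix: also drop the index entry
    if not children:                    # avoid leaking empty index buckets
        del parent_to_children[parent]

put("task-1", "stage-0")
put("task-2", "stage-0")
remove("task-1")
assert parent_to_children["stage-0"] == {"task-2"}
remove("task-2")
assert "stage-0" not in parent_to_children
```

Forgetting the `discard` step leaves stale child keys in the index after removal, which is the class of leak the patch addresses.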
[spark] branch master updated: [SPARK-31053][SQL] mark connector APIs as Evolving
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 1aa1847 [SPARK-31053][SQL] mark connector APIs as Evolving 1aa1847 is described below commit 1aa184763aa49d70907669b2d8af5a713ee0d7fa Author: Wenchen Fan AuthorDate: Sun Mar 8 11:41:09 2020 -0700 [SPARK-31053][SQL] mark connector APIs as Evolving ### What changes were proposed in this pull request? The newly added catalog APIs are marked as Experimental but other DS v2 APIs are marked as Evolving. This PR makes it consistent and mark all Connector APIs as Evolving. ### Why are the changes needed? For consistency. ### Does this PR introduce any user-facing change? no ### How was this patch tested? N/A Closes #27811 from cloud-fan/tag. Authored-by: Wenchen Fan Signed-off-by: Dongjoon Hyun --- .../java/org/apache/spark/sql/connector/catalog/CatalogExtension.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java| 4 ++-- .../spark/sql/connector/catalog/DelegatingCatalogExtension.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/catalog/Identifier.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/IdentifierImpl.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/NamespaceChange.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/catalog/StagedTable.java | 4 ++-- .../org/apache/spark/sql/connector/catalog/StagingTableCatalog.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/SupportsDelete.java | 4 ++-- .../org/apache/spark/sql/connector/catalog/SupportsNamespaces.java| 4 ++-- .../java/org/apache/spark/sql/connector/catalog/SupportsRead.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/SupportsWrite.java| 4 ++-- .../java/org/apache/spark/sql/connector/catalog/TableCapability.java | 4 ++-- 
.../java/org/apache/spark/sql/connector/catalog/TableCatalog.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/catalog/TableChange.java | 4 ++-- .../java/org/apache/spark/sql/connector/expressions/Expression.java | 4 ++-- .../java/org/apache/spark/sql/connector/expressions/Expressions.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/expressions/Literal.java | 4 ++-- .../org/apache/spark/sql/connector/expressions/NamedReference.java| 4 ++-- .../java/org/apache/spark/sql/connector/expressions/Transform.java| 4 ++-- .../apache/spark/sql/connector/write/SupportsDynamicOverwrite.java| 3 +++ .../java/org/apache/spark/sql/connector/write/SupportsOverwrite.java | 2 ++ .../java/org/apache/spark/sql/connector/write/SupportsTruncate.java | 3 +++ 23 files changed, 48 insertions(+), 40 deletions(-) diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogExtension.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogExtension.java index 61cb83c..155dca5 100644 --- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogExtension.java +++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogExtension.java @@ -17,7 +17,7 @@ package org.apache.spark.sql.connector.catalog; -import org.apache.spark.annotation.Experimental; +import org.apache.spark.annotation.Evolving; import org.apache.spark.sql.util.CaseInsensitiveStringMap; /** @@ -29,7 +29,7 @@ import org.apache.spark.sql.util.CaseInsensitiveStringMap; * * @since 3.0.0 */ -@Experimental +@Evolving public interface CatalogExtension extends TableCatalog, SupportsNamespaces { /** diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java index 2958538..8ca4f56 100644 --- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java +++ 
b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java @@ -17,7 +17,7 @@ package org.apache.spark.sql.connector.catalog; -import org.apache.spark.annotation.Experimental; +import org.apache.spark.annotation.Evolving; import org.apache.spark.sql.internal.SQLConf; import org.apache.spark.sql.util.CaseInsensitiveStringMap; @@ -41,7 +41,7 @@ import org.apache.spark.sql.util.CaseInsensitiveStringMap; * * @since 3.0.0 */ -@Experimental +@Evolving public interface CatalogPlugin { /** * Called to initialize configuration. diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/DelegatingCatalogExtension.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/DelegatingCatalogExtension.java index 5a51959..d07d299 100644 --- a/sql/catalyst/src/main/java/org/apache
[spark] branch master updated (f8a3730 -> 1aa1847)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from f8a3730 [SPARK-30841][SQL][DOC][FOLLOW-UP] Add version information to the configuration of SQL add 1aa1847 [SPARK-31053][SQL] mark connector APIs as Evolving No new revisions were added by this update. Summary of changes: .../java/org/apache/spark/sql/connector/catalog/CatalogExtension.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java| 4 ++-- .../spark/sql/connector/catalog/DelegatingCatalogExtension.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/catalog/Identifier.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/IdentifierImpl.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/NamespaceChange.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/catalog/StagedTable.java | 4 ++-- .../org/apache/spark/sql/connector/catalog/StagingTableCatalog.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/SupportsDelete.java | 4 ++-- .../org/apache/spark/sql/connector/catalog/SupportsNamespaces.java| 4 ++-- .../java/org/apache/spark/sql/connector/catalog/SupportsRead.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/SupportsWrite.java| 4 ++-- .../java/org/apache/spark/sql/connector/catalog/TableCapability.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/TableCatalog.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/catalog/TableChange.java | 4 ++-- .../java/org/apache/spark/sql/connector/expressions/Expression.java | 4 ++-- .../java/org/apache/spark/sql/connector/expressions/Expressions.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/expressions/Literal.java | 4 ++-- .../org/apache/spark/sql/connector/expressions/NamedReference.java| 4 ++-- .../java/org/apache/spark/sql/connector/expressions/Transform.java| 4 ++-- .../apache/spark/sql/connector/write/SupportsDynamicOverwrite.java| 
3 +++ .../java/org/apache/spark/sql/connector/write/SupportsOverwrite.java | 2 ++ .../java/org/apache/spark/sql/connector/write/SupportsTruncate.java | 3 +++ 23 files changed, 48 insertions(+), 40 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
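`@Experimental` and `@Evolving` are marker annotations: they document an API's stability contract without changing its behavior. A rough Python analogue using class decorators (illustrative only; Spark's markers are Java annotation types in `org.apache.spark.annotation`):

```python
def evolving(cls):
    """Mark an API as Evolving: usable, but its signature may still
    change between minor releases."""
    cls._api_stability = "Evolving"
    return cls

def experimental(cls):
    """Mark an API as Experimental: may change or be removed at any time."""
    cls._api_stability = "Experimental"
    return cls

@evolving
class CatalogPluginLike:
    """Hypothetical stand-in for a DS v2 connector interface."""

assert CatalogPluginLike._api_stability == "Evolving"
```

Because the marker carries no runtime logic, switching every connector interface from `@Experimental` to `@Evolving` is a pure documentation change, which is why the diff touches only import lines and annotation lines.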
[spark] branch branch-3.0 updated: [SPARK-31053][SQL] mark connector APIs as Evolving
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 4287b03 [SPARK-31053][SQL] mark connector APIs as Evolving 4287b03 is described below commit 4287b03a9c564eb2bdb4dfd93ea78728e3a9e440 Author: Wenchen Fan AuthorDate: Sun Mar 8 11:41:09 2020 -0700 [SPARK-31053][SQL] mark connector APIs as Evolving ### What changes were proposed in this pull request? The newly added catalog APIs are marked as Experimental but other DS v2 APIs are marked as Evolving. This PR makes it consistent and mark all Connector APIs as Evolving. ### Why are the changes needed? For consistency. ### Does this PR introduce any user-facing change? no ### How was this patch tested? N/A Closes #27811 from cloud-fan/tag. Authored-by: Wenchen Fan Signed-off-by: Dongjoon Hyun (cherry picked from commit 1aa184763aa49d70907669b2d8af5a713ee0d7fa) Signed-off-by: Dongjoon Hyun --- .../java/org/apache/spark/sql/connector/catalog/CatalogExtension.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java| 4 ++-- .../spark/sql/connector/catalog/DelegatingCatalogExtension.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/catalog/Identifier.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/IdentifierImpl.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/NamespaceChange.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/catalog/StagedTable.java | 4 ++-- .../org/apache/spark/sql/connector/catalog/StagingTableCatalog.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/SupportsDelete.java | 4 ++-- .../org/apache/spark/sql/connector/catalog/SupportsNamespaces.java| 4 ++-- .../java/org/apache/spark/sql/connector/catalog/SupportsRead.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/SupportsWrite.java| 4 ++-- 
.../java/org/apache/spark/sql/connector/catalog/TableCapability.java | 4 ++-- .../java/org/apache/spark/sql/connector/catalog/TableCatalog.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/catalog/TableChange.java | 4 ++-- .../java/org/apache/spark/sql/connector/expressions/Expression.java | 4 ++-- .../java/org/apache/spark/sql/connector/expressions/Expressions.java | 4 ++-- .../main/java/org/apache/spark/sql/connector/expressions/Literal.java | 4 ++-- .../org/apache/spark/sql/connector/expressions/NamedReference.java| 4 ++-- .../java/org/apache/spark/sql/connector/expressions/Transform.java| 4 ++-- .../apache/spark/sql/connector/write/SupportsDynamicOverwrite.java| 3 +++ .../java/org/apache/spark/sql/connector/write/SupportsOverwrite.java | 2 ++ .../java/org/apache/spark/sql/connector/write/SupportsTruncate.java | 3 +++ 23 files changed, 48 insertions(+), 40 deletions(-) diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogExtension.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogExtension.java index 61cb83c..155dca5 100644 --- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogExtension.java +++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogExtension.java @@ -17,7 +17,7 @@ package org.apache.spark.sql.connector.catalog; -import org.apache.spark.annotation.Experimental; +import org.apache.spark.annotation.Evolving; import org.apache.spark.sql.util.CaseInsensitiveStringMap; /** @@ -29,7 +29,7 @@ import org.apache.spark.sql.util.CaseInsensitiveStringMap; * * @since 3.0.0 */ -@Experimental +@Evolving public interface CatalogExtension extends TableCatalog, SupportsNamespaces { /** diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java index 2958538..8ca4f56 100644 --- 
a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java +++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/CatalogPlugin.java @@ -17,7 +17,7 @@ package org.apache.spark.sql.connector.catalog; -import org.apache.spark.annotation.Experimental; +import org.apache.spark.annotation.Evolving; import org.apache.spark.sql.internal.SQLConf; import org.apache.spark.sql.util.CaseInsensitiveStringMap; @@ -41,7 +41,7 @@ import org.apache.spark.sql.util.CaseInsensitiveStringMap; * * @since 3.0.0 */ -@Experimental +@Evolving public interface CatalogPlugin { /** * Called to initialize configuration. diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/DelegatingCatalogExtension.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector
[spark] branch master updated (1aa1847 -> 068bdd4)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 1aa1847 [SPARK-31053][SQL] mark connector APIs as Evolving add 068bdd4 [SPARK-31073][WEBUI] Add "shuffle write time" to task metrics summary in StagePage No new revisions were added by this update. Summary of changes: .../org/apache/spark/ui/static/stagepage.js| 41 ++ .../spark/ui/static/stagespage-template.html | 2 +- .../scala/org/apache/spark/ui/jobs/StagePage.scala | 2 +- 3 files changed, 29 insertions(+), 16 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d21aab4 [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields d21aab4 is described below commit d21aab403a0a32e8b705b38874c0b335e703bd5d Author: Liang-Chi Hsieh AuthorDate: Mon Mar 9 11:06:45 2020 -0700 [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields ### What changes were proposed in this pull request? Adding a note to document `Row.asDict` behavior when there are duplicate fields. ### Why are the changes needed? When a row contains duplicate fields, `asDict` and `_get_item_` behaves differently. We should document it to let users know the difference explicitly. ### Does this PR introduce any user-facing change? No. Only document change. ### How was this patch tested? Existing test. Closes #27853 from viirya/SPARK-30941. Authored-by: Liang-Chi Hsieh Signed-off-by: Dongjoon Hyun --- python/pyspark/sql/types.py | 6 ++ 1 file changed, 6 insertions(+) diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py index a5302e7..320a68d 100644 --- a/python/pyspark/sql/types.py +++ b/python/pyspark/sql/types.py @@ -1528,6 +1528,12 @@ class Row(tuple): :param recursive: turns the nested Rows to dict (default: False). +.. note:: If a row contains duplicate field names, e.g., the rows of a join +between two :class:`DataFrame` that both have the fields of same names, +one of the duplicate fields will be selected by ``asDict``. ``__getitem__`` +will also return one of the duplicate fields, however returned value might +be different to ``asDict``. 
+ >>> Row(name="Alice", age=11).asDict() == {'name': 'Alice', 'age': 11} True >>> row = Row(key=1, value=Row(name='a', age=2)) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
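The caveat the docstring adds can be seen with a plain-Python sketch. This only mimics the behavior being documented (dict construction keeps one entry per duplicate key) and is not PySpark's actual `Row.asDict` implementation:

```python
# Columns after, e.g., a self-join can carry duplicate field names.
fields = ["id", "name", "id"]
values = [1, "Alice", 2]

# Building a dict keeps only one value per duplicate name -- later pairs
# overwrite earlier ones -- which is why asDict() silently drops a column.
as_dict = dict(zip(fields, values))
print(as_dict)                        # {'id': 2, 'name': 'Alice'}

# Positional lookup, by contrast, resolves to the *first* "id", so the value
# it returns can differ from the one asDict() kept.
print(values[fields.index("id")])     # 1
```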
[spark] branch master updated (b6b0343 -> d21aab4)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from b6b0343 [SPARK-30929][ML] ML, GraphX 3.0 QA: API: New Scala APIs, docs add d21aab4 [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields No new revisions were added by this update. Summary of changes: python/pyspark/sql/types.py | 6 ++ 1 file changed, 6 insertions(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 2e0d2b9 [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields 2e0d2b9 is described below commit 2e0d2b96195b0a3772225501a703fb02304aa346 Author: Liang-Chi Hsieh AuthorDate: Mon Mar 9 11:06:45 2020 -0700 [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields ### What changes were proposed in this pull request? Adding a note to document `Row.asDict` behavior when there are duplicate fields. ### Why are the changes needed? When a row contains duplicate fields, `asDict` and `_get_item_` behaves differently. We should document it to let users know the difference explicitly. ### Does this PR introduce any user-facing change? No. Only document change. ### How was this patch tested? Existing test. Closes #27853 from viirya/SPARK-30941. Authored-by: Liang-Chi Hsieh Signed-off-by: Dongjoon Hyun (cherry picked from commit d21aab403a0a32e8b705b38874c0b335e703bd5d) Signed-off-by: Dongjoon Hyun --- python/pyspark/sql/types.py | 6 ++ 1 file changed, 6 insertions(+) diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py index a5302e7..320a68d 100644 --- a/python/pyspark/sql/types.py +++ b/python/pyspark/sql/types.py @@ -1528,6 +1528,12 @@ class Row(tuple): :param recursive: turns the nested Rows to dict (default: False). +.. note:: If a row contains duplicate field names, e.g., the rows of a join +between two :class:`DataFrame` that both have the fields of same names, +one of the duplicate fields will be selected by ``asDict``. ``__getitem__`` +will also return one of the duplicate fields, however returned value might +be different to ``asDict``. 
+ >>> Row(name="Alice", age=11).asDict() == {'name': 'Alice', 'age': 11} True >>> row = Row(key=1, value=Row(name='a', age=2)) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new f378c7f [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields f378c7f is described below commit f378c7fba29368ca32142a3b7fc169dabe6cb37f Author: Liang-Chi Hsieh AuthorDate: Mon Mar 9 11:06:45 2020 -0700 [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields ### What changes were proposed in this pull request? Adding a note to document `Row.asDict` behavior when there are duplicate fields. ### Why are the changes needed? When a row contains duplicate fields, `asDict` and `_get_item_` behaves differently. We should document it to let users know the difference explicitly. ### Does this PR introduce any user-facing change? No. Only document change. ### How was this patch tested? Existing test. Closes #27853 from viirya/SPARK-30941. Authored-by: Liang-Chi Hsieh Signed-off-by: Dongjoon Hyun (cherry picked from commit d21aab403a0a32e8b705b38874c0b335e703bd5d) Signed-off-by: Dongjoon Hyun --- python/pyspark/sql/types.py | 6 ++ 1 file changed, 6 insertions(+) diff --git a/python/pyspark/sql/types.py b/python/pyspark/sql/types.py index 1d24c40..0d73963 100644 --- a/python/pyspark/sql/types.py +++ b/python/pyspark/sql/types.py @@ -1466,6 +1466,12 @@ class Row(tuple): :param recursive: turns the nested Row as dict (default: False). +.. note:: If a row contains duplicate field names, e.g., the rows of a join +between two :class:`DataFrame` that both have the fields of same names, +one of the duplicate fields will be selected by ``asDict``. ``__getitem__`` +will also return one of the duplicate fields, however returned value might +be different to ``asDict``. 
+ >>> Row(name="Alice", age=11).asDict() == {'name': 'Alice', 'age': 11} True >>> row = Row(key=1, value=Row(name='a', age=2)) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (d21aab4 -> e807118)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from d21aab4 [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields add e807118 [SPARK-31055][DOCS] Update config docs for shuffle local host reads to have dep on external shuffle service No new revisions were added by this update. Summary of changes: core/src/main/scala/org/apache/spark/internal/config/package.scala | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31055][DOCS] Update config docs for shuffle local host reads to have dep on external shuffle service
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 9caf009 [SPARK-31055][DOCS] Update config docs for shuffle local host reads to have dep on external shuffle service 9caf009 is described below commit 9caf009ecd9041e71efe2ef56ca0e75cc94cb56e Author: Thomas Graves AuthorDate: Mon Mar 9 12:17:59 2020 -0700 [SPARK-31055][DOCS] Update config docs for shuffle local host reads to have dep on external shuffle service ### What changes were proposed in this pull request? with SPARK-27651 we now support host local reads for shuffle, but only when external shuffle service is enabled. Update the config docs to state that. ### Why are the changes needed? clarify dependency ### Does this PR introduce any user-facing change? no ### How was this patch tested? n/a Closes #27812 from tgravescs/SPARK-27651-follow. 
Authored-by: Thomas Graves Signed-off-by: Dongjoon Hyun (cherry picked from commit e807118eef9e0214170ff62c828524d237bd58e3) Signed-off-by: Dongjoon Hyun --- core/src/main/scala/org/apache/spark/internal/config/package.scala | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala index 23c31a5..1308a46 100644 --- a/core/src/main/scala/org/apache/spark/internal/config/package.scala +++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala @@ -1135,7 +1135,8 @@ package object config { private[spark] val SHUFFLE_HOST_LOCAL_DISK_READING_ENABLED = ConfigBuilder("spark.shuffle.readHostLocalDisk") - .doc(s"If enabled (and `${SHUFFLE_USE_OLD_FETCH_PROTOCOL.key}` is disabled), shuffle " + + .doc(s"If enabled (and `${SHUFFLE_USE_OLD_FETCH_PROTOCOL.key}` is disabled and external " + +s"shuffle `${SHUFFLE_SERVICE_ENABLED.key}` is enabled), shuffle " + "blocks requested from those block managers which are running on the same host are read " + "from the disk directly instead of being fetched as remote blocks over the network.") .booleanConf - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
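Taken together, the updated doc string means host-local shuffle reads only take effect when three settings line up. A minimal `spark-defaults.conf` sketch (illustrative; the values are the ones the doc text requires, not part of the commit):

```
spark.shuffle.service.enabled        true
spark.shuffle.useOldFetchProtocol    false
spark.shuffle.readHostLocalDisk      true
```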
[spark] branch master updated: [SPARK-31065][SQL] Match schema_of_json to the schema inference of JSON data source
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 815c792 [SPARK-31065][SQL] Match schema_of_json to the schema inference of JSON data source 815c792 is described below commit 815c7929c290d6eed86dc5c924f9f7d48cff179d Author: HyukjinKwon AuthorDate: Tue Mar 10 00:33:32 2020 -0700 [SPARK-31065][SQL] Match schema_of_json to the schema inference of JSON data source ### What changes were proposed in this pull request? This PR proposes two things: 1. Convert `null` to `string` type during schema inference of `schema_of_json` as JSON datasource does. This is a bug fix as well because `null` string is not the proper DDL formatted string and it is unable for SQL parser to recognise it as a type string. We should match it to JSON datasource and return a string type so `schema_of_json` returns a proper DDL formatted string. 2. Let `schema_of_json` respect `dropFieldIfAllNull` option during schema inference. ### Why are the changes needed? To let `schema_of_json` return a proper DDL formatted string, and respect `dropFieldIfAllNull` option. ### Does this PR introduce any user-facing change? Yes, it does. ```scala import collection.JavaConverters._ import org.apache.spark.sql.functions._ spark.range(1).select(schema_of_json(lit("""{"id": ""}"""))).show() spark.range(1).select(schema_of_json(lit("""{"id": "a", "drop": {"drop": null}}"""), Map("dropFieldIfAllNull" -> "true").asJava)).show(false) ``` **Before:** ``` struct struct,id:string> ``` **After:** ``` struct struct ``` ### How was this patch tested? Manually tested, and unittests were added. Closes #27854 from HyukjinKwon/SPARK-31065. 
Authored-by: HyukjinKwon Signed-off-by: Dongjoon Hyun --- .../sql/catalyst/expressions/jsonExpressions.scala | 13 +++- .../spark/sql/catalyst/json/JsonInferSchema.scala | 13 .../org/apache/spark/sql/JsonFunctionsSuite.scala | 36 ++ 3 files changed, 54 insertions(+), 8 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala index aa4b464..4c2a511 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala @@ -777,7 +777,18 @@ case class SchemaOfJson( override def eval(v: InternalRow): Any = { val dt = Utils.tryWithResource(CreateJacksonParser.utf8String(jsonFactory, json)) { parser => parser.nextToken() - jsonInferSchema.inferField(parser) + // To match with schema inference from JSON datasource. 
+ jsonInferSchema.inferField(parser) match { +case st: StructType => + jsonInferSchema.canonicalizeType(st, jsonOptions).getOrElse(StructType(Nil)) +case at: ArrayType if at.elementType.isInstanceOf[StructType] => + jsonInferSchema +.canonicalizeType(at.elementType, jsonOptions) +.map(ArrayType(_, containsNull = at.containsNull)) +.getOrElse(ArrayType(StructType(Nil), containsNull = at.containsNull)) +case other: DataType => + jsonInferSchema.canonicalizeType(other, jsonOptions).getOrElse(StringType) + } } UTF8String.fromString(dt.catalogString) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala index 82dd6d0..3dd8694 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala @@ -92,12 +92,10 @@ private[sql] class JsonInferSchema(options: JSONOptions) extends Serializable { } json.sparkContext.runJob(mergedTypesFromPartitions, foldPartition, mergeResult) -canonicalizeType(rootType, options) match { - case Some(st: StructType) => st - case _ => -// canonicalizeType erases all empty structs, including the only one we want to keep -StructType(Nil) -} +canonicalizeType(rootType, options) + .find(_.isInstanceOf[StructType]) + // canonicalizeType erases all empty structs, including the only one we want to keep + .getOrElse(StructType(Nil)).asInstanceOf[StructType]
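The canonicalization rule the patch aligns `schema_of_json` with can be paraphrased in a small Python sketch. The helper below is hypothetical and far simpler than the JSON datasource's real inference; it only illustrates the key point that a bare `null` canonicalizes to `string`, keeping the resulting DDL string parseable:

```python
import json

def infer_type(value):
    """Toy JSON type inference (hypothetical helper, not Spark's code)."""
    if value is None:
        return "string"            # null -> string, as the JSON datasource does
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, str):
        return "string"
    if isinstance(value, dict):
        fields = ",".join(f"{k}:{infer_type(v)}" for k, v in value.items())
        return f"struct<{fields}>"
    if isinstance(value, list):
        elem = infer_type(value[0]) if value else "string"
        return f"array<{elem}>"
    raise TypeError(f"unsupported JSON value: {value!r}")

# A null field no longer yields an unparseable "null" type in the DDL string.
print(infer_type(json.loads('{"id": null}')))   # struct<id:string>
```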
[spark] branch branch-3.0 updated: [SPARK-31065][SQL] Match schema_of_json to the schema inference of JSON data source
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 0985f13 [SPARK-31065][SQL] Match schema_of_json to the schema inference of JSON data source 0985f13 is described below commit 0985f13bc66a99319820d0d9ba5b3f2a254f61a4 Author: HyukjinKwon AuthorDate: Tue Mar 10 00:33:32 2020 -0700 [SPARK-31065][SQL] Match schema_of_json to the schema inference of JSON data source This PR proposes two things: 1. Convert `null` to `string` type during schema inference of `schema_of_json` as JSON datasource does. This is a bug fix as well because `null` string is not the proper DDL formatted string and it is unable for SQL parser to recognise it as a type string. We should match it to JSON datasource and return a string type so `schema_of_json` returns a proper DDL formatted string. 2. Let `schema_of_json` respect `dropFieldIfAllNull` option during schema inference. To let `schema_of_json` return a proper DDL formatted string, and respect `dropFieldIfAllNull` option. Yes, it does. ```scala import collection.JavaConverters._ import org.apache.spark.sql.functions._ spark.range(1).select(schema_of_json(lit("""{"id": ""}"""))).show() spark.range(1).select(schema_of_json(lit("""{"id": "a", "drop": {"drop": null}}"""), Map("dropFieldIfAllNull" -> "true").asJava)).show(false) ``` **Before:** ``` struct struct,id:string> ``` **After:** ``` struct struct ``` Manually tested, and unittests were added. Closes #27854 from HyukjinKwon/SPARK-31065. 
Authored-by: HyukjinKwon Signed-off-by: Dongjoon Hyun (cherry picked from commit 815c7929c290d6eed86dc5c924f9f7d48cff179d) Signed-off-by: Dongjoon Hyun --- .../sql/catalyst/expressions/jsonExpressions.scala | 13 +++- .../spark/sql/catalyst/json/JsonInferSchema.scala | 13 .../org/apache/spark/sql/JsonFunctionsSuite.scala | 35 ++ 3 files changed, 53 insertions(+), 8 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala index 61afdb6..a63e541 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala @@ -773,7 +773,18 @@ case class SchemaOfJson( override def eval(v: InternalRow): Any = { val dt = Utils.tryWithResource(CreateJacksonParser.utf8String(jsonFactory, json)) { parser => parser.nextToken() - jsonInferSchema.inferField(parser) + // To match with schema inference from JSON datasource. 
+ jsonInferSchema.inferField(parser) match { +case st: StructType => + jsonInferSchema.canonicalizeType(st, jsonOptions).getOrElse(StructType(Nil)) +case at: ArrayType if at.elementType.isInstanceOf[StructType] => + jsonInferSchema +.canonicalizeType(at.elementType, jsonOptions) +.map(ArrayType(_, containsNull = at.containsNull)) +.getOrElse(ArrayType(StructType(Nil), containsNull = at.containsNull)) +case other: DataType => + jsonInferSchema.canonicalizeType(other, jsonOptions).getOrElse(StringType) + } } UTF8String.fromString(dt.catalogString) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala index 82dd6d0..3dd8694 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JsonInferSchema.scala @@ -92,12 +92,10 @@ private[sql] class JsonInferSchema(options: JSONOptions) extends Serializable { } json.sparkContext.runJob(mergedTypesFromPartitions, foldPartition, mergeResult) -canonicalizeType(rootType, options) match { - case Some(st: StructType) => st - case _ => -// canonicalizeType erases all empty structs, including the only one we want to keep -StructType(Nil) -} +canonicalizeType(rootType, options) + .find(_.isInstanceOf[StructType]) + // canonicalizeType erases all empty structs, including the only one we want to keep + .getOrElse(StructType(Nil)).asInstanceOf[StructType] } /** @@ -198,7 +196,8 @@ private[sql] class JsonInferSchema(options: JSONO
[spark] branch master updated (3bd6ebf -> 34be83e)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 3bd6ebf [SPARK-30189][SQL] Interval from year-month/date-time string should handle whitespaces add 34be83e [SPARK-31037][SQL][FOLLOW-UP] Replace legacy ReduceNumShufflePartitions with CoalesceShufflePartitions in comment No new revisions were added by this update. Summary of changes: .../apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala | 6 +++--- .../apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala| 4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31037][SQL][FOLLOW-UP] Replace legacy ReduceNumShufflePartitions with CoalesceShufflePartitions in comment
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 57bf23c  [SPARK-31037][SQL][FOLLOW-UP] Replace legacy ReduceNumShufflePartitions with CoalesceShufflePartitions in comment

57bf23c is described below

commit 57bf23c01b2cffe5011a9d15eb68eff5c28519f4
Author: yi.wu
AuthorDate: Tue Mar 10 11:09:36 2020 -0700

    [SPARK-31037][SQL][FOLLOW-UP] Replace legacy ReduceNumShufflePartitions with CoalesceShufflePartitions in comment

    ### What changes were proposed in this pull request?

    Replace legacy `ReduceNumShufflePartitions` with `CoalesceShufflePartitions` in comments.

    ### Why are the changes needed?

    The rule `ReduceNumShufflePartitions` has been renamed to `CoalesceShufflePartitions`, so we should update the related comments as well.

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    N/A.

    Closes #27865 from Ngone51/spark_31037_followup.

    Authored-by: yi.wu
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 34be83e08b6f5313bdd9d165d3e203d06eff677b)
    Signed-off-by: Dongjoon Hyun
---
 .../apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala | 6 +++---
 .../apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala    | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
index fc88a7f..c1486aa 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala
@@ -97,12 +97,12 @@ case class AdaptiveSparkPlanExec(
   @transient private val queryStageOptimizerRules: Seq[Rule[SparkPlan]] = Seq(
     ReuseAdaptiveSubquery(conf, context.subqueryCache),
     // Here the 'OptimizeSkewedJoin' rule should be executed
-    // before 'ReduceNumShufflePartitions', as the skewed partition handled
-    // in 'OptimizeSkewedJoin' rule, should be omitted in 'ReduceNumShufflePartitions'.
+    // before 'CoalesceShufflePartitions', as the skewed partition handled
+    // in 'OptimizeSkewedJoin' rule, should be omitted in 'CoalesceShufflePartitions'.
     OptimizeSkewedJoin(conf),
     CoalesceShufflePartitions(conf),
     // The rule of 'OptimizeLocalShuffleReader' need to make use of the 'partitionStartIndices'
-    // in 'ReduceNumShufflePartitions' rule. So it must be after 'ReduceNumShufflePartitions' rule.
+    // in 'CoalesceShufflePartitions' rule. So it must be after 'CoalesceShufflePartitions' rule.
     OptimizeLocalShuffleReader(conf),
     ApplyColumnarRulesAndInsertTransitions(conf, context.session.sessionState.columnarRules),
     CollapseCodegenStages(conf)
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
index c3bcce4..4387409 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
@@ -52,7 +52,7 @@ import org.apache.spark.sql.internal.SQLConf
  * (L4-1, R4-1), (L4-2, R4-1), (L4-1, R4-2), (L4-2, R4-2)
  *
  * Note that, when this rule is enabled, it also coalesces non-skewed partitions like
- * `ReduceNumShufflePartitions` does.
+ * `CoalesceShufflePartitions` does.
  */
 case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
@@ -191,7 +191,7 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
     val leftSidePartitions = mutable.ArrayBuffer.empty[ShufflePartitionSpec]
     val rightSidePartitions = mutable.ArrayBuffer.empty[ShufflePartitionSpec]
     // This is used to delay the creation of non-skew partitions so that we can potentially
-    // coalesce them like `ReduceNumShufflePartitions` does.
+    // coalesce them like `CoalesceShufflePartitions` does.
     val nonSkewPartitionIndices = mutable.ArrayBuffer.empty[Int]
     val leftSkewDesc = new SkewDesc
     val rightSkewDesc = new SkewDesc

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
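To make the rule-ordering constraint above concrete, here is a minimal, Spark-independent Python sketch of the idea behind `CoalesceShufflePartitions`: adjacent small shuffle partitions are greedily merged up to a target size, while partitions already handled by the skew-join rule are emitted as-is — which is why `OptimizeSkewedJoin` must run first. The function name and sizes are illustrative, not Spark's API.

```python
def coalesce_partitions(sizes, target, skewed=frozenset()):
    """Greedily merge adjacent shuffle partitions up to `target` bytes.

    `skewed` holds partition indices already split by the skew-join rule;
    they are kept as singleton groups and never merged.
    """
    groups, current, current_size = [], [], 0
    for i, size in enumerate(sizes):
        if i in skewed:
            # Flush any pending group, then emit the skewed partition alone.
            if current:
                groups.append(current)
                current, current_size = [], 0
            groups.append([i])
            continue
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(i)
        current_size += size
    if current:
        groups.append(current)
    return groups

# 5 partitions; index 2 is skewed and must stay separate.
print(coalesce_partitions([10, 20, 500, 5, 5], target=64, skewed={2}))
# → [[0, 1], [2], [3, 4]]
```

If the coalescing rule ran before the skew rule, partition 2 could be merged with its small neighbors and the skew handling would have nothing left to split.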
[spark] branch master updated (3bd6ebf -> 34be83e)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 3bd6ebf  [SPARK-30189][SQL] Interval from year-month/date-time string should handle whitespaces
 add 34be83e  [SPARK-31037][SQL][FOLLOW-UP] Replace legacy ReduceNumShufflePartitions with CoalesceShufflePartitions in comment

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala | 6 +++---
 .../apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala    | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (0f54dc7 -> 93def95)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 0f54dc7  [SPARK-30962][SQL][DOC] Documentation for Alter table command phase 2
 add 93def95  [SPARK-31095][BUILD] Upgrade netty-all to 4.1.47.Final

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2.7-hive-1.2 | 2 +-
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 2 +-
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 2 +-
 pom.xml                                 | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31095][BUILD] Upgrade netty-all to 4.1.47.Final
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new d1f5df4  [SPARK-31095][BUILD] Upgrade netty-all to 4.1.47.Final

d1f5df4 is described below

commit d1f5df40cb7687c5fd3145d3d629fb1069227638
Author: Dongjoon Hyun
AuthorDate: Tue Mar 10 17:50:34 2020 -0700

    [SPARK-31095][BUILD] Upgrade netty-all to 4.1.47.Final

    ### What changes were proposed in this pull request?

    This PR aims to bring the bug fixes from the latest netty-all.

    ### Why are the changes needed?

    - 4.1.47.Final: https://github.com/netty/netty/milestone/222?closed=1 (15 patches or issues)
    - 4.1.46.Final: https://github.com/netty/netty/milestone/221?closed=1 (80 patches or issues)
    - 4.1.45.Final: https://github.com/netty/netty/milestone/220?closed=1 (23 patches or issues)
    - 4.1.44.Final: https://github.com/netty/netty/milestone/218?closed=1 (113 patches or issues)
    - 4.1.43.Final: https://github.com/netty/netty/milestone/217?closed=1 (63 patches or issues)

    ### Does this PR introduce any user-facing change?

    No.

    ### How was this patch tested?

    Pass the Jenkins with the existing tests.

    Closes #27869 from dongjoon-hyun/SPARK-31095.

    Authored-by: Dongjoon Hyun
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 93def95b0801842e0288a77b3a97f84d31b57366)
    Signed-off-by: Dongjoon Hyun
---
 dev/deps/spark-deps-hadoop-2.7-hive-1.2 | 2 +-
 dev/deps/spark-deps-hadoop-2.7-hive-2.3 | 2 +-
 dev/deps/spark-deps-hadoop-3.2-hive-2.3 | 2 +-
 pom.xml                                 | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/dev/deps/spark-deps-hadoop-2.7-hive-1.2 b/dev/deps/spark-deps-hadoop-2.7-hive-1.2
index 828b1a6..39f7262 100644
--- a/dev/deps/spark-deps-hadoop-2.7-hive-1.2
+++ b/dev/deps/spark-deps-hadoop-2.7-hive-1.2
@@ -155,7 +155,7 @@ metrics-jmx/4.1.1//metrics-jmx-4.1.1.jar
 metrics-json/4.1.1//metrics-json-4.1.1.jar
 metrics-jvm/4.1.1//metrics-jvm-4.1.1.jar
 minlog/1.3.0//minlog-1.3.0.jar
-netty-all/4.1.42.Final//netty-all-4.1.42.Final.jar
+netty-all/4.1.47.Final//netty-all-4.1.47.Final.jar
 objenesis/2.5.1//objenesis-2.5.1.jar
 okhttp/3.12.6//okhttp-3.12.6.jar
 okio/1.15.0//okio-1.15.0.jar
diff --git a/dev/deps/spark-deps-hadoop-2.7-hive-2.3 b/dev/deps/spark-deps-hadoop-2.7-hive-2.3
index 8a65540..26ac30d 100644
--- a/dev/deps/spark-deps-hadoop-2.7-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-2.7-hive-2.3
@@ -170,7 +170,7 @@ metrics-jmx/4.1.1//metrics-jmx-4.1.1.jar
 metrics-json/4.1.1//metrics-json-4.1.1.jar
 metrics-jvm/4.1.1//metrics-jvm-4.1.1.jar
 minlog/1.3.0//minlog-1.3.0.jar
-netty-all/4.1.42.Final//netty-all-4.1.42.Final.jar
+netty-all/4.1.47.Final//netty-all-4.1.47.Final.jar
 objenesis/2.5.1//objenesis-2.5.1.jar
 okhttp/3.12.6//okhttp-3.12.6.jar
 okio/1.15.0//okio-1.15.0.jar
diff --git a/dev/deps/spark-deps-hadoop-3.2-hive-2.3 b/dev/deps/spark-deps-hadoop-3.2-hive-2.3
index 4dddbba..e908ec8 100644
--- a/dev/deps/spark-deps-hadoop-3.2-hive-2.3
+++ b/dev/deps/spark-deps-hadoop-3.2-hive-2.3
@@ -183,7 +183,7 @@ metrics-json/4.1.1//metrics-json-4.1.1.jar
 metrics-jvm/4.1.1//metrics-jvm-4.1.1.jar
 minlog/1.3.0//minlog-1.3.0.jar
 mssql-jdbc/6.2.1.jre7//mssql-jdbc-6.2.1.jre7.jar
-netty-all/4.1.42.Final//netty-all-4.1.42.Final.jar
+netty-all/4.1.47.Final//netty-all-4.1.47.Final.jar
 nimbus-jose-jwt/4.41.1//nimbus-jose-jwt-4.41.1.jar
 objenesis/2.5.1//objenesis-2.5.1.jar
 okhttp/2.7.5//okhttp-2.7.5.jar
diff --git a/pom.xml b/pom.xml
index 8a46197..262f3ac 100644
--- a/pom.xml
+++ b/pom.xml
@@ -698,7 +698,7 @@
       io.netty
       netty-all
-      4.1.42.Final
+      4.1.47.Final
     org.apache.derby

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (5be0d04 -> 8efb710)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 5be0d04  [SPARK-31117][SQL][TEST] reduce the test time of DateTimeUtilsSuite
 add 8efb710  [SPARK-31091] Revert SPARK-24640 Return `NULL` from `size(NULL)` by default

No new revisions were added by this update.

Summary of changes:
 docs/sql-migration-guide.md                                          | 2 --
 .../apache/spark/sql/catalyst/expressions/collectionOperations.scala | 4 ++--
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala       | 2 +-
 3 files changed, 3 insertions(+), 5 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31091] Revert SPARK-24640 Return `NULL` from `size(NULL)` by default
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new c1e6e14  [SPARK-31091] Revert SPARK-24640 Return `NULL` from `size(NULL)` by default

c1e6e14 is described below

commit c1e6e1439d1a79560f197e2627a334fcd0bb8a28
Author: Wenchen Fan
AuthorDate: Wed Mar 11 09:55:24 2020 -0700

    [SPARK-31091] Revert SPARK-24640 Return `NULL` from `size(NULL)` by default

    ### What changes were proposed in this pull request?

    This PR reverts https://github.com/apache/spark/pull/26051 and https://github.com/apache/spark/pull/26066

    ### Why are the changes needed?

    There is no standard requiring that `size(null)` must return null, and returning -1 looks reasonable as well. This is kind of a cosmetic change and we should avoid it if it breaks existing queries. This is similar to reverting the TRIM function parameter order change.

    ### Does this PR introduce any user-facing change?

    Yes, it changes the behavior of `size(null)` back to be the same as 2.4.

    ### How was this patch tested?

    N/A

    Closes #27834 from cloud-fan/revert.

    Authored-by: Wenchen Fan
    Signed-off-by: Dongjoon Hyun
    (cherry picked from commit 8efb71013d0c9e8d81430aa48f88b91929425bff)
    Signed-off-by: Dongjoon Hyun
---
 docs/sql-migration-guide.md                                          | 2 --
 .../apache/spark/sql/catalyst/expressions/collectionOperations.scala | 4 ++--
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala       | 2 +-
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 6c73038..e7ac9f0 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -214,8 +214,6 @@ license: |
   - `now` - current query start time For example `SELECT timestamp 'tomorrow';`.

-  - Since Spark 3.0, the `size` function returns `NULL` for the `NULL` input. In Spark version 2.4 and earlier, this function gives `-1` for the same input. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.sizeOfNull` to `true`.
-
   - Since Spark 3.0, when the `array`/`map` function is called without any parameters, it returns an empty collection with `NullType` as element type. In Spark version 2.4 and earlier, it returns an empty collection with `StringType` as element type. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.createEmptyCollectionUsingStringType` to `true`.

   - Since Spark 3.0, the interval literal syntax does not allow multiple from-to units anymore. For example, `SELECT INTERVAL '1-1' YEAR TO MONTH '2-2' YEAR TO MONTH'` throws parser exception.
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
index cfa877b..6d95909 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
@@ -79,7 +79,7 @@ trait BinaryArrayExpressionWithImplicitCast extends BinaryExpression
     _FUNC_(expr) - Returns the size of an array or a map. The function returns -1 if its input is null and spark.sql.legacy.sizeOfNull is set to true. If spark.sql.legacy.sizeOfNull is set to false, the function returns null for null input.
-    By default, the spark.sql.legacy.sizeOfNull parameter is set to false.
+    By default, the spark.sql.legacy.sizeOfNull parameter is set to true.
   """,
   examples = """
     Examples:
@@ -88,7 +88,7 @@ trait BinaryArrayExpressionWithImplicitCast extends BinaryExpression
       > SELECT _FUNC_(map('a', 1, 'b', 2));
        2
       > SELECT _FUNC_(NULL);
-       NULL
+       -1
   """)
 case class Size(child: Expression, legacySizeOfNull: Boolean) extends UnaryExpression with ExpectsInputTypes {
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index fdaf0ec..644fe89 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -1942,7 +1942,7 @@ object SQLConf {
     .doc("If it is set to true, size of null returns -1. This behavior was inherited from Hive. " +
       "The size function returns null for null input if the flag is disabled.")
     .booleanConf
-    .createWithDefault(false)
+    .createWithDefault(true)
[spark] branch branch-2.4 updated (f378c7f -> 8e1021d)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

from f378c7f  [SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields
 add 8e1021d  [SPARK-31095][BUILD][2.4] Upgrade netty-all to 4.1.47.Final

No new revisions were added by this update.

Summary of changes:
 dev/deps/spark-deps-hadoop-2.6 | 2 +-
 dev/deps/spark-deps-hadoop-2.7 | 2 +-
 dev/deps/spark-deps-hadoop-3.1 | 2 +-
 pom.xml                        | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (2825237 -> 0f0ccda)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 2825237  [SPARK-31062][K8S][TESTS] Improve spark decommissioning k8s test reliability
 add 0f0ccda  [SPARK-31110][DOCS][SQL] refine sql doc for SELECT

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-syntax-qry-select-clusterby.md     | 18
 docs/sql-ref-syntax-qry-select-distribute-by.md | 18
 docs/sql-ref-syntax-qry-select-groupby.md       | 48 ++---
 docs/sql-ref-syntax-qry-select-having.md        | 12 +++---
 docs/sql-ref-syntax-qry-select-limit.md         | 23 +++
 docs/sql-ref-syntax-qry-select-orderby.md       | 24 +--
 docs/sql-ref-syntax-qry-select-sortby.md        | 28 ++---
 docs/sql-ref-syntax-qry-select-where.md         | 10 ++---
 docs/sql-ref-syntax-qry-select.md               | 55 ++---
 docs/sql-ref-syntax-qry.md                      |  8 ++--
 10 files changed, 126 insertions(+), 118 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31110][DOCS][SQL] refine sql doc for SELECT
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new ffcc4a2 [SPARK-31110][DOCS][SQL] refine sql doc for SELECT ffcc4a2 is described below commit ffcc4a27041abe97991f4bd14d0b5abf3c50a542 Author: Wenchen Fan AuthorDate: Wed Mar 11 16:52:40 2020 -0700 [SPARK-31110][DOCS][SQL] refine sql doc for SELECT ### What changes were proposed in this pull request? A few improvements to the sql ref SELECT doc: 1. correct the syntax of SELECT query 2. correct the default of null sort order 3. correct the GROUP BY syntax 4. several minor fixes ### Why are the changes needed? refine document ### Does this PR introduce any user-facing change? N/A ### How was this patch tested? N/A Closes #27866 from cloud-fan/doc. Authored-by: Wenchen Fan Signed-off-by: Dongjoon Hyun (cherry picked from commit 0f0ccdadb123d5839c34244e25a4ee17dde0fcdc) Signed-off-by: Dongjoon Hyun --- docs/sql-ref-syntax-qry-select-clusterby.md | 18 docs/sql-ref-syntax-qry-select-distribute-by.md | 18 docs/sql-ref-syntax-qry-select-groupby.md | 48 ++--- docs/sql-ref-syntax-qry-select-having.md| 12 +++--- docs/sql-ref-syntax-qry-select-limit.md | 23 +++ docs/sql-ref-syntax-qry-select-orderby.md | 24 +-- docs/sql-ref-syntax-qry-select-sortby.md| 28 ++--- docs/sql-ref-syntax-qry-select-where.md | 10 ++--- docs/sql-ref-syntax-qry-select.md | 55 ++--- docs/sql-ref-syntax-qry.md | 8 ++-- 10 files changed, 126 insertions(+), 118 deletions(-) diff --git a/docs/sql-ref-syntax-qry-select-clusterby.md b/docs/sql-ref-syntax-qry-select-clusterby.md index bb60e8b..8f1dc59 100644 --- a/docs/sql-ref-syntax-qry-select-clusterby.md +++ b/docs/sql-ref-syntax-qry-select-clusterby.md @@ -9,9 +9,9 @@ license: | The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in 
compliance with the License. You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. @@ -41,20 +41,20 @@ CLUSTER BY { expression [ , ... ] } ### Examples {% highlight sql %} CREATE TABLE person (name STRING, age INT); -INSERT INTO person VALUES -('Zen Hui', 25), -('Anil B', 18), -('Shone S', 16), +INSERT INTO person VALUES +('Zen Hui', 25), +('Anil B', 18), +('Shone S', 16), ('Mike A', 25), -('John A', 18), +('John A', 18), ('Jack N', 16); -- Reduce the number of shuffle partitions to 2 to illustrate the behavior of `CLUSTER BY`. -- It's easier to see the clustering and sorting behavior with less number of partitions. SET spark.sql.shuffle.partitions = 2; - + -- Select the rows with no ordering. Please note that without any sort directive, the results --- of the query is not deterministic. It's included here to show the difference in behavior +-- of the query is not deterministic. It's included here to show the difference in behavior -- of a query when `CLUSTER BY` is not used vs when it's used. The query below produces rows -- where age column is not sorted. SELECT age, name FROM person; diff --git a/docs/sql-ref-syntax-qry-select-distribute-by.md b/docs/sql-ref-syntax-qry-select-distribute-by.md index 5ade9c1..957df9c 100644 --- a/docs/sql-ref-syntax-qry-select-distribute-by.md +++ b/docs/sql-ref-syntax-qry-select-distribute-by.md @@ -9,9 +9,9 @@ license: | The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. 
You may obtain a copy of the License at - + http://www.apache.org/licenses/LICENSE-2.0 - + Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. @@ -20,7 +20,7 @@ license: | --- The DISTRIBUTE BY clause is used to repartition the data based on the input expressions. Unlike the [CLUSTER BY](sql-ref-syntax-qry-select-clusterby.html) -clause, this does not sort the data within each partition. +clause, this does not sort the data within each partition.
[spark] branch branch-3.0 updated: [SPARK-31126][SS] Upgrade Kafka to 2.4.1
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new b86dc6a [SPARK-31126][SS] Upgrade Kafka to 2.4.1 b86dc6a is described below commit b86dc6ab76d945e5c15c12390436ad95c119a493 Author: Dongjoon Hyun AuthorDate: Wed Mar 11 19:26:15 2020 -0700 [SPARK-31126][SS] Upgrade Kafka to 2.4.1 ### What changes were proposed in this pull request? This PR (SPARK-31126) aims to upgrade Kafka library to bring a client-side bug fix like KAFKA-8933 ### Why are the changes needed? The following is the full release note. - https://downloads.apache.org/kafka/2.4.1/RELEASE_NOTES.html ### Does this PR introduce any user-facing change? No ### How was this patch tested? Pass the Jenkins with the existing test. Closes #27881 from dongjoon-hyun/SPARK-KAFKA-2.4.1. Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun (cherry picked from commit 614323d326db192540c955b4fa9b3b7af7527001) Signed-off-by: Dongjoon Hyun --- pom.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pom.xml b/pom.xml index 262f3ac..5aa100e 100644 --- a/pom.xml +++ b/pom.xml @@ -132,7 +132,7 @@ 2.3 -2.4.0 +2.4.1 10.12.1.1 1.10.1 1.5.9 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (bd2b3f9 -> 614323d)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from bd2b3f9 [SPARK-30911][CORE][DOC] Add version information to the configuration of Status add 614323d [SPARK-31126][SS] Upgrade Kafka to 2.4.1 No new revisions were added by this update. Summary of changes: pom.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31011][CORE] Log better message if SIGPWR is not supported while setting up decommission
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 3946b24 [SPARK-31011][CORE] Log better message if SIGPWR is not supported while setting up decommission 3946b24 is described below commit 3946b243284fbd3bd98b456115ae194ad49fe8fe Author: Jungtaek Lim (HeartSaVioR) AuthorDate: Wed Mar 11 20:27:00 2020 -0700 [SPARK-31011][CORE] Log better message if SIGPWR is not supported while setting up decommission ### What changes were proposed in this pull request? This patch logs a better, decommission-specific message when registering the signal handler for SIGPWR fails. SIGPWR is non-POSIX and not all Unix-like OSes support it; macOS is an easy example. ### Why are the changes needed? Spark already logs a message when it fails to register a handler for SIGPWR, but the message is too general and does not convey the impact. End users should be notified that failing to register a handler for SIGPWR effectively "disables" the decommission feature. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Manually tested by running a standalone master/worker on macOS 10.14.6 with `spark.worker.decommission.enabled=true` and submitting an example application to run executors. (NOTE: the message may differ slightly, as it can be updated during the review phase.) For worker log: ``` 20/03/06 17:19:13 INFO Worker: Registering SIGPWR handler to trigger decommissioning. 20/03/06 17:19:13 INFO SignalUtils: Registering signal handler for PWR 20/03/06 17:19:13 WARN SignalUtils: Failed to register SIGPWR - disabling worker decommission. 
java.lang.IllegalArgumentException: Unknown signal: PWR at java.base/jdk.internal.misc.Signal.(Signal.java:148) at jdk.unsupported/sun.misc.Signal.(Signal.java:139) at org.apache.spark.util.SignalUtils$.$anonfun$registerSignal$1(SignalUtils.scala:95) at scala.collection.mutable.HashMap.getOrElseUpdate(HashMap.scala:86) at org.apache.spark.util.SignalUtils$.registerSignal(SignalUtils.scala:93) at org.apache.spark.util.SignalUtils$.register(SignalUtils.scala:81) at org.apache.spark.deploy.worker.Worker.(Worker.scala:73) at org.apache.spark.deploy.worker.Worker$.startRpcEnvAndEndpoint(Worker.scala:887) at org.apache.spark.deploy.worker.Worker$.main(Worker.scala:855) at org.apache.spark.deploy.worker.Worker.main(Worker.scala) ``` For executor: ``` 20/03/06 17:21:52 INFO CoarseGrainedExecutorBackend: Registering PWR handler. 20/03/06 17:21:52 INFO SignalUtils: Registering signal handler for PWR 20/03/06 17:21:52 WARN SignalUtils: Failed to register SIGPWR - disabling decommission feature. java.lang.IllegalArgumentException: Unknown signal: PWR at java.base/jdk.internal.misc.Signal.(Signal.java:148) at jdk.unsupported/sun.misc.Signal.(Signal.java:139) at org.apache.spark.util.SignalUtils$.$anonfun$registerSignal$1(SignalUtils.scala:95) at scala.collection.mutable.HashMap.getOrElseUpdate(HashMap.scala:86) at org.apache.spark.util.SignalUtils$.registerSignal(SignalUtils.scala:93) at org.apache.spark.util.SignalUtils$.register(SignalUtils.scala:81) at org.apache.spark.executor.CoarseGrainedExecutorBackend.onStart(CoarseGrainedExecutorBackend.scala:86) at org.apache.spark.rpc.netty.Inbox.$anonfun$process$1(Inbox.scala:120) at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:203) at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100) at org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:75) at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:41) at 
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:834) ``` Closes #27832 from HeartSaVioR/SPARK-31011. Authored-by: Jungtaek Lim (HeartSaVioR) Signed-off-by: Dongjoon Hyun --- .../org/apache/spark/deploy/worker/Worker.scala| 3 +- .../executor/CoarseGrainedExecutorBackend.scala| 3 +- .../scala/org/apache/spark/util/Si
[spark] branch branch-2.4 updated: [SPARK-29295][SQL][2.4] Insert overwrite to Hive external table partition should delete old data
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new c017422 [SPARK-29295][SQL][2.4] Insert overwrite to Hive external table partition should delete old data c017422 is described below commit c017422c6582121075738746cf9c7ae2257c658d Author: Liang-Chi Hsieh AuthorDate: Thu Mar 12 03:00:35 2020 -0700 [SPARK-29295][SQL][2.4] Insert overwrite to Hive external table partition should delete old data ### What changes were proposed in this pull request? This patch proposes to delete the old Hive external partition directory, even if the partition does not exist in Hive, when insert overwriting a Hive external table partition. This is a backport of #25979 to branch-2.4. ### Why are the changes needed? When insert overwriting a Hive external table partition, if the partition does not exist, Hive will not check whether the external partition directory exists before copying files. So if users drop the partition and then insert overwrite the same partition, the partition will contain both old and new data. For example: ```scala withSQLConf(HiveUtils.CONVERT_METASTORE_PARQUET.key -> "false") { // test is an external Hive table. sql("INSERT OVERWRITE TABLE test PARTITION(name='n1') SELECT 1") sql("ALTER TABLE test DROP PARTITION(name='n1')") sql("INSERT OVERWRITE TABLE test PARTITION(name='n1') SELECT 2") sql("SELECT id FROM test WHERE name = 'n1' ORDER BY id") // Got both 1 and 2. } ``` ### Does this PR introduce any user-facing change? Yes. This fixes a correctness issue when users drop a partition of a Hive external table and then insert overwrite it. ### How was this patch tested? Added a test. Closes #27887 from viirya/SPARK-29295-2.4. 
Authored-by: Liang-Chi Hsieh Signed-off-by: Dongjoon Hyun --- .../sql/hive/execution/InsertIntoHiveTable.scala | 68 +++--- .../spark/sql/hive/execution/SQLQuerySuite.scala | 80 ++ 2 files changed, 139 insertions(+), 9 deletions(-) diff --git a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala index 0ed464d..1365737 100644 --- a/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala +++ b/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala @@ -24,7 +24,7 @@ import org.apache.hadoop.hive.ql.plan.TableDesc import org.apache.spark.SparkException import org.apache.spark.sql.{AnalysisException, Row, SparkSession} -import org.apache.spark.sql.catalyst.catalog.{CatalogTable, ExternalCatalog} +import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTableType, ExternalCatalog, ExternalCatalogUtils} import org.apache.spark.sql.catalyst.expressions.Attribute import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan import org.apache.spark.sql.execution.SparkPlan @@ -192,7 +192,7 @@ case class InsertIntoHiveTable( }.asInstanceOf[Attribute] } -saveAsHiveFile( +val writtenParts = saveAsHiveFile( sparkSession = sparkSession, plan = child, hadoopConf = hadoopConf, @@ -202,6 +202,42 @@ case class InsertIntoHiveTable( if (partition.nonEmpty) { if (numDynamicPartitions > 0) { +if (overwrite && table.tableType == CatalogTableType.EXTERNAL) { + // SPARK-29295: When insert overwrite to a Hive external table partition, if the + // partition does not exist, Hive will not check if the external partition directory + // exists or not before copying files. So if users drop the partition, and then do + // insert overwrite to the same partition, the partition will have both old and new + // data. We construct partition path. If the path exists, we delete it manually. 
+ writtenParts.foreach { partPath => +val dpMap = partPath.split("/").map { part => + val splitPart = part.split("=") + assert(splitPart.size == 2, s"Invalid written partition path: $part") + ExternalCatalogUtils.unescapePathName(splitPart(0)) -> +ExternalCatalogUtils.unescapePathName(splitPart(1)) +}.toMap + +val updatedPartitionSpec = partition.map { + case (key, Some(value)) => key -> value + case (key, None) if dpMap.contains(key) => key -> dpMap(key) + case (key, _) => +throw new Spa
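The core of the fix in the diff above is reconstructing a partition spec from each written dynamic-partition path so the old directory can be located and deleted. A hedged, self-contained sketch of that idea — not the actual Spark code; `PartitionPathSketch` is an illustrative name, and `URLDecoder` stands in for `ExternalCatalogUtils.unescapePathName` (an approximation: Hive's escaping is %XX-based but not identical to URL encoding):

```scala
import java.net.URLDecoder

// Hedged sketch of the idea in the patch above, not the actual Spark code:
// turn a written dynamic-partition path such as "country=US/city=San%20Jose"
// into a partition spec (column -> value) so the corresponding old directory
// can be found and removed before the new data is committed.
object PartitionPathSketch {
  def toSpec(partPath: String): Map[String, String] =
    partPath.split("/").map { part =>
      val kv = part.split("=")
      require(kv.length == 2, s"Invalid written partition path: $part")
      // URLDecoder approximates ExternalCatalogUtils.unescapePathName.
      URLDecoder.decode(kv(0), "UTF-8") -> URLDecoder.decode(kv(1), "UTF-8")
    }.toMap
}
```

In the real patch this spec is then merged with the statically specified partition columns before the old directory is deleted.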
[spark] branch master updated (77c49cb -> 972e23d)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 77c49cb [SPARK-31124][SQL] change the default value of minPartitionNum in AQE add 972e23d [SPARK-31130][BUILD] Use the same version of `commons-io` in SBT No new revisions were added by this update. Summary of changes: project/SparkBuild.scala | 1 + 1 file changed, 1 insertion(+) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31130][BUILD] Use the same version of `commons-io` in SBT
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 74cb509 [SPARK-31130][BUILD] Use the same version of `commons-io` in SBT 74cb509 is described below commit 74cb5094ec00c359bb70a456d6490f45bdd5ccd7 Author: Dongjoon Hyun AuthorDate: Thu Mar 12 09:06:29 2020 -0700 [SPARK-31130][BUILD] Use the same version of `commons-io` in SBT ### What changes were proposed in this pull request? This PR (SPARK-31130) aims to pin the `Commons IO` version to `2.4` in the SBT build, like the Maven build. ### Why are the changes needed? [HADOOP-15261](https://issues.apache.org/jira/browse/HADOOP-15261) upgraded `commons-io` from 2.4 to 2.5 at Apache Hadoop 3.1. In `Maven`, Apache Spark always uses `Commons IO 2.4` based on `pom.xml`. ``` $ git grep commons-io.version pom.xml:2.4 pom.xml:${commons-io.version} ``` However, `SBT` chooses `2.5`. **branch-3.0** ``` $ build/sbt -Phadoop-3.2 "core/dependencyTree" | grep commons-io:commons-io | head -n1 [info] | | +-commons-io:commons-io:2.5 ``` **branch-2.4** ``` $ build/sbt -Phadoop-3.1 "core/dependencyTree" | grep commons-io:commons-io | head -n1 [info] | | +-commons-io:commons-io:2.5 ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Pass the Jenkins with `[test-hadoop3.2]` (the default PR Builder is `SBT`) and manually do the following locally. ``` build/sbt -Phadoop-3.2 "core/dependencyTree" | grep commons-io:commons-io | head -n1 ``` Closes #27886 from dongjoon-hyun/SPARK-31130. 
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun (cherry picked from commit 972e23d18186c73026ebed95b37a886ca6eecf3e) Signed-off-by: Dongjoon Hyun --- project/SparkBuild.scala | 1 + 1 file changed, 1 insertion(+) diff --git a/project/SparkBuild.scala b/project/SparkBuild.scala index b606bdd..1a2a7c3 100644 --- a/project/SparkBuild.scala +++ b/project/SparkBuild.scala @@ -621,6 +621,7 @@ object KubernetesIntegrationTests { object DependencyOverrides { lazy val settings = Seq( dependencyOverrides += "com.google.guava" % "guava" % "14.0.1", +dependencyOverrides += "commons-io" % "commons-io" % "2.4", dependencyOverrides += "xerces" % "xercesImpl" % "2.12.0", dependencyOverrides += "jline" % "jline" % "2.14.6") } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-31130][BUILD] Use the same version of `commons-io` in SBT
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new e6bcaaa [SPARK-31130][BUILD] Use the same version of `commons-io` in SBT e6bcaaa is described below commit e6bcaaa8a78f6c88b5c5276e90cd049e2cffc658 Author: Dongjoon Hyun AuthorDate: Thu Mar 12 09:06:29 2020 -0700 [SPARK-31130][BUILD] Use the same version of `commons-io` in SBT This PR (SPARK-31130) aims to pin the `Commons IO` version to `2.4` in the SBT build, like the Maven build. [HADOOP-15261](https://issues.apache.org/jira/browse/HADOOP-15261) upgraded `commons-io` from 2.4 to 2.5 at Apache Hadoop 3.1. In `Maven`, Apache Spark always uses `Commons IO 2.4` based on `pom.xml`. ``` $ git grep commons-io.version pom.xml:2.4 pom.xml:${commons-io.version} ``` However, `SBT` chooses `2.5`. **branch-3.0** ``` $ build/sbt -Phadoop-3.2 "core/dependencyTree" | grep commons-io:commons-io | head -n1 [info] | | +-commons-io:commons-io:2.5 ``` **branch-2.4** ``` $ build/sbt -Phadoop-3.1 "core/dependencyTree" | grep commons-io:commons-io | head -n1 [info] | | +-commons-io:commons-io:2.5 ``` No. Pass the Jenkins with `[test-hadoop3.2]` (the default PR Builder is `SBT`) and manually do the following locally. ``` build/sbt -Phadoop-3.2 "core/dependencyTree" | grep commons-io:commons-io | head -n1 ``` Closes #27886 from dongjoon-hyun/SPARK-31130. 
Authored-by: Dongjoon Hyun Signed-off-by: Dongjoon Hyun (cherry picked from commit 972e23d18186c73026ebed95b37a886ca6eecf3e) Signed-off-by: Dongjoon Hyun --- project/SparkBuild.scala | 1 + 1 file changed, 1 insertion(+) diff --git a/project/SparkBuild.scala b/project/SparkBuild.scala index 3f85ac6..7ee079c 100644 --- a/project/SparkBuild.scala +++ b/project/SparkBuild.scala @@ -552,6 +552,7 @@ object DockerIntegrationTests { object DependencyOverrides { lazy val settings = Seq( dependencyOverrides += "com.google.guava" % "guava" % "14.0.1", +dependencyOverrides += "commons-io" % "commons-io" % "2.4", dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.6.7.3", dependencyOverrides += "jline" % "jline" % "2.14.6") } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (972e23d -> 7b4b29e8)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 972e23d [SPARK-31130][BUILD] Use the same version of `commons-io` in SBT add 7b4b29e8 [SPARK-31131][SQL] Remove the unnecessary config spark.sql.legacy.timeParser.enabled No new revisions were added by this update. Summary of changes: docs/sql-migration-guide.md | 2 +- .../main/scala/org/apache/spark/sql/internal/SQLConf.scala| 11 +-- 2 files changed, 2 insertions(+), 11 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31131][SQL] Remove the unnecessary config spark.sql.legacy.timeParser.enabled
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 4bcba6f [SPARK-31131][SQL] Remove the unnecessary config spark.sql.legacy.timeParser.enabled 4bcba6f is described below commit 4bcba6fa61e4edac3a616403af35d7e2b093fed3 Author: Kent Yao AuthorDate: Thu Mar 12 09:24:49 2020 -0700 [SPARK-31131][SQL] Remove the unnecessary config spark.sql.legacy.timeParser.enabled spark.sql.legacy.timeParser.enabled should be removed from SQLConf and the migration guide spark.sql.legacy.timeParsePolicy is the right one fix doc no Pass the jenkins Closes #27889 from yaooqinn/SPARK-31131. Authored-by: Kent Yao Signed-off-by: Dongjoon Hyun (cherry picked from commit 7b4b29e8d955b43daa9ad28667e4fadbb9fce49a) Signed-off-by: Dongjoon Hyun --- docs/sql-migration-guide.md | 2 +- .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala| 8 2 files changed, 1 insertion(+), 9 deletions(-) diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md index e7ac9f0..1081079 100644 --- a/docs/sql-migration-guide.md +++ b/docs/sql-migration-guide.md @@ -67,7 +67,7 @@ license: | - Since Spark 3.0, Proleptic Gregorian calendar is used in parsing, formatting, and converting dates and timestamps as well as in extracting sub-components like years, days and etc. Spark 3.0 uses Java 8 API classes from the java.time packages that based on ISO chronology (https://docs.oracle.com/javase/8/docs/api/java/time/chrono/IsoChronology.html). In Spark version 2.4 and earlier, those operations are performed by using the hybrid calendar (Julian + Gregorian, see https://docs.orac [...] -- Parsing/formatting of timestamp/date strings. 
This effects on CSV/JSON datasources and on the `unix_timestamp`, `date_format`, `to_unix_timestamp`, `from_unixtime`, `to_date`, `to_timestamp` functions when patterns specified by users is used for parsing and formatting. Since Spark 3.0, the conversions are based on `java.time.format.DateTimeFormatter`, see https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. New implementation performs strict checking o [...] +- Parsing/formatting of timestamp/date strings. This effects on CSV/JSON datasources and on the `unix_timestamp`, `date_format`, `to_unix_timestamp`, `from_unixtime`, `to_date`, `to_timestamp` functions when patterns specified by users is used for parsing and formatting. Since Spark 3.0, we define our own pattern strings in `sql-ref-datetime-pattern.md`, which is implemented via `java.time.format.DateTimeFormatter` under the hood. New implementation performs strict checking of its in [...] - The `weekofyear`, `weekday`, `dayofweek`, `date_trunc`, `from_utc_timestamp`, `to_utc_timestamp`, and `unix_timestamp` functions use java.time API for calculation week number of year, day number of week as well for conversion from/to TimestampType values in UTC time zone. diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala index 06180f6..ba25a68 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala @@ -2234,14 +2234,6 @@ object SQLConf { .checkValue(_ > 0, "The value of spark.sql.addPartitionInBatch.size must be positive") .createWithDefault(100) - val LEGACY_TIME_PARSER_ENABLED = buildConf("spark.sql.legacy.timeParser.enabled") -.internal() -.doc("When set to true, java.text.SimpleDateFormat is used for formatting and parsing " + - "dates/timestamps in a locale-sensitive manner. 
When set to false, classes from " + - "java.time.* packages are used for the same purpose.") -.booleanConf -.createWithDefault(false) - val LEGACY_ALLOW_HASH_ON_MAPTYPE = buildConf("spark.sql.legacy.allowHashOnMapType") .doc("When set to true, hash expressions can be applied on elements of MapType. Otherwise, " + "an analysis exception will be thrown.") - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
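The removed `spark.sql.legacy.timeParser.enabled` flag and the surviving `spark.sql.legacy.timeParserPolicy` both hinge on the same behavioral difference: `java.text.SimpleDateFormat` is lenient by default (an out-of-range day-of-month silently rolls over into the next month), while the `java.time` formatters reject such input. A minimal Python sketch that mimics the two behaviors — the roll-over arithmetic and the strict parse are stand-ins for illustration, not Spark code:

```python
from datetime import datetime, timedelta

def parse_lenient(s: str) -> datetime:
    # Mimics SimpleDateFormat's default leniency: "2020-02-30"
    # silently rolls over into March instead of failing.
    y, m, d = map(int, s.split("-"))
    return datetime(y, m, 1) + timedelta(days=d - 1)

def parse_strict(s: str) -> datetime:
    # Mimics java.time.format.DateTimeFormatter's strict resolution:
    # an invalid calendar date raises instead of being adjusted.
    return datetime.strptime(s, "%Y-%m-%d")

print(parse_lenient("2020-02-30"))   # rolls over to 2020-03-01
try:
    parse_strict("2020-02-30")
except ValueError as e:
    print("rejected:", e)
```

This is the upgrade path the migration guide describes: queries that relied on the lenient roll-over now fail fast unless the legacy policy is enabled.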
[spark] branch master updated (7b4b29e8 -> fbc9dc7)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 7b4b29e8 [SPARK-31131][SQL] Remove the unnecessary config spark.sql.legacy.timeParser.enabled add fbc9dc7 [SPARK-31129][SQL][TESTS] Fix IntervalBenchmark and DateTimeBenchmark No new revisions were added by this update. Summary of changes: .../benchmarks/DateTimeBenchmark-jdk11-results.txt | 434 ++--- sql/core/benchmarks/DateTimeBenchmark-results.txt | 434 ++--- .../benchmarks/IntervalBenchmark-jdk11-results.txt | 52 +-- sql/core/benchmarks/IntervalBenchmark-results.txt | 52 +-- .../execution/benchmark/DateTimeBenchmark.scala| 4 +- .../execution/benchmark/IntervalBenchmark.scala| 4 +- 6 files changed, 491 insertions(+), 489 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31129][SQL][TESTS] Fix IntervalBenchmark and DateTimeBenchmark
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new fd56924 [SPARK-31129][SQL][TESTS] Fix IntervalBenchmark and DateTimeBenchmark fd56924 is described below commit fd5692477ca9ba3407a350caba01a6f192d521b2 Author: Kent Yao AuthorDate: Thu Mar 12 12:59:29 2020 -0700 [SPARK-31129][SQL][TESTS] Fix IntervalBenchmark and DateTimeBenchmark ### What changes were proposed in this pull request? This PR aims to recover `IntervalBenchmark` and `DataTimeBenchmark` due to banning intervals as output. ### Why are the changes needed? This PR recovers the benchmark suite. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Manually, re-run the benchmark. Closes #27885 from yaooqinn/SPARK-3-2. Authored-by: Kent Yao Signed-off-by: Dongjoon Hyun (cherry picked from commit fbc9dc7e9dcde8a77673b1782f4f1141e183ff00) Signed-off-by: Dongjoon Hyun --- .../benchmarks/DateTimeBenchmark-jdk11-results.txt | 434 ++--- sql/core/benchmarks/DateTimeBenchmark-results.txt | 434 ++--- .../benchmarks/IntervalBenchmark-jdk11-results.txt | 52 +-- sql/core/benchmarks/IntervalBenchmark-results.txt | 52 +-- .../execution/benchmark/DateTimeBenchmark.scala| 4 +- .../execution/benchmark/IntervalBenchmark.scala| 4 +- 6 files changed, 491 insertions(+), 489 deletions(-) diff --git a/sql/core/benchmarks/DateTimeBenchmark-jdk11-results.txt b/sql/core/benchmarks/DateTimeBenchmark-jdk11-results.txt index 7d9b147..883f9de 100644 --- a/sql/core/benchmarks/DateTimeBenchmark-jdk11-results.txt +++ b/sql/core/benchmarks/DateTimeBenchmark-jdk11-results.txt @@ -2,428 +2,428 @@ Extract components -OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1044-aws -Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz +Java HotSpot(TM) 64-Bit Server VM 11.0.5+10-LTS on Mac OS X 10.15.3 
+Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz cast to timestamp:Best Time(ms) Avg Time(ms) Stdev(ms)Rate(M/s) Per Row(ns) Relative -cast to timestamp wholestage off408445 53 24.5 40.8 1.0X -cast to timestamp wholestage on 401453 63 24.9 40.1 1.0X +cast to timestamp wholestage off221232 16 45.3 22.1 1.0X +cast to timestamp wholestage on 213256 71 46.9 21.3 1.0X -OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1044-aws -Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz +Java HotSpot(TM) 64-Bit Server VM 11.0.5+10-LTS on Mac OS X 10.15.3 +Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz year of timestamp:Best Time(ms) Avg Time(ms) Stdev(ms)Rate(M/s) Per Row(ns) Relative -year of timestamp wholestage off 1197 1246 69 8.4 119.7 1.0X -year of timestamp wholestage on 1123 10 9.0 111.1 1.1X +year of timestamp wholestage off863961 139 11.6 86.3 1.0X +year of timestamp wholestage on 783821 26 12.8 78.3 1.1X -OpenJDK 64-Bit Server VM 11.0.6+10-post-Ubuntu-1ubuntu118.04.1 on Linux 4.15.0-1044-aws -Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz +Java HotSpot(TM) 64-Bit Server VM 11.0.5+10-LTS on Mac OS X 10.15.3 +Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz quarter of timestamp: Best Time(ms) Avg Time(ms) Stdev(ms)Rate(M/s) Per Row(ns) Relative -quarter of timestamp wholestage off1451 1462 16 6.9 145.1 1.0X -quarter of timestamp wholestage on 1409 1424 13 7.1 140.9 1.0X +quarter of timestamp wholestage off1008 1013 7 9.9 100.8 1.0X +quarter of timestamp wholestage on 926963 36
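The benchmark result columns are all derived from one measurement. Assuming the case's row cardinality — 10 million rows here, which is an assumption about this `DateTimeBenchmark` run, not stated in the table — they relate as in this sketch:

```python
ROWS = 10_000_000  # assumed cardinality for these benchmark cases

def rate_m_per_s(best_ms: float, rows: int = ROWS) -> float:
    # Rate(M/s): millions of rows processed per second at the best time.
    return rows / (best_ms / 1000.0) / 1e6

def per_row_ns(best_ms: float, rows: int = ROWS) -> float:
    # Per Row(ns): nanoseconds spent per row at the best time.
    return best_ms * 1e6 / rows

def relative(baseline_best_ms: float, best_ms: float) -> float:
    # Relative: speedup versus the table's baseline case.
    return baseline_best_ms / best_ms

# "cast to timestamp wholestage off" best time of 408 ms from the old results:
print(round(rate_m_per_s(408), 1), round(per_row_ns(408), 1))  # 24.5 40.8
```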
[spark] branch branch-2.4 updated (e6bcaaa -> 51ccb6f)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from e6bcaaa [SPARK-31130][BUILD] Use the same version of `commons-io` in SBT add 51ccb6f [SPARK-31144][SQL][2.4] Wrap Error with QueryExecutionException to notify QueryExecutionListener No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/DataFrameWriter.scala | 2 +- .../main/scala/org/apache/spark/sql/Dataset.scala | 2 +- .../spark/sql/util/QueryExecutionListener.scala| 15 -- .../spark/sql/util/DataFrameCallbackSuite.scala| 34 -- 4 files changed, 46 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (2a4fed0 -> 1ddf44d)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 2a4fed0 [SPARK-30654][WEBUI] Bootstrap4 WebUI upgrade add 1ddf44d [SPARK-31144][SQL] Wrap Error with QueryExecutionException to notify QueryExecutionListener No new revisions were added by this update. Summary of changes: project/MimaExcludes.scala | 4 -- .../spark/sql/util/QueryExecutionListener.scala| 18 +-- .../apache/spark/sql/DataFrameWriterV2Suite.scala | 2 +- .../org/apache/spark/sql/SessionStateSuite.scala | 2 +- .../spark/sql/TestQueryExecutionListener.scala | 2 +- .../test/scala/org/apache/spark/sql/UDFSuite.scala | 2 +- .../sql/connector/DataSourceV2DataFrameSuite.scala | 2 +- .../connector/FileDataSourceV2FallBackSuite.scala | 6 +-- .../connector/SupportsCatalogOptionsSuite.scala| 2 +- .../sql/test/DataFrameReaderWriterSuite.scala | 2 +- .../spark/sql/util/DataFrameCallbackSuite.scala| 62 -- .../sql/util/ExecutionListenerManagerSuite.scala | 2 +- .../sql/hive/thriftserver/DummyListeners.scala | 2 +- 13 files changed, 71 insertions(+), 37 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-31144][SQL] Wrap Error with QueryExecutionException to notify QueryExecutionListener
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 339e4dd [SPARK-31144][SQL] Wrap Error with QueryExecutionException to notify QueryExecutionListener 339e4dd is described below commit 339e4dd3a3daf6c11670e5ca7786c54f68a86bfa Author: Shixiong Zhu AuthorDate: Fri Mar 13 15:55:29 2020 -0700 [SPARK-31144][SQL] Wrap Error with QueryExecutionException to notify QueryExecutionListener ### What changes were proposed in this pull request? This PR manually reverts changes in #25292 and then wraps java.lang.Error with `QueryExecutionException` to notify `QueryExecutionListener` to send it to `QueryExecutionListener.onFailure` which only accepts `Exception`. The bug fix PR for 2.4 is #27904. It needs a separate PR because the touched codes were changed a lot. ### Why are the changes needed? Avoid API changes and fix a bug. ### Does this PR introduce any user-facing change? Yes. Reverting an API change happening in 3.0. QueryExecutionListener APIs will be the same as 2.4. ### How was this patch tested? The new added test. Closes #27907 from zsxwing/SPARK-31144. 
Authored-by: Shixiong Zhu Signed-off-by: Dongjoon Hyun (cherry picked from commit 1ddf44dfcaff53e870a3c9608e31a60805e50c29) Signed-off-by: Dongjoon Hyun --- project/MimaExcludes.scala | 4 -- .../spark/sql/util/QueryExecutionListener.scala| 18 +-- .../apache/spark/sql/DataFrameWriterV2Suite.scala | 2 +- .../org/apache/spark/sql/SessionStateSuite.scala | 2 +- .../spark/sql/TestQueryExecutionListener.scala | 2 +- .../test/scala/org/apache/spark/sql/UDFSuite.scala | 2 +- .../sql/connector/DataSourceV2DataFrameSuite.scala | 2 +- .../connector/FileDataSourceV2FallBackSuite.scala | 6 +-- .../connector/SupportsCatalogOptionsSuite.scala| 2 +- .../sql/test/DataFrameReaderWriterSuite.scala | 2 +- .../spark/sql/util/DataFrameCallbackSuite.scala| 62 -- .../sql/util/ExecutionListenerManagerSuite.scala | 2 +- .../sql/hive/thriftserver/DummyListeners.scala | 2 +- 13 files changed, 71 insertions(+), 37 deletions(-) diff --git a/project/MimaExcludes.scala b/project/MimaExcludes.scala index 7f66577..f8ad60b 100644 --- a/project/MimaExcludes.scala +++ b/project/MimaExcludes.scala @@ -419,10 +419,6 @@ object MimaExcludes { ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.streaming.ProcessingTime"), ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.streaming.ProcessingTime$"), -// [SPARK-28556][SQL] QueryExecutionListener should also notify Error - ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.sql.util.QueryExecutionListener.onFailure"), - ProblemFilters.exclude[ReversedMissingMethodProblem]("org.apache.spark.sql.util.QueryExecutionListener.onFailure"), - // [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0 ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.ml.image.ImageSchema.readImages"), diff --git a/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala b/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala index 01f8182..0b5951e 100644 --- 
a/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/util/QueryExecutionListener.scala @@ -23,7 +23,7 @@ import org.apache.spark.annotation.DeveloperApi import org.apache.spark.internal.Logging import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent} import org.apache.spark.sql.SparkSession -import org.apache.spark.sql.execution.QueryExecution +import org.apache.spark.sql.execution.{QueryExecution, QueryExecutionException} import org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionEnd import org.apache.spark.sql.internal.StaticSQLConf._ import org.apache.spark.util.{ListenerBus, Utils} @@ -55,12 +55,13 @@ trait QueryExecutionListener { * @param funcName the name of the action that triggered this query. * @param qe the QueryExecution object that carries detail information like logical plan, * physical plan, etc. - * @param error the error that failed this query. - * + * @param exception the exception that failed this query. If `java.lang.Error` is thrown during + * execution, it will be wrapped with an `Exception` and it can be accessed by + * `exception.getCause`. * @note This can be invoked by multiple different threads. */ @DeveloperApi - def onFailure(funcName: String, qe: QueryExecution,
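The fix keeps `onFailure(funcName, qe, exception: Exception)` and, when execution dies with a `java.lang.Error`, hands listeners a `QueryExecutionException` wrapping it, with the original reachable via `getCause`. Python has no `Error`/`Exception` split in those terms, but `BaseException` subclasses that are not `Exception` play the same role in this sketch (names and the message are illustrative, not Spark's):

```python
class QueryExecutionException(Exception):
    """Wrapper handed to listeners whose callback only accepts Exception."""

def to_listener_exception(throwable: BaseException) -> Exception:
    # Ordinary exceptions pass through unchanged; fatal non-Exception
    # throwables (the analogue of java.lang.Error) get wrapped so the
    # listener still fires and can reach the original via __cause__.
    if isinstance(throwable, Exception):
        return throwable
    wrapped = QueryExecutionException(f"Query failed with a fatal error: {throwable!r}")
    wrapped.__cause__ = throwable
    return wrapped

fatal = KeyboardInterrupt()          # BaseException, but not an Exception
delivered = to_listener_exception(fatal)
print(isinstance(delivered, Exception), delivered.__cause__ is fatal)  # True True
```

The design choice matches the PR's stated goal: the listener API signature stays identical to 2.4, and only the payload is adapted.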
[spark] branch branch-3.0 updated: [MINOR][DOCS] Fix [[...]] to `...` and ... in documentation
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 1fc9833 [MINOR][DOCS] Fix [[...]] to `...` and ... in documentation 1fc9833 is described below commit 1fc98336cfc8390139f2548a1f496d40a6a7f784 Author: HyukjinKwon AuthorDate: Fri Mar 13 16:44:23 2020 -0700 [MINOR][DOCS] Fix [[...]] to `...` and ... in documentation ### What changes were proposed in this pull request? Before: - ![Screen Shot 2020-03-13 at 1 19 12 PM](https://user-images.githubusercontent.com/6477701/76589452-7c34f300-652d-11ea-9da7-3754f8575796.png) - ![Screen Shot 2020-03-13 at 1 19 24 PM](https://user-images.githubusercontent.com/6477701/76589455-7d662000-652d-11ea-9dbe-f5fe10d1e7ad.png) - ![Screen Shot 2020-03-13 at 1 19 03 PM](https://user-images.githubusercontent.com/6477701/76589449-7b03c600-652d-11ea-8e99-dbe47f561f9c.png) After: - ![Screen Shot 2020-03-13 at 1 17 37 PM](https://user-images.githubusercontent.com/6477701/76589437-74754e80-652d-11ea-99f5-14fb4761f915.png) - ![Screen Shot 2020-03-13 at 1 17 46 PM](https://user-images.githubusercontent.com/6477701/76589442-76d7a880-652d-11ea-8c10-53e595421081.png) - ![Screen Shot 2020-03-13 at 1 18 15 PM](https://user-images.githubusercontent.com/6477701/76589443-7808d580-652d-11ea-9b1b-e5d11d638335.png) ### Why are the changes needed? To render the code block properly in the documentation ### Does this PR introduce any user-facing change? Yes, code rendering in documentation. ### How was this patch tested? Manually built the doc via `SKIP_API=1 jekyll build`. Closes #27899 from HyukjinKwon/minor-docss. 
Authored-by: HyukjinKwon Signed-off-by: Dongjoon Hyun (cherry picked from commit 9628aca68ba0821b8f3fa934ed4872cabb2a5d7d) Signed-off-by: Dongjoon Hyun --- docs/monitoring.md | 6 +++--- docs/quick-start.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/monitoring.md b/docs/monitoring.md index 4cba15b..ba3f1dc 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -595,7 +595,7 @@ A list of the available metrics, with a short description: inputMetrics.* -Metrics related to reading data from [[org.apache.spark.rdd.HadoopRDD]] +Metrics related to reading data from org.apache.spark.rdd.HadoopRDD or from persisted data. @@ -779,11 +779,11 @@ A list of the available metrics, with a short description: .DirectPoolMemory -Peak memory that the JVM is using for direct buffer pool ([[java.lang.management.BufferPoolMXBean]]) +Peak memory that the JVM is using for direct buffer pool (java.lang.management.BufferPoolMXBean) .MappedPoolMemory -Peak memory that the JVM is using for mapped buffer pool ([[java.lang.management.BufferPoolMXBean]]) +Peak memory that the JVM is using for mapped buffer pool (java.lang.management.BufferPoolMXBean) .ProcessTreeJVMVMemory diff --git a/docs/quick-start.md b/docs/quick-start.md index 86ba2c4..e7a16a3 100644 --- a/docs/quick-start.md +++ b/docs/quick-start.md @@ -264,7 +264,7 @@ Spark README. Note that you'll need to replace YOUR_SPARK_HOME with the location installed. Unlike the earlier examples with the Spark shell, which initializes its own SparkSession, we initialize a SparkSession as part of the program. -We call `SparkSession.builder` to construct a [[SparkSession]], then set the application name, and finally call `getOrCreate` to get the [[SparkSession]] instance. +We call `SparkSession.builder` to construct a `SparkSession`, then set the application name, and finally call `getOrCreate` to get the `SparkSession` instance. 
Our application depends on the Spark API, so we'll also include an sbt configuration file, `build.sbt`, which explains that Spark is a dependency. This file also adds a repository that - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
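The doc fix above is mechanical: Scaladoc-style `[[...]]` cross-references do not render in Markdown/Jekyll, so they become backticked code spans. A one-line sweep of the kind that could catch such cases — the regex is a sketch; the actual patch was edited by hand:

```python
import re

def unlink_scaladoc(markdown: str) -> str:
    # Rewrite Scaladoc cross-references like [[SparkSession]]
    # into Markdown code spans like `SparkSession`.
    return re.sub(r"\[\[([^\]]+)\]\]", r"`\1`", markdown)

print(unlink_scaladoc("construct a [[SparkSession]] instance"))
# construct a `SparkSession` instance
```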
[spark] branch master updated (1ddf44d -> 9628aca)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 1ddf44d [SPARK-31144][SQL] Wrap Error with QueryExecutionException to notify QueryExecutionListener add 9628aca [MINOR][DOCS] Fix [[...]] to `...` and ... in documentation No new revisions were added by this update. Summary of changes: docs/monitoring.md | 6 +++--- docs/quick-start.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (08bdc9c -> b0d2956)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 08bdc9c [SPARK-31068][SQL] Avoid IllegalArgumentException in broadcast exchange add b0d2956 [SPARK-31135][BUILD][TESTS] Upgrdade docker-client version to 8.14.1 No new revisions were added by this update. Summary of changes: external/docker-integration-tests/pom.xml | 1 + pom.xml | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-31135][BUILD][TESTS] Upgrdade docker-client version to 8.14.1
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new b0d2956 [SPARK-31135][BUILD][TESTS] Upgrdade docker-client version to 8.14.1 b0d2956 is described below commit b0d2956a359f00e703d5ebe9a58fb9fec869721e Author: Gabor Somogyi AuthorDate: Sun Mar 15 23:55:04 2020 -0700 [SPARK-31135][BUILD][TESTS] Upgrdade docker-client version to 8.14.1 ### What changes were proposed in this pull request? Upgrdade `docker-client` version. ### Why are the changes needed? `docker-client` what Spark uses is super old. Snippet from the project page: ``` Spotify no longer uses recent versions of this project internally. The version of docker-client we're using is whatever helios has in its pom.xml. => 8.14.1 ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? ``` build/mvn install -DskipTests build/mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.12 -Dtest=none -DwildcardSuites=org.apache.spark.sql.jdbc.DB2IntegrationSuite test` build/mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.12 -Dtest=none -DwildcardSuites=org.apache.spark.sql.jdbc.MsSqlServerIntegrationSuite test` build/mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.12 -Dtest=none -DwildcardSuites=org.apache.spark.sql.jdbc.PostgresIntegrationSuite test` ``` Closes #27892 from gaborgsomogyi/docker-client. 
Authored-by: Gabor Somogyi Signed-off-by: Dongjoon Hyun --- external/docker-integration-tests/pom.xml | 1 + pom.xml | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/external/docker-integration-tests/pom.xml b/external/docker-integration-tests/pom.xml index c357a2f..8743d72 100644 --- a/external/docker-integration-tests/pom.xml +++ b/external/docker-integration-tests/pom.xml @@ -50,6 +50,7 @@ com.spotify docker-client test + shaded org.apache.httpcomponents diff --git a/pom.xml b/pom.xml index a335759..c90ac68 100644 --- a/pom.xml +++ b/pom.xml @@ -931,8 +931,9 @@ com.spotify docker-client -5.0.2 +8.14.1 test +shaded guava - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
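The extraction stripped the XML tags from the pom diff above, leaving bare values like `shaded` and `8.14.1`. Reconstructed from those visible values — a plausible reading, not the verbatim patch — the upgraded dependency entry in the root `pom.xml` looks like this; the `shaded` classifier selects docker-client's self-contained artifact, keeping its transitive dependencies out of the test classpath:

```xml
<dependency>
  <groupId>com.spotify</groupId>
  <artifactId>docker-client</artifactId>
  <version>8.14.1</version>
  <classifier>shaded</classifier>
  <scope>test</scope>
</dependency>
```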
[spark] branch branch-3.0 updated: [SPARK-31135][BUILD][TESTS] Upgrdade docker-client version to 8.14.1
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new aad1f5a [SPARK-31135][BUILD][TESTS] Upgrdade docker-client version to 8.14.1 aad1f5a is described below commit aad1f5aa2d3e281dde2a019c1c4975533c908b66 Author: Gabor Somogyi AuthorDate: Sun Mar 15 23:55:04 2020 -0700 [SPARK-31135][BUILD][TESTS] Upgrdade docker-client version to 8.14.1 ### What changes were proposed in this pull request? Upgrdade `docker-client` version. ### Why are the changes needed? `docker-client` what Spark uses is super old. Snippet from the project page: ``` Spotify no longer uses recent versions of this project internally. The version of docker-client we're using is whatever helios has in its pom.xml. => 8.14.1 ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? ``` build/mvn install -DskipTests build/mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.12 -Dtest=none -DwildcardSuites=org.apache.spark.sql.jdbc.DB2IntegrationSuite test` build/mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.12 -Dtest=none -DwildcardSuites=org.apache.spark.sql.jdbc.MsSqlServerIntegrationSuite test` build/mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.12 -Dtest=none -DwildcardSuites=org.apache.spark.sql.jdbc.PostgresIntegrationSuite test` ``` Closes #27892 from gaborgsomogyi/docker-client. 
Authored-by: Gabor Somogyi Signed-off-by: Dongjoon Hyun (cherry picked from commit b0d2956a359f00e703d5ebe9a58fb9fec869721e) Signed-off-by: Dongjoon Hyun --- external/docker-integration-tests/pom.xml | 1 + pom.xml | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/external/docker-integration-tests/pom.xml b/external/docker-integration-tests/pom.xml index aff79b8..cdf76e9 100644 --- a/external/docker-integration-tests/pom.xml +++ b/external/docker-integration-tests/pom.xml @@ -50,6 +50,7 @@ com.spotify docker-client test + shaded org.apache.httpcomponents diff --git a/pom.xml b/pom.xml index 5aa100e..978127e 100644 --- a/pom.xml +++ b/pom.xml @@ -931,8 +931,9 @@ com.spotify docker-client -5.0.2 +8.14.1 test +shaded guava - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
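The pom.xml diff for SPARK-31135 bumps `docker-client` from 5.0.2 to 8.14.1 and adds a `shaded` classifier, but the email's diff lost its XML markup in transit. As a hedged sketch only (the element names are standard Maven, but the exact placement inside Spark's parent pom is an assumption), the resulting `dependencyManagement` entry would presumably look like:

```xml
<!-- Sketch of the managed dependency after the upgrade; placement is assumed. -->
<dependency>
  <groupId>com.spotify</groupId>
  <artifactId>docker-client</artifactId>
  <version>8.14.1</version>
  <classifier>shaded</classifier>
  <scope>test</scope>
</dependency>
```

The `shaded` classifier selects the artifact with relocated transitive dependencies, which avoids classpath clashes (e.g. with Guava and httpcomponents) in the docker integration tests.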
[spark] branch master updated (21c02ee -> e736c62)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git.

from 21c02ee [SPARK-30864][SQL][DOC] add the user guide for Adaptive Query Execution
add e736c62 [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter

No new revisions were added by this update.

Summary of changes:
 .../datasources/parquet/ParquetRowConverter.scala | 12 +--
 .../spark/sql/FileBasedDataSourceSuite.scala      | 40 ++
 2 files changed, 50 insertions(+), 2 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new da1f95b  [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter

da1f95b is described below

commit da1f95be6b9af59a91a14e01613bdc4e8ac35374
Author: Tae-kyeom, Kim
AuthorDate: Mon Mar 16 10:31:56 2020 -0700

    [SPARK-31116][SQL] Fix nested schema case-sensitivity in ParquetRowConverter

    ### What changes were proposed in this pull request?

    This PR (SPARK-31116) adds case-sensitivity handling to ParquetRowConverter so that it materializes Parquet data properly with respect to case sensitivity.

    ### Why are the changes needed?

    Since Spark 3.0.0, the statement below throws IllegalArgumentException in case-insensitive mode because of the exact field-index lookup in ParquetRowConverter. As the Parquet requested schema and the Catalyst requested schema are already constructed during schema clipping in ParquetReadSupport, this change simply follows that behavior.

    ```scala
    val path = "/some/temp/path"

    spark
      .range(1L)
      .selectExpr("NAMED_STRUCT('lowercase', id, 'camelCase', id + 1) AS StructColumn")
      .write.parquet(path)

    val caseInsensitiveSchema = new StructType()
      .add(
        "StructColumn",
        new StructType()
          .add("LowerCase", LongType)
          .add("camelcase", LongType))

    spark.read.schema(caseInsensitiveSchema).parquet(path).show()
    ```

    ### Does this PR introduce any user-facing change?

    No. The changes are only in unreleased branches (`master` and `branch-3.0`).

    ### How was this patch tested?

    Passed new test cases that check Parquet column selection with respect to schemas and case sensitivities.

    Closes #27888 from kimtkyeom/parquet_row_converter_case_sensitivity.
Authored-by: Tae-kyeom, Kim
Signed-off-by: Dongjoon Hyun
(cherry picked from commit e736c62764137b2c3af90d2dc8a77e391891200a)
Signed-off-by: Dongjoon Hyun
---
 .../datasources/parquet/ParquetRowConverter.scala | 12 +--
 .../spark/sql/FileBasedDataSourceSuite.scala      | 40 ++
 2 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
index 850adae..22422c0 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala
@@ -33,8 +33,9 @@
 import org.apache.spark.internal.Logging
 import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.expressions._
-import org.apache.spark.sql.catalyst.util.{ArrayBasedMapData, DateTimeUtils, GenericArrayData}
+import org.apache.spark.sql.catalyst.util.{ArrayBasedMapData, CaseInsensitiveMap, DateTimeUtils, GenericArrayData}
 import org.apache.spark.sql.catalyst.util.DateTimeUtils.SQLTimestamp
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 import org.apache.spark.unsafe.types.UTF8String
@@ -178,8 +179,15 @@
   // Converters for each field.
   private[this] val fieldConverters: Array[Converter with HasParentContainerUpdater] = {
+    // (SPARK-31116) Use case insensitive map if spark.sql.caseSensitive is false
+    // to prevent throwing IllegalArgumentException when searching catalyst type's field index
+    val catalystFieldNameToIndex = if (SQLConf.get.caseSensitiveAnalysis) {
+      catalystType.fieldNames.zipWithIndex.toMap
+    } else {
+      CaseInsensitiveMap(catalystType.fieldNames.zipWithIndex.toMap)
+    }
     parquetType.getFields.asScala.map { parquetField =>
-      val fieldIndex = catalystType.fieldIndex(parquetField.getName)
+      val fieldIndex = catalystFieldNameToIndex(parquetField.getName)
       val catalystField = catalystType(fieldIndex)
       // Converted field value should be set to the `fieldIndex`-th cell of `currentRow`
       newConverter(parquetField, catalystField.dataType, new RowUpdater(currentRow, fieldIndex))
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala
index c870958..cb410b4 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSui
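The SPARK-31116 patch swaps the exact `catalystType.fieldIndex` lookup for a `CaseInsensitiveMap` when `spark.sql.caseSensitive` is false. A minimal, self-contained sketch of that lookup strategy follows; it is an illustration only, not Spark's actual `CaseInsensitiveMap` (which lives in `org.apache.spark.sql.catalyst.util`), and `FieldIndexSketch` is a hypothetical name:

```scala
// Sketch only: emulates the SPARK-31116 field-index lookup outside Spark.
// The stand-in for CaseInsensitiveMap simply lowercases keys, which assumes
// no two field names collide when lowercased.
object FieldIndexSketch {
  /** Build a name-to-index map; case-insensitive when caseSensitive is false. */
  def fieldIndexMap(fieldNames: Seq[String], caseSensitive: Boolean): Map[String, Int] = {
    val exact = fieldNames.zipWithIndex.toMap
    if (caseSensitive) exact
    else exact.map { case (name, idx) => name.toLowerCase -> idx }
  }

  /** Resolve a user-schema spelling to the underlying field's index. */
  def fieldIndex(indexMap: Map[String, Int], name: String, caseSensitive: Boolean): Int =
    if (caseSensitive) indexMap(name) else indexMap(name.toLowerCase)

  def main(args: Array[String]): Unit = {
    // Parquet file fields, as in the commit message's reproduction.
    val parquetFields = Seq("lowercase", "camelCase")
    val ci = fieldIndexMap(parquetFields, caseSensitive = false)
    // The user-schema spelling "LowerCase" resolves to the field "lowercase".
    println(fieldIndex(ci, "LowerCase", caseSensitive = false)) // prints 0
    // A case-sensitive map would instead fail this lookup with NoSuchElementException.
  }
}
```

This mirrors why the reproduction in the commit message worked in Spark 2.4 (resolution elsewhere was already case-insensitive) but threw in 3.0: only the converter's own lookup was still exact-match.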