[spark] branch branch-3.0 updated (37d6b3c -> 698ac6a)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 37d6b3c  [SPARK-32761][SQL][3.0] Allow aggregating multiple foldable distinct expressions
     add 698ac6a  [SPARK-33165][SQL][TESTS][FOLLOW-UP] Use scala.Predef.assert instead

No new revisions were added by this update.

Summary of changes:
 core/src/test/scala/org/apache/spark/benchmark/Benchmark.scala | 4
 1 file changed, 4 insertions(+)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (8f4fc22 -> bf52fa8)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 8f4fc22  [SPARK-33088][CORE] Enhance ExecutorPlugin API to include callbacks on task start and end events
     add bf52fa8  [SPARK-33165][SQL][TESTS][FOLLOW-UP] Use scala.Predef.assert instead

No new revisions were added by this update.

Summary of changes:
 core/src/test/scala/org/apache/spark/benchmark/Benchmark.scala | 4
 1 file changed, 4 insertions(+)
[spark] branch branch-3.0 updated: [SPARK-32761][SQL][3.0] Allow aggregating multiple foldable distinct expressions
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 37d6b3c  [SPARK-32761][SQL][3.0] Allow aggregating multiple foldable distinct expressions

37d6b3c is described below

commit 37d6b3c0fafa98922ed1ecf4f8634d962f5bb9d9
Author: Linhong Liu
AuthorDate: Fri Oct 16 03:36:21 2020 +

    [SPARK-32761][SQL][3.0] Allow aggregating multiple foldable distinct expressions

    ### What changes were proposed in this pull request?
    For queries with multiple foldable distinct columns, since they will be eliminated during
    execution, it's not mandatory to let `RewriteDistinctAggregates` handle this case. And in
    the current code, `RewriteDistinctAggregates` *does* miss some "aggregating with multiple
    foldable distinct expressions" cases. For example, `select count(distinct 2), count(distinct 2, 3)`
    will be missed. But in the planner, this will trigger an error that "multiple distinct
    expressions" are not allowed. As the foldable distinct columns can ultimately be eliminated,
    we can allow this in the aggregation planner check.

    ### Why are the changes needed?
    Bug fix.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Added a test case.

    Authored-by: Linhong Liu
    Signed-off-by: Wenchen Fan
    (cherry picked from commit a410658c9bc244e325702dc926075bd835b669ff)

Closes #30052 from linhongliu-db/SPARK-32761-3.0.
Authored-by: Linhong Liu
Signed-off-by: Wenchen Fan
---
 .../main/scala/org/apache/spark/sql/execution/SparkStrategies.scala | 6 --
 sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala   | 4
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
index f836deb..689d1eb 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala
@@ -517,7 +517,8 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
         val (functionsWithDistinct, functionsWithoutDistinct) =
           aggregateExpressions.partition(_.isDistinct)
-        if (functionsWithDistinct.map(_.aggregateFunction.children.toSet).distinct.length > 1) {
+        if (functionsWithDistinct.map(
+            _.aggregateFunction.children.filterNot(_.foldable).toSet).distinct.length > 1) {
           // This is a sanity check. We should not reach here when we have multiple distinct
           // column sets. Our `RewriteDistinctAggregates` should take care this case.
           sys.error("You hit a query analyzer bug. Please report your query to " +
@@ -548,7 +549,8 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
         // to be [COUNT(DISTINCT foo), MAX(DISTINCT foo)], but
         // [COUNT(DISTINCT bar), COUNT(DISTINCT foo)] is disallowed because those two distinct
         // aggregates have different column expressions.
-        val distinctExpressions = functionsWithDistinct.head.aggregateFunction.children
+        val distinctExpressions =
+          functionsWithDistinct.head.aggregateFunction.children.filterNot(_.foldable)
         val normalizedNamedDistinctExpressions = distinctExpressions.map { e =>
           // Ideally this should be done in `NormalizeFloatingNumbers`, but we do it here
           // because `distinctExpressions` is not extracted during logical phase.
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
index 7869005..85cbe45 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
@@ -2467,6 +2467,10 @@ class DataFrameSuite extends QueryTest
     val df = l.join(r, $"col2" === $"col4", "LeftOuter")
     checkAnswer(df, Row("2", "2"))
   }
+
+  test("SPARK-32761: aggregating multiple distinct CONSTANT columns") {
+    checkAnswer(sql("select count(distinct 2), count(distinct 2,3)"), Row(1, 1))
+  }
 }

 case class GroupByKey(a: Int, b: Int)
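The effect of the patched sanity check can be illustrated with a self-contained sketch. Note that `Expr` and `DistinctAgg` below are hypothetical stand-ins for Catalyst's expression and distinct-aggregate classes, not Spark types; only the filter-then-compare logic mirrors the diff.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FoldableDistinctCheck {
    // Stand-in for a Catalyst expression: just a name and a foldable flag.
    record Expr(String name, boolean foldable) {}

    // Stand-in for a distinct aggregate function with its child expressions.
    record DistinctAgg(List<Expr> children) {}

    // Mirrors the patched check: compare the distinct aggregates' argument
    // sets only *after* dropping foldable (constant) children.
    static boolean hasMultipleDistinctColumnSets(List<DistinctAgg> aggs) {
        long distinctSets = aggs.stream()
            .map(a -> a.children().stream()
                .filter(e -> !e.foldable())
                .collect(Collectors.toSet()))
            .distinct()
            .count();
        return distinctSets > 1;
    }

    public static void main(String[] args) {
        Expr two = new Expr("2", true);
        Expr three = new Expr("3", true);
        // count(distinct 2) and count(distinct 2, 3): both argument sets
        // become empty once constants are dropped, so the check passes.
        List<DistinctAgg> constsOnly = List.of(
            new DistinctAgg(List.of(two)),
            new DistinctAgg(List.of(two, three)));
        System.out.println(hasMultipleDistinctColumnSets(constsOnly)); // false

        // count(distinct foo) and count(distinct bar) still trip the check.
        Expr foo = new Expr("foo", false);
        Expr bar = new Expr("bar", false);
        List<DistinctAgg> twoCols = List.of(
            new DistinctAgg(List.of(foo)),
            new DistinctAgg(List.of(bar)));
        System.out.println(hasMultipleDistinctColumnSets(twoCols)); // true
    }
}
```

Before the fix, the first case produced two unequal sets ({2} and {2, 3}) and hit the "query analyzer bug" error path, even though constant-folding would later remove both constants anyway.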
[spark] branch master updated: [SPARK-33088][CORE] Enhance ExecutorPlugin API to include callbacks on task start and end events
This is an automated email from the ASF dual-hosted git repository.

mridulm80 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 8f4fc22  [SPARK-33088][CORE] Enhance ExecutorPlugin API to include callbacks on task start and end events

8f4fc22 is described below

commit 8f4fc22dc460eb05c47e0d61facf116c60b1be37
Author: Samuel Souza
AuthorDate: Thu Oct 15 22:12:41 2020 -0500

    [SPARK-33088][CORE] Enhance ExecutorPlugin API to include callbacks on task start and end events

    ### What changes were proposed in this pull request?
    Proposing a new set of APIs for ExecutorPlugins, to provide callbacks invoked at the start
    and end of each task of a job. Not very opinionated on the shape of the API; tried to be as
    minimal as possible for now.

    ### Why are the changes needed?
    The changes are described in detail on [SPARK-33088](https://issues.apache.org/jira/browse/SPARK-33088), but mostly boil down to:
    1. This feature was considered when the ExecutorPlugin API was initially introduced in #21923, but never implemented.
    2. The use-case which **requires** this feature is to propagate tracing information from the driver to the executor, such that calls from the same job can all be traced.
       a. Tracing frameworks are usually set up in thread locals, so it's important for the setup to happen in the same thread which runs the tasks.
       b. Executors can serve multiple jobs, so it's not sufficient to set tracing information at executor startup time -- it needs to happen every time a task starts or ends.

    ### Does this PR introduce _any_ user-facing change?
    No. This PR introduces new features for future developers to use.

    ### How was this patch tested?
    Unit tests in `PluginContainerSuite`.

Closes #29977 from fsamuel-bs/SPARK-33088.
Authored-by: Samuel Souza
Signed-off-by: Mridul Muralidharan gmail.com>
---
 .../apache/spark/api/plugin/ExecutorPlugin.java    | 42 +++
 .../scala/org/apache/spark/executor/Executor.scala | 32 --
 .../spark/internal/plugin/PluginContainer.scala    | 49 +-
 .../scala/org/apache/spark/scheduler/Task.scala    |  6 ++-
 .../internal/plugin/PluginContainerSuite.scala     | 47 +
 .../apache/spark/scheduler/TaskContextSuite.scala  |  4 +-
 6 files changed, 163 insertions(+), 17 deletions(-)

diff --git a/core/src/main/java/org/apache/spark/api/plugin/ExecutorPlugin.java b/core/src/main/java/org/apache/spark/api/plugin/ExecutorPlugin.java
index 4961308..481bf98 100644
--- a/core/src/main/java/org/apache/spark/api/plugin/ExecutorPlugin.java
+++ b/core/src/main/java/org/apache/spark/api/plugin/ExecutorPlugin.java
@@ -19,6 +19,7 @@ package org.apache.spark.api.plugin;

 import java.util.Map;

+import org.apache.spark.TaskFailedReason;
 import org.apache.spark.annotation.DeveloperApi;

 /**
@@ -54,4 +55,45 @@ public interface ExecutorPlugin {
    */
   default void shutdown() {}

+  /**
+   * Perform any action before the task is run.
+   *
+   * This method is invoked from the same thread the task will be executed.
+   * Task-specific information can be accessed via {@link org.apache.spark.TaskContext#get}.
+   *
+   * Plugin authors should avoid expensive operations here, as this method will be called
+   * on every task, and doing something expensive can significantly slow down a job.
+   * It is not recommended for a user to call a remote service, for example.
+   *
+   * Exceptions thrown from this method do not propagate - they're caught,
+   * logged, and suppressed. Therefore exceptions when executing this method won't
+   * make the job fail.
+   *
+   * @since 3.1.0
+   */
+  default void onTaskStart() {}
+
+  /**
+   * Perform an action after tasks completes without exceptions.
+ * + * As {@link #onTaskStart() onTaskStart} exceptions are suppressed, this method + * will still be invoked even if the corresponding {@link #onTaskStart} call for this + * task failed. + * + * The same warnings as for {@link #onTaskStart() onTaskStart} apply here. + * + * @since 3.1.0 + */ + default void onTaskSucceeded() {} + + /** + * Perform an action after a task completes with an exception. + * + * The same warnings as for {@link #onTaskStart() onTaskStart} apply here. + * + * @param failureReason the exception thrown from the failed task. + * + * @since 3.1.0 + */ + default void onTaskFailed(TaskFailedReason failureReason) {} } diff --git a/core/src/main/scala/org/apache/spark/executor/Executor.scala b/core/src/main/scala/org/apache/spark/executor/Executor.scala index 27addd8..6653650 100644 --- a/core/src/main/scala/org/apache/spark/executor/Executor.scala +++ b/core/src/main/scala/org/apache/spark/executor/Executor.scala @@ -253,7 +253,7 @@ private[spark] class Executor(
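The thread-local tracing pattern that motivates these callbacks can be sketched without a Spark dependency. The real interface is `org.apache.spark.api.plugin.ExecutorPlugin` and `onTaskFailed` receives a `TaskFailedReason`; the `TracingPlugin` class below is a hypothetical, self-contained mock that uses plain strings so the lifecycle can be shown in isolation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an ExecutorPlugin implementation; not Spark code.
class TracingPlugin {
    // Tracing frameworks usually keep context in thread-locals, which is why
    // the callbacks must run on the same thread that executes the task.
    private static final ThreadLocal<String> traceId = new ThreadLocal<>();

    // Visible record of what happened, for illustration only.
    final List<String> log = new ArrayList<>();

    // Mirrors onTaskStart(): install the per-task trace context.
    void onTaskStart(String taskTraceId) {
        traceId.set(taskTraceId);
        log.add("start:" + taskTraceId);
    }

    // Mirrors onTaskSucceeded(): report and clear the context. Invoked even if
    // onTaskStart threw, since those exceptions are caught and suppressed.
    void onTaskSucceeded() {
        log.add("ok:" + traceId.get());
        traceId.remove();
    }

    // Mirrors onTaskFailed(TaskFailedReason), with the reason as a string here.
    void onTaskFailed(String reason) {
        log.add("fail:" + traceId.get() + ":" + reason);
        traceId.remove();
    }
}
```

Because the executor calls all three hooks on the task's own thread, the `ThreadLocal` set in `onTaskStart` is still visible when the end-of-task hook runs, and clearing it there prevents leakage into the next task scheduled on that thread.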
[spark] branch branch-3.0 updated (d0f1120 -> 160f458)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from d0f1120 [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files add 160f458 [SPARK-33165][SQL][TEST] Remove dependencies(scalatest,scalactic) from Benchmark No new revisions were added by this update. Summary of changes: core/src/test/scala/org/apache/spark/benchmark/Benchmark.scala | 5 - .../apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala | 3 ++- 2 files changed, 2 insertions(+), 6 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (bf594a9 -> a5c17de)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from bf594a9 [SPARK-32402][SQL][FOLLOW-UP] Add case sensitivity tests for column resolution in ALTER TABLE add a5c17de [SPARK-33165][SQL][TEST] Remove dependencies(scalatest,scalactic) from Benchmark No new revisions were added by this update. Summary of changes: core/src/test/scala/org/apache/spark/benchmark/Benchmark.scala | 5 - .../apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala | 3 ++- 2 files changed, 2 insertions(+), 6 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (38c05af -> bf594a9)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 38c05af [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files add bf594a9 [SPARK-32402][SQL][FOLLOW-UP] Add case sensitivity tests for column resolution in ALTER TABLE No new revisions were added by this update. Summary of changes: .../v2/jdbc/JDBCTableCatalogSuite.scala| 155 +++-- 1 file changed, 114 insertions(+), 41 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new d0f1120 [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files d0f1120 is described below commit d0f1120f3fb524a52df71e03c3d28ac82f76c1a3 Author: Max Gekk AuthorDate: Fri Oct 16 10:28:15 2020 +0900 [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files ### What changes were proposed in this pull request? Added a couple of tests to `AvroSuite` and to `ParquetIOSuite` to check that the metadata key 'org.apache.spark.legacyDateTime' is written correctly depending on the SQL configs: - spark.sql.legacy.avro.datetimeRebaseModeInWrite - spark.sql.legacy.parquet.datetimeRebaseModeInWrite This is a follow-up to https://github.com/apache/spark/pull/28137. ### Why are the changes needed? 1. To improve test coverage 2. To make sure that the metadata key is actually saved to Avro/Parquet files ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By running the added tests: ``` $ build/sbt "testOnly org.apache.spark.sql.execution.datasources.parquet.ParquetIOSuite" $ build/sbt "avro/test:testOnly org.apache.spark.sql.avro.AvroV1Suite" $ build/sbt "avro/test:testOnly org.apache.spark.sql.avro.AvroV2Suite" ``` Closes #30061 from MaxGekk/parquet-test-metakey. 
Authored-by: Max Gekk Signed-off-by: HyukjinKwon (cherry picked from commit 38c05af1d5538fc6ad00cdb57c1a90e90d04e25d) Signed-off-by: HyukjinKwon --- .../org/apache/spark/sql/avro/AvroSuite.scala | 40 ++--- .../datasources/parquet/ParquetIOSuite.scala | 51 +- 2 files changed, 73 insertions(+), 18 deletions(-) diff --git a/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala b/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala index d2f49ae..5d7d2e4 100644 --- a/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala +++ b/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala @@ -1788,15 +1788,19 @@ abstract class AvroSuite extends QueryTest with SharedSparkSession { } } + private def checkMetaData(path: java.io.File, key: String, expectedValue: String): Unit = { +val avroFiles = path.listFiles() + .filter(f => f.isFile && !f.getName.startsWith(".") && !f.getName.startsWith("_")) +assert(avroFiles.length === 1) +val reader = DataFileReader.openReader(avroFiles(0), new GenericDatumReader[GenericRecord]()) +val value = reader.asInstanceOf[DataFileReader[_]].getMetaString(key) +assert(value === expectedValue) + } + test("SPARK-31327: Write Spark version into Avro file metadata") { withTempPath { path => spark.range(1).repartition(1).write.format("avro").save(path.getCanonicalPath) - val avroFiles = path.listFiles() -.filter(f => f.isFile && !f.getName.startsWith(".") && !f.getName.startsWith("_")) - assert(avroFiles.length === 1) - val reader = DataFileReader.openReader(avroFiles(0), new GenericDatumReader[GenericRecord]()) - val version = reader.asInstanceOf[DataFileReader[_]].getMetaString(SPARK_VERSION_METADATA_KEY) - assert(version === SPARK_VERSION_SHORT) + checkMetaData(path, SPARK_VERSION_METADATA_KEY, SPARK_VERSION_SHORT) } } @@ -1809,6 +1813,30 @@ abstract class AvroSuite extends QueryTest with SharedSparkSession { spark.read.format("avro").options(conf).load(path) } } + + test("SPARK-33163: 
write the metadata key 'org.apache.spark.legacyDateTime'") { +def saveTs(dir: java.io.File): Unit = { + Seq(Timestamp.valueOf("2020-10-15 01:02:03")).toDF() +.repartition(1) +.write +.format("avro") +.save(dir.getAbsolutePath) +} +withSQLConf(SQLConf.LEGACY_AVRO_REBASE_MODE_IN_WRITE.key -> LEGACY.toString) { + withTempPath { dir => +saveTs(dir) +checkMetaData(dir, SPARK_LEGACY_DATETIME, "") + } +} +Seq(CORRECTED, EXCEPTION).foreach { mode => + withSQLConf(SQLConf.LEGACY_AVRO_REBASE_MODE_IN_WRITE.key -> mode.toString) { +withTempPath { dir => + saveTs(dir) + checkMetaData(dir, SPARK_LEGACY_DATETIME, null) +} + } +} + } } class AvroV1Suite extends AvroSuite { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala index 2dc8a06..ff406f7 100644 ---
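The contract the new `AvroSuite` test checks can be summarized as a small mapping: the writer adds the 'org.apache.spark.legacyDateTime' metadata key only in LEGACY rebase mode, so `getMetaString` reports an empty string in that case and `null` (key absent) for CORRECTED and EXCEPTION. The `ExpectedMetadata` helper below is a hypothetical illustration of that mapping, not Spark code.

```java
// Hypothetical helper encoding the expectation asserted by checkMetaData in
// the test: empty string when the key is written (LEGACY), null otherwise.
class ExpectedMetadata {
    enum RebaseMode { LEGACY, CORRECTED, EXCEPTION }

    // Value getMetaString is expected to return for the
    // 'org.apache.spark.legacyDateTime' key under each write mode.
    static String legacyDateTimeValue(RebaseMode mode) {
        return mode == RebaseMode.LEGACY ? "" : null;
    }
}
```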
[spark] branch master updated (9f5eff0 -> 38c05af)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9f5eff0 [SPARK-33162][INFRA] Use pre-built image at GitHub Action PySpark jobs add 38c05af [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/avro/AvroSuite.scala | 40 ++--- .../datasources/parquet/ParquetIOSuite.scala | 51 +- 2 files changed, 73 insertions(+), 18 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new d0f1120 [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files d0f1120 is described below commit d0f1120f3fb524a52df71e03c3d28ac82f76c1a3 Author: Max Gekk AuthorDate: Fri Oct 16 10:28:15 2020 +0900 [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files ### What changes were proposed in this pull request? Added a couple tests to `AvroSuite` and to `ParquetIOSuite` to check that the metadata key 'org.apache.spark.legacyDateTime' is written correctly depending on the SQL configs: - spark.sql.legacy.avro.datetimeRebaseModeInWrite - spark.sql.legacy.parquet.datetimeRebaseModeInWrite This is a follow up https://github.com/apache/spark/pull/28137. ### Why are the changes needed? 1. To improve test coverage 2. To make sure that the metadata key is actually saved to Avro/Parquet files ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By running the added tests: ``` $ build/sbt "testOnly org.apache.spark.sql.execution.datasources.parquet.ParquetIOSuite" $ build/sbt "avro/test:testOnly org.apache.spark.sql.avro.AvroV1Suite" $ build/sbt "avro/test:testOnly org.apache.spark.sql.avro.AvroV2Suite" ``` Closes #30061 from MaxGekk/parquet-test-metakey. 
Authored-by: Max Gekk Signed-off-by: HyukjinKwon (cherry picked from commit 38c05af1d5538fc6ad00cdb57c1a90e90d04e25d) Signed-off-by: HyukjinKwon --- .../org/apache/spark/sql/avro/AvroSuite.scala | 40 ++--- .../datasources/parquet/ParquetIOSuite.scala | 51 +- 2 files changed, 73 insertions(+), 18 deletions(-) diff --git a/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala b/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala index d2f49ae..5d7d2e4 100644 --- a/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala +++ b/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala @@ -1788,15 +1788,19 @@ abstract class AvroSuite extends QueryTest with SharedSparkSession { } } + private def checkMetaData(path: java.io.File, key: String, expectedValue: String): Unit = { +val avroFiles = path.listFiles() + .filter(f => f.isFile && !f.getName.startsWith(".") && !f.getName.startsWith("_")) +assert(avroFiles.length === 1) +val reader = DataFileReader.openReader(avroFiles(0), new GenericDatumReader[GenericRecord]()) +val value = reader.asInstanceOf[DataFileReader[_]].getMetaString(key) +assert(value === expectedValue) + } + test("SPARK-31327: Write Spark version into Avro file metadata") { withTempPath { path => spark.range(1).repartition(1).write.format("avro").save(path.getCanonicalPath) - val avroFiles = path.listFiles() -.filter(f => f.isFile && !f.getName.startsWith(".") && !f.getName.startsWith("_")) - assert(avroFiles.length === 1) - val reader = DataFileReader.openReader(avroFiles(0), new GenericDatumReader[GenericRecord]()) - val version = reader.asInstanceOf[DataFileReader[_]].getMetaString(SPARK_VERSION_METADATA_KEY) - assert(version === SPARK_VERSION_SHORT) + checkMetaData(path, SPARK_VERSION_METADATA_KEY, SPARK_VERSION_SHORT) } } @@ -1809,6 +1813,30 @@ abstract class AvroSuite extends QueryTest with SharedSparkSession { spark.read.format("avro").options(conf).load(path) } } + + test("SPARK-33163: 
write the metadata key 'org.apache.spark.legacyDateTime'") { +def saveTs(dir: java.io.File): Unit = { + Seq(Timestamp.valueOf("2020-10-15 01:02:03")).toDF() +.repartition(1) +.write +.format("avro") +.save(dir.getAbsolutePath) +} +withSQLConf(SQLConf.LEGACY_AVRO_REBASE_MODE_IN_WRITE.key -> LEGACY.toString) { + withTempPath { dir => +saveTs(dir) +checkMetaData(dir, SPARK_LEGACY_DATETIME, "") + } +} +Seq(CORRECTED, EXCEPTION).foreach { mode => + withSQLConf(SQLConf.LEGACY_AVRO_REBASE_MODE_IN_WRITE.key -> mode.toString) { +withTempPath { dir => + saveTs(dir) + checkMetaData(dir, SPARK_LEGACY_DATETIME, null) +} + } +} + } } class AvroV1Suite extends AvroSuite { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala index 2dc8a06..ff406f7 100644 ---
[spark] branch master updated (9f5eff0 -> 38c05af)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9f5eff0 [SPARK-33162][INFRA] Use pre-built image at GitHub Action PySpark jobs add 38c05af [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/avro/AvroSuite.scala | 40 ++--- .../datasources/parquet/ParquetIOSuite.scala | 51 +- 2 files changed, 73 insertions(+), 18 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new d0f1120 [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files d0f1120 is described below commit d0f1120f3fb524a52df71e03c3d28ac82f76c1a3 Author: Max Gekk AuthorDate: Fri Oct 16 10:28:15 2020 +0900 [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files ### What changes were proposed in this pull request? Added a couple tests to `AvroSuite` and to `ParquetIOSuite` to check that the metadata key 'org.apache.spark.legacyDateTime' is written correctly depending on the SQL configs: - spark.sql.legacy.avro.datetimeRebaseModeInWrite - spark.sql.legacy.parquet.datetimeRebaseModeInWrite This is a follow up https://github.com/apache/spark/pull/28137. ### Why are the changes needed? 1. To improve test coverage 2. To make sure that the metadata key is actually saved to Avro/Parquet files ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? By running the added tests: ``` $ build/sbt "testOnly org.apache.spark.sql.execution.datasources.parquet.ParquetIOSuite" $ build/sbt "avro/test:testOnly org.apache.spark.sql.avro.AvroV1Suite" $ build/sbt "avro/test:testOnly org.apache.spark.sql.avro.AvroV2Suite" ``` Closes #30061 from MaxGekk/parquet-test-metakey. 
Authored-by: Max Gekk Signed-off-by: HyukjinKwon (cherry picked from commit 38c05af1d5538fc6ad00cdb57c1a90e90d04e25d) Signed-off-by: HyukjinKwon --- .../org/apache/spark/sql/avro/AvroSuite.scala | 40 ++--- .../datasources/parquet/ParquetIOSuite.scala | 51 +- 2 files changed, 73 insertions(+), 18 deletions(-) diff --git a/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala b/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala index d2f49ae..5d7d2e4 100644 --- a/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala +++ b/external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala @@ -1788,15 +1788,19 @@ abstract class AvroSuite extends QueryTest with SharedSparkSession { } } + private def checkMetaData(path: java.io.File, key: String, expectedValue: String): Unit = { +val avroFiles = path.listFiles() + .filter(f => f.isFile && !f.getName.startsWith(".") && !f.getName.startsWith("_")) +assert(avroFiles.length === 1) +val reader = DataFileReader.openReader(avroFiles(0), new GenericDatumReader[GenericRecord]()) +val value = reader.asInstanceOf[DataFileReader[_]].getMetaString(key) +assert(value === expectedValue) + } + test("SPARK-31327: Write Spark version into Avro file metadata") { withTempPath { path => spark.range(1).repartition(1).write.format("avro").save(path.getCanonicalPath) - val avroFiles = path.listFiles() -.filter(f => f.isFile && !f.getName.startsWith(".") && !f.getName.startsWith("_")) - assert(avroFiles.length === 1) - val reader = DataFileReader.openReader(avroFiles(0), new GenericDatumReader[GenericRecord]()) - val version = reader.asInstanceOf[DataFileReader[_]].getMetaString(SPARK_VERSION_METADATA_KEY) - assert(version === SPARK_VERSION_SHORT) + checkMetaData(path, SPARK_VERSION_METADATA_KEY, SPARK_VERSION_SHORT) } } @@ -1809,6 +1813,30 @@ abstract class AvroSuite extends QueryTest with SharedSparkSession { spark.read.format("avro").options(conf).load(path) } } + + test("SPARK-33163: 
write the metadata key 'org.apache.spark.legacyDateTime'") { +def saveTs(dir: java.io.File): Unit = { + Seq(Timestamp.valueOf("2020-10-15 01:02:03")).toDF() +.repartition(1) +.write +.format("avro") +.save(dir.getAbsolutePath) +} +withSQLConf(SQLConf.LEGACY_AVRO_REBASE_MODE_IN_WRITE.key -> LEGACY.toString) { + withTempPath { dir => +saveTs(dir) +checkMetaData(dir, SPARK_LEGACY_DATETIME, "") + } +} +Seq(CORRECTED, EXCEPTION).foreach { mode => + withSQLConf(SQLConf.LEGACY_AVRO_REBASE_MODE_IN_WRITE.key -> mode.toString) { +withTempPath { dir => + saveTs(dir) + checkMetaData(dir, SPARK_LEGACY_DATETIME, null) +} + } +} + } } class AvroV1Suite extends AvroSuite { diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetIOSuite.scala index 2dc8a06..ff406f7 100644 ---
[spark] branch master updated (9f5eff0 -> 38c05af)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9f5eff0 [SPARK-33162][INFRA] Use pre-built image at GitHub Action PySpark jobs add 38c05af [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/avro/AvroSuite.scala | 40 ++--- .../datasources/parquet/ParquetIOSuite.scala | 51 +- 2 files changed, 73 insertions(+), 18 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new d0f1120  [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files
d0f1120 is described below

commit d0f1120f3fb524a52df71e03c3d28ac82f76c1a3
Author: Max Gekk
AuthorDate: Fri Oct 16 10:28:15 2020 +0900

    [SPARK-33163][SQL][TESTS] Check the metadata key 'org.apache.spark.legacyDateTime' in Avro/Parquet files

    ### What changes were proposed in this pull request?
    Added a couple of tests to `AvroSuite` and `ParquetIOSuite` to check that the metadata key 'org.apache.spark.legacyDateTime' is written correctly depending on the SQL configs:
    - spark.sql.legacy.avro.datetimeRebaseModeInWrite
    - spark.sql.legacy.parquet.datetimeRebaseModeInWrite

    This is a follow-up of https://github.com/apache/spark/pull/28137.

    ### Why are the changes needed?
    1. To improve test coverage
    2. To make sure that the metadata key is actually saved to Avro/Parquet files

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    By running the added tests:
    ```
    $ build/sbt "testOnly org.apache.spark.sql.execution.datasources.parquet.ParquetIOSuite"
    $ build/sbt "avro/test:testOnly org.apache.spark.sql.avro.AvroV1Suite"
    $ build/sbt "avro/test:testOnly org.apache.spark.sql.avro.AvroV2Suite"
    ```

    Closes #30061 from MaxGekk/parquet-test-metakey.
[spark] branch master updated (81d3a8e -> 9f5eff0)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 81d3a8e [MINOR][PYTHON] Fix the typo in the docstring of method agg() add 9f5eff0 [SPARK-33162][INFRA] Use pre-built image at GitHub Action PySpark jobs No new revisions were added by this update. Summary of changes: .github/workflows/build_and_test.yml | 119 ++- 1 file changed, 89 insertions(+), 30 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (ba69d68 -> 81d3a8e)
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from ba69d68 [SPARK-33080][BUILD] Replace fatal warnings snippet add 81d3a8e [MINOR][PYTHON] Fix the typo in the docstring of method agg() No new revisions were added by this update. Summary of changes: python/pyspark/sql/dataframe.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] srowen closed pull request #295: Replace test-only to testOnly in Developer tools page
srowen closed pull request #295: URL: https://github.com/apache/spark-website/pull/295 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark-website] branch asf-site updated: Replace test-only to testOnly in Developer tools page
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new fe3e503  Replace test-only to testOnly in Developer tools page
fe3e503 is described below

commit fe3e5037d2eef83da136b9f8c66e7e2d6904d2d4
Author: HyukjinKwon
AuthorDate: Thu Oct 15 18:15:03 2020 -0500

    Replace test-only to testOnly in Developer tools page

    See also https://github.com/apache/spark/pull/30028. After SBT was upgraded to 1.3, `test-only` should be `testOnly`.

    Author: HyukjinKwon

    Closes #295 from HyukjinKwon/test-only-sbt-upgrade.
---
 developer-tools.md        | 2 +-
 site/developer-tools.html | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/developer-tools.md b/developer-tools.md
index 0078538..9d82a25 100644
--- a/developer-tools.md
+++ b/developer-tools.md
@@ -267,7 +267,7 @@ it's due to a classpath issue (some classes were probably not compiled). To fix
 sufficient to run a test from the command line:

 ```
-build/sbt "test-only org.apache.spark.rdd.SortingSuite"
+build/sbt "testOnly org.apache.spark.rdd.SortingSuite"
 ```

 Running Different Test Permutations on Jenkins

diff --git a/site/developer-tools.html b/site/developer-tools.html
index 86918d8..b9ecb5e 100644
--- a/site/developer-tools.html
+++ b/site/developer-tools.html
@@ -447,7 +447,7 @@ java.lang.NullPointerException
 its due to a classpath issue (some classes were probably not compiled). To fix this, it
 sufficient to run a test from the command line:

-build/sbt "test-only org.apache.spark.rdd.SortingSuite"
+build/sbt "testOnly org.apache.spark.rdd.SortingSuite"

 Running Different Test Permutations on Jenkins

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-30894][SQL][2.4] Make Size's nullable independent from SQL config changes
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new 4353f7d  [SPARK-30894][SQL][2.4] Make Size's nullable independent from SQL config changes
4353f7d is described below

commit 4353f7d961aba7f1f65066245215b08817663701
Author: Maxim Gekk
AuthorDate: Thu Oct 15 14:00:38 2020 -0700

    [SPARK-30894][SQL][2.4] Make Size's nullable independent from SQL config changes

    This is a backport of https://github.com/apache/spark/pull/27658

    ### What changes were proposed in this pull request?
    In the PR, I propose to add the `legacySizeOfNull` parameter to the `Size` expression, and pass the value of `spark.sql.legacy.sizeOfNull` if `legacySizeOfNull` is not provided on creation of `Size`.

    ### Why are the changes needed?
    This avoids the issue where the configuration changes between different phases of planning, which can silently break a query plan and lead to crashes or data corruption.

    ### Does this PR introduce _any_ user-facing change?
    No

    ### How was this patch tested?
    By `CollectionExpressionsSuite`.

    Closes #30058 from anuragmantri/SPARK-30894-2.4.

Authored-by: Maxim Gekk
Signed-off-by: Dongjoon Hyun
---
 .../spark/sql/catalyst/expressions/collectionOperations.scala | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
index 6d74f45..c8bc1e7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
@@ -89,9 +89,10 @@ trait BinaryArrayExpressionWithImplicitCast extends BinaryExpression
       > SELECT _FUNC_(NULL);
        -1
   """)
-case class Size(child: Expression) extends UnaryExpression with ExpectsInputTypes {
+case class Size(child: Expression, legacySizeOfNull: Boolean)
+  extends UnaryExpression with ExpectsInputTypes {

-  val legacySizeOfNull = SQLConf.get.legacySizeOfNull
+  def this(child: Expression) = this(child, SQLConf.get.legacySizeOfNull)

   override def dataType: DataType = IntegerType
   override def inputTypes: Seq[AbstractDataType] = Seq(TypeCollection(ArrayType, MapType))
@@ -123,6 +124,10 @@ case class Size(child: Expression) extends UnaryExpression with ExpectsInputTypes {
   }
 }

+object Size {
+  def apply(child: Expression): Size = new Size(child)
+}
+
 /**
 * Returns an unordered array containing the keys of the map.
 */

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
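The fix pins the value of `spark.sql.legacy.sizeOfNull` at the moment the `Size` expression is constructed, instead of re-reading the config during later planning phases. A minimal, Spark-free sketch of that capture-at-construction pattern (the `Conf` object and the simplified `Size` below are illustrative stand-ins, not Spark's actual classes):

```scala
// Stand-in for SQLConf: a mutable, globally visible setting.
object Conf { @volatile var legacySizeOfNull: Boolean = true }

// The flag is a constructor parameter, so its value is frozen when the
// expression is created; later config changes cannot alter its behavior.
case class Size(legacySizeOfNull: Boolean) {
  def eval(arr: Seq[Int]): Int =
    if (arr == null) {
      if (legacySizeOfNull) -1 else throw new NullPointerException("size(null)")
    } else arr.length
}

object Size {
  // Auxiliary constructor reads the config exactly once, at creation time.
  def apply(): Size = Size(Conf.legacySizeOfNull)
}
```

Before the patch, the field was initialized from `SQLConf.get` inside the class body, so a `Size` instance copied or re-created during planning could observe a different config value than the one in effect when the query was analyzed.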
[spark] branch master updated (9e37464 -> ba69d68)
This is an automated email from the ASF dual-hosted git repository. srowen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9e37464 [SPARK-33078][SQL] Add config for json expression optimization add ba69d68 [SPARK-33080][BUILD] Replace fatal warnings snippet No new revisions were added by this update. Summary of changes: .../shuffle/HostLocalShuffleReadingSuite.scala | 1 + .../apache/spark/storage/BlockManagerSuite.scala | 4 +- project/SparkBuild.scala | 84 -- .../sql/catalyst/optimizer/OptimizerSuite.scala| 2 +- .../spark/sql/catalyst/util/UnsafeArraySuite.scala | 3 +- .../apache/spark/sql/connector/InMemoryTable.scala | 8 +++ .../spark/sql/streaming/StreamingQuerySuite.scala | 2 +- .../spark/sql/hive/thriftserver/CliSuite.scala | 6 +- 8 files changed, 62 insertions(+), 48 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (82eea13 -> 9e37464)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 82eea13  [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
  add 9e37464  [SPARK-33078][SQL] Add config for json expression optimization

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/optimizer/OptimizeJsonExprs.scala  |  3 ++-
 .../org/apache/spark/sql/internal/SQLConf.scala     | 11 +++
 .../catalyst/optimizer/OptimizeJsonExprsSuite.scala | 21 +
 3 files changed, 34 insertions(+), 1 deletion(-)
[spark] branch master updated: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
This is an automated email from the ASF dual-hosted git repository.

mridulm80 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 82eea13  [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
82eea13 is described below

commit 82eea13c7686fb4bfbe8fb4185db81438d2ea884
Author: Min Shen
AuthorDate: Thu Oct 15 12:34:52 2020 -0500

    [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

    ### What changes were proposed in this pull request?

    This is the first patch for SPIP SPARK-30602 for push-based shuffle. Summary of changes:

    * Introduce new API in ExternalBlockStoreClient to push blocks to a remote shuffle service.
    * Leveraging the streaming upload functionality in SPARK-6237, it also enables the ExternalBlockHandler to delegate the handling of block push requests to MergedShuffleFileManager.
    * Propose the API for MergedShuffleFileManager, where the core logic on the shuffle service side to handle block push requests is defined. The actual implementation of this API is deferred into a later RB to restrict the size of this PR.
    * Introduce OneForOneBlockPusher to enable pushing blocks to remote shuffle services in shuffle RPC layer.
    * New protocols in shuffle RPC layer to support the functionalities.

    ### Why are the changes needed?

    Refer to the SPIP in SPARK-30602.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Added unit tests. The reference PR with the consolidated changes covering the complete implementation is also provided in SPARK-30602. We have already verified the functionality and the improved performance as documented in the SPIP doc.

    Lead-authored-by: Min Shen
    Co-authored-by: Chandni Singh
    Co-authored-by: Ye Zhou
    Closes #29855 from Victsm/SPARK-32915.

Lead-authored-by: Min Shen
Co-authored-by: Chandni Singh
Co-authored-by: Ye Zhou
Co-authored-by: Chandni Singh
Co-authored-by: Min Shen
Signed-off-by: Mridul Muralidharan gmail.com>
---
 common/network-common/pom.xml                       |   4 +
 .../apache/spark/network/protocol/Encoders.java     |  63
 common/network-shuffle/pom.xml                      |   9 ++
 .../spark/network/shuffle/BlockStoreClient.java     |  21 +++
 .../apache/spark/network/shuffle/ErrorHandler.java  |  85 +++
 .../network/shuffle/ExternalBlockHandler.java       | 104 +-
 .../network/shuffle/ExternalBlockStoreClient.java   |  52 ++-
 .../spark/network/shuffle/MergedBlockMeta.java      |  64 +
 .../network/shuffle/MergedShuffleFileManager.java   | 116 +++
 .../network/shuffle/OneForOneBlockPusher.java       | 123
 .../network/shuffle/RetryingBlockFetcher.java       |  27 +++-
 .../shuffle/protocol/BlockTransferMessage.java      |   6 +-
 .../shuffle/protocol/FinalizeShuffleMerge.java      |  84 +++
 .../network/shuffle/protocol/MergeStatuses.java     | 118 +++
 .../network/shuffle/protocol/PushBlockStream.java   |  95
 .../spark/network/shuffle/ErrorHandlerSuite.java    |  51 +++
 .../network/shuffle/ExternalBlockHandlerSuite.java  |  40 +-
 .../network/shuffle/OneForOneBlockPusherSuite.java  | 159 +
 .../ExternalShuffleServiceMetricsSuite.scala        |   3 +-
 .../yarn/YarnShuffleServiceMetricsSuite.scala       |   2 +-
 .../network/yarn/YarnShuffleServiceSuite.scala      |   1 +
 21 files changed, 1212 insertions(+), 15 deletions(-)

diff --git a/common/network-common/pom.xml b/common/network-common/pom.xml
index 9d5bc9a..d328a7d 100644
--- a/common/network-common/pom.xml
+++ b/common/network-common/pom.xml
@@ -91,6 +91,10 @@
       org.apache.commons
       commons-crypto
+
+      org.roaringbitmap
+      RoaringBitmap
+
diff --git a/common/network-common/src/main/java/org/apache/spark/network/protocol/Encoders.java b/common/network-common/src/main/java/org/apache/spark/network/protocol/Encoders.java
index 490915f..4fa191b 100644
--- a/common/network-common/src/main/java/org/apache/spark/network/protocol/Encoders.java
+++ b/common/network-common/src/main/java/org/apache/spark/network/protocol/Encoders.java
@@ -17,9 +17,11 @@
 package org.apache.spark.network.protocol;

+import java.io.IOException;
 import java.nio.charset.StandardCharsets;

 import io.netty.buffer.ByteBuf;
+import org.roaringbitmap.RoaringBitmap;

 /** Provides a canonical set of Encoders for simple types. */
 public class Encoders {
@@ -44,6 +46,40 @@ public class Encoders {
     }
   }

+  /** Bitmaps are encoded with their serialization length followed by the serialization bytes. */
+  public static class Bitmaps {
+    public static int
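The diff above is truncated just as it introduces the `Encoders.Bitmaps` helper, whose Javadoc states the scheme: a bitmap is encoded as its serialization length followed by its serialization bytes. As a rough, illustrative sketch of that length-prefixed pattern only — not Spark's actual implementation, which serializes a `RoaringBitmap` into a Netty `ByteBuf` — here is a self-contained version using `java.util.BitSet` and `java.nio.ByteBuffer` as stand-ins, with a hypothetical class name `BitmapCodec`:

```java
import java.nio.ByteBuffer;
import java.util.BitSet;

// Illustrative sketch of a length-prefixed bitmap codec, in the spirit of the
// Encoders.Bitmaps class added by the commit above. The real Spark code uses
// org.roaringbitmap.RoaringBitmap and io.netty.buffer.ByteBuf; BitSet and
// ByteBuffer are used here only so the example is self-contained.
final class BitmapCodec {

  // Bytes the encoded form occupies: a 4-byte length prefix plus the
  // bitmap's serialized bytes.
  static int encodedLength(BitSet bitmap) {
    return Integer.BYTES + bitmap.toByteArray().length;
  }

  // Write the serialization length, then the serialization bytes.
  static void encode(ByteBuffer buf, BitSet bitmap) {
    byte[] bytes = bitmap.toByteArray();
    buf.putInt(bytes.length);
    buf.put(bytes);
  }

  // Read the length prefix, then rebuild the bitmap from exactly that many
  // bytes, leaving any following data in the buffer untouched.
  static BitSet decode(ByteBuffer buf) {
    int length = buf.getInt();
    byte[] bytes = new byte[length];
    buf.get(bytes);
    return BitSet.valueOf(bytes);
  }
}
```

A round trip through `encode` and `decode` yields an equal bitmap, and the length prefix lets a reader know where the bitmap ends when several values are packed into one buffer — which is what makes the scheme usable inside a larger RPC message.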
[spark] branch master updated (31f7097 -> b089fe5)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 31f7097  [SPARK-32402][SQL][FOLLOW-UP] Use quoted column name for JDBCTableCatalog.alterTable
  add b089fe5  [SPARK-32247][INFRA] Install and test scipy with PyPy in GitHub Actions

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_and_test.yml | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
[spark] branch master updated (513b6f5 -> 31f7097)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 513b6f5  [SPARK-33079][TESTS] Replace the existing Maven job for Scala 2.13 in Github Actions with SBT job
  add 31f7097  [SPARK-32402][SQL][FOLLOW-UP] Use quoted column name for JDBCTableCatalog.alterTable

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/jdbc/DB2Dialect.scala     |  5 +++--
 .../org/apache/spark/sql/jdbc/JdbcDialects.scala   | 25 -
 .../org/apache/spark/sql/jdbc/OracleDialect.scala  | 11 +
 .../v2/jdbc/JDBCTableCatalogSuite.scala            | 26 +-
 4 files changed, 39 insertions(+), 28 deletions(-)
svn commit: r41940 - /release/spark/KEYS
Author: srowen
Date: Thu Oct 15 13:18:43 2020
New Revision: 41940

Log:
Add missing key for Ruifeng to Spark KEYS

Modified:
    release/spark/KEYS

Modified: release/spark/KEYS
==============================================================================
--- release/spark/KEYS (original)
+++ release/spark/KEYS Thu Oct 15 13:18:43 2020
@@ -1413,3 +1413,60 @@
 Hy4V/RJiJHCHekSXHCNoxgJz8Jc=
 =+90F
 -----END PGP PUBLIC KEY BLOCK-----
+pub   rsa4096 2020-08-05 [SC]
+      5146FBDC4B90744EA948035795E0EE38CF98F9F4
+uid           [ultimate] Ruifeng Zheng (CODE SIGNING KEY)
+sub   rsa4096 2020-08-05 [E]
+
+-----BEGIN PGP PUBLIC KEY BLOCK-----
+
+mQINBF8qcTwBEADNwwXl2aEihlTGLo4uH4CHyF0Et2qJa0widBEj+LkQg1Alsxml
+Eqh/yea5QJObPmtfvIH8qgtUhOUUANH6+GY7XTtTrd4SU2jYupns1Z7HuTHx75IX
+oi2i2kzffWXPS4LMe9b7QjceHWsAIqKpmG2/tY1Wm9m0emwfa+qDNZaKQFAP+tnp
+24CVGUiNQbUyxDDUlpKHszB2Kw+pj/pFsNqAv30x2QweIVfGTYZAhzgzybR3Oid6
+8Bf1BbkWF9UH5at0Y2+Q9dvhMewRxgbW9jonA9OMy4EBfRqRzauYcjz0F7Pzy+Lk
+fd1/9SE4eFIGVts2XTT//AK0IUwoAdjmOT+aq9x1qSqxzrHqgIj5pssn7sPheUAB
+67a0oiM7r92a/URvskU4csI1LxWJz2oqTeRa1K7cmvw/4nxHqkNCizbXhVWNLiGH
+VC3tZZdgHliMCehCKmFFw9/r0F+XM0cJesUhhbfVL0rPLUaA7tZ5zefKaeDUpUDt
+JB/XFv5am02yInlT+n4Er6fxW9Pp0bEYgBVZY3Agr11VxcKFGhS3eb4iDl+obFN9
+UnuG7Vkm7l8j5NWPdkuzMzLG1+wdUbz9EcHhzt3NLutyo0nzt3uZiZjQONagIwhV
+5SvdTG6eS6QWxKPbgGETmqGaEqKMXbumXTnqgEHm82w2P4J9OU72X+rkPQARAQAB
+tDZSdWlmZW5nIFpoZW5nIChDT0RFIFNJR05JTkcgS0VZKSA8cnVpZmVuZ3pAYXBh
+Y2hlLm9yZz6JAk4EEwEKADgWIQRRRvvcS5B0TqlIA1eV4O44z5j59AUCXypxPAIb
+AwULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAKCRCV4O44z5j59P1rD/4mkpvICxd4
+tg7r5zgaVtQIaBwgjK9OnsStAiWkpe/PzG3Q0aDNGBO8vuwhI6LHhgU9fea3Mw0N
+tpTFB00qwagKckXTAX9hj2EVcjH6KxUEoDlGyEZHLsUgizzGLy8laF2XaHn/Bs8D
+fl41iF+fvl/XYD8y8f5F6eIWaJROx73Bjk22fWhndPJgtO4HeaL5/JOMdUvU12AE
+Ipk22YBm416rDYixJucoGLlGfRuxMAImlaPgM18NAb25biU8Rd15+c3HgDtVBrTI
+0C3XljKcio1cVAY1MyrcC0mKaTLIhsngD+DsjDItWzp8BYg3kHPFfh/8AMDNA960
+3ACcq436UdoqPzHqA/B6dRgw1M3F+dSlX24DzYZ3qz/sn2d2HmdkMO9+4epnk7lz
+gxwz14F0mTPKiH/rx4dXo4A/D/KurFA8Ed1Div4azDwlKkk5au0C8KrjJstEy27u
+5x41GtY5XoyI+lGGydMC6yrvoDPLxGLZaOIUgkN6hkz/BrkTZ/oEFybx4XxLkZg6
+gQVQTrtqsXZXEL5IEMD8mCP5TYrrTFRwBQNW6ngR7L7kYGb0ksB5TwIu0ZntRZIY
+XgVXMbBCM3ehAWdXR0oj25gtkLzRCZSAkPKK1uMaEbksRrb5uuAnX/F8LxAeunQM
+P2jbZ3ydT2pMPi8X1TYWCYa+56TaxjCzAbkCDQRfKnE8ARAAtG+2ME5GIjWPofPR
+KZkhlMnjbwYL6bVcy2vUmfzuM/sM2SjP8W3x/yPZA+HHfe7+FRaeBzcOhCBuYTKF
+K7F+fw1woljDOU1atVtBJu0MH7r47my/MPtuRg0bltT3AE3qJoAQZeDEefJvCcfZ
+TPmZN1jETjjPRe045zkhk9tFt1ZB7d8wk+yo3PWwp0iX2p9LkyiCLvYFBqs0McLW
+wQI4fgmeA5fiyMpJZJohZjR170Qbyk+QQ3Jri8EWeZvwJEfAPVxVMt1DOxPBv3PI
+2AfYM0V8brEVF/2N/Lorpt3LcN+mAhJfASy4RimvE08gj5nJn3+aA98B3uPCZ6AN
+IEOYIZPNWseYCWCqDHbiFFqaRIxnLfxgTygJzw8lvBAoBr15ZG5e6Xe4JRAn3Cvu
+frkMs4xlnqhFR1tzNezWLn/j7+dOVHzSiPTiKGAjwEiLvusaxNhkVKqrDu3QoPFu
+ogvtfyeSPVYcsP6F5IJ2LQzT5Cq8h+H1/+7/tQrhSWd/KAzRw5+rePuoecbaodfr
+VaG9sqSMe/GlCBuhqGG4Y3mFaHnemgZaCj4jm0wvjyPo1ik5V9j4TU6nKPEEOXX3
+x4mHHflEOWslHeT9xX2aG5dnh7bHQnJLbbNbEilJxXtKeeuA/iOyPq6+lHWVDJYf
+cDuYdAKr2Gzjffg3pfmN2zlOla8AEQEAAYkCNgQYAQoAIBYhBFFG+9xLkHROqUgD
+V5Xg7jjPmPn0BQJfKnE8AhsMAAoJEJXg7jjPmPn0N0UQAIZKhyKBnad4A791bx+4
+iHU/zglxq73nUfRoIy1pxt7Sa7YTSG3029Mj6fsHCr5tCHmcSS8leF28CAz8Qs8S
+UHf/i+aDk6wDk20V80jUYa6DkuUaolf2GxGBW3dwJKufq/L2lgPhN0R2MIL2gQM+
+M5EB+tpD+69laGrMVFqztSPcFpJjysnDKDiu5EFVD74zU8F9jn3kDD50DTx3LvrD
+JD/X5y6TaxUw1TAjdUgrkG/PARxJu3za4anHMiMfHah6Y6dz7ROtCKFMjWH25y28
+O8TMJnVUZdp6uLu3PzWjit9bwB7UuVVlBUQX9piMr/A5WtucpucLGwn7G0ejuJyE
+3Bq502QehItW6Ft0nlI8HGoecHXLQK3HUpLSf3BkBlXNz165iImG/RAgZUucbhHb
+u2Bmj4c9bQuZucQ4j3dUsXc3y4M8V14d5V1MXceWZ0sGkUcXEzJQnQcy98yn5b9K
+71zAI0i5UmtKXU/Xjss+WAfInBzpyq0bk9f9pur9UP7/2visiHQw70AfrSutXWiU
+HzpIypF5A8FUA+gcNsUUPkbm4JeTTxTxb0AEb6iBC5eYmDdehhcMeYnNnE/STejM
+5hUDBpGDAkbw0Wgolr/Qpxfxlkzstz8XSy2U6BVxkan1Oji889sTamWhHzLf7Ofo
+eGh3VPV1RM3YCRkGY7/1fheg
+=/4cF
+-----END PGP PUBLIC KEY BLOCK-----
[spark] branch master updated (e85ed8a -> 513b6f5)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from e85ed8a  [SPARK-33156][INFRA] Upgrade GithubAction image from 18.04 to 20.04
  add 513b6f5  [SPARK-33079][TESTS] Replace the existing Maven job for Scala 2.13 in Github Actions with SBT job

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_and_test.yml                    | 16 ++--
 .../spark/streaming/kinesis/KinesisBackedBlockRDD.scala |  2 +-
 2 files changed, 7 insertions(+), 11 deletions(-)
[GitHub] [spark-website] ScrapCodes commented on pull request #295: Replace test-only to testOnly in Developer tools page
ScrapCodes commented on pull request #295:
URL: https://github.com/apache/spark-website/pull/295#issuecomment-709241862

   Thanks, @HyukjinKwon. I should have done it.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [spark-website] HyukjinKwon commented on pull request #295: Replace test-only to testOnly in Developer tools page
HyukjinKwon commented on pull request #295:
URL: https://github.com/apache/spark-website/pull/295#issuecomment-709235613

   cc @ScrapCodes
[GitHub] [spark-website] HyukjinKwon opened a new pull request #295: Replace test-only to testOnly in Developer tools page
HyukjinKwon opened a new pull request #295:
URL: https://github.com/apache/spark-website/pull/295

   See also https://github.com/apache/spark/pull/30028. After SBT was upgraded to 1.3, `test-only` should be `testOnly`.
[spark] branch master updated (8e7c390 -> e85ed8a)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 8e7c390  [SPARK-33155][K8S] spark.kubernetes.pyspark.pythonVersion allows only '3'
  add e85ed8a  [SPARK-33156][INFRA] Upgrade GithubAction image from 18.04 to 20.04

No new revisions were added by this update.

Summary of changes:
 .github/workflows/build_and_test.yml | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)
[spark] branch master updated (77a8efb -> 8e7c390)

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 77a8efb  [SPARK-32932][SQL] Do not use local shuffle reader at final stage on write command
     add 8e7c390  [SPARK-33155][K8S] spark.kubernetes.pyspark.pythonVersion allows only '3'

No new revisions were added by this update.

Summary of changes:
 docs/running-on-kubernetes.md                                      | 2 +-
 .../core/src/main/scala/org/apache/spark/deploy/k8s/Config.scala   | 6 +++---
 .../spark/deploy/k8s/features/DriverCommandFeatureStepSuite.scala  | 4 +---
 .../kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh     | 7 +--
 .../spark/deploy/k8s/integrationtest/DecommissionSuite.scala       | 1 -
 .../apache/spark/deploy/k8s/integrationtest/PythonTestsSuite.scala | 4 +---
 resource-managers/kubernetes/integration-tests/tests/pyfiles.py    | 2 +-
 7 files changed, 8 insertions(+), 18 deletions(-)
[spark] branch master updated (ec34a00 -> 77a8efb)

This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from ec34a00  [SPARK-33153][SQL][TESTS] Ignore Spark 2.4 in HiveExternalCatalogVersionsSuite on Python 3.8/3.9
     add 77a8efb  [SPARK-32932][SQL] Do not use local shuffle reader at final stage on write command

No new revisions were added by this update.

Summary of changes:
 .../execution/adaptive/AdaptiveSparkPlanExec.scala | 14 +-
 .../adaptive/AdaptiveQueryExecSuite.scala          | 51 +-
 2 files changed, 63 insertions(+), 2 deletions(-)