[spark] branch branch-3.0 updated: [SPARK-34696][SQL][TESTS] Fix CodegenInterpretedPlanTest to generate correct test cases

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new da78e90  [SPARK-34696][SQL][TESTS] Fix CodegenInterpretedPlanTest to 
generate correct test cases
da78e90 is described below

commit da78e9093522e2416ef5bbb616f7cc315be48323
Author: Dongjoon Hyun 
AuthorDate: Wed Mar 10 23:41:49 2021 -0800

[SPARK-34696][SQL][TESTS] Fix CodegenInterpretedPlanTest to generate 
correct test cases

### What changes were proposed in this pull request?

SPARK-23596 added `CodegenInterpretedPlanTest` in Apache Spark 2.4.0 in a
wrong way: `withSQLConf` depends on `SQLConf.get` at execution time, not at
`test` function declaration time. So, the following code executes the test
twice without actually controlling the `CodegenObjectFactoryMode`. This PR
aims to fix that and to introduce a new function, `testFallback`.

```scala
trait CodegenInterpretedPlanTest extends PlanTest {

  override protected def test(
      testName: String,
      testTags: Tag*)(testFun: => Any)(implicit pos: source.Position): Unit = {
    val codegenMode = CodegenObjectFactoryMode.CODEGEN_ONLY.toString
    val interpretedMode = CodegenObjectFactoryMode.NO_CODEGEN.toString

    withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenMode) {
      super.test(testName + " (codegen path)", testTags: _*)(testFun)(pos)
    }
    withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> interpretedMode) {
      super.test(testName + " (interpreted path)", testTags: _*)(testFun)(pos)
    }
  }
}
```

### Why are the changes needed?

1. We need to use it like the following:
```scala
super.test(testName + " (codegen path)", testTags: _*)(
  withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenMode) { testFun })(pos)
super.test(testName + " (interpreted path)", testTags: _*)(
  withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> interpretedMode) { testFun })(pos)
```

2. After we fix this behavior with the above code, several test cases,
including those for SPARK-34596 and SPARK-34607, fail because they do not
work in both `CODEGEN` and `INTERPRETED` modes; they only work in `FALLBACK`
mode. So, inevitably, we need to introduce `testFallback`, as sketched below.
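
For reference, a minimal sketch of what such a `testFallback` could look like
inside `CodegenInterpretedPlanTest`, mirroring the fixed pattern above but
pinning `CodegenObjectFactoryMode.FALLBACK`; the exact signature in the patch
may differ:

```scala
// Hedged sketch: register a single test that always runs with the
// FALLBACK codegen factory mode (try codegen, fall back to interpreted).
protected def testFallback(
    testName: String,
    testTags: Tag*)(testFun: => Any)(implicit pos: source.Position): Unit = {
  val fallbackMode = CodegenObjectFactoryMode.FALLBACK.toString
  super.test(testName + " (fallback path)", testTags: _*)(
    withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> fallbackMode) { testFun })(pos)
}
```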

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #31766 from dongjoon-hyun/SPARK-34596-SPARK-34607.

Lead-authored-by: Dongjoon Hyun 
Co-authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 5c4d8f95385ac97a66e5b491b5883ec770ae85bd)
Signed-off-by: Dongjoon Hyun 
---
 .../catalyst/encoders/ExpressionEncoderSuite.scala | 43 ++
 .../apache/spark/sql/catalyst/plans/PlanTest.scala | 18 ++---
 2 files changed, 40 insertions(+), 21 deletions(-)

diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
index d8dc9d3..ad46a20 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
@@ -161,15 +161,16 @@ class ExpressionEncoderSuite extends 
CodegenInterpretedPlanTest with AnalysisTes
   encodeDecodeTest(Seq(Seq("abc", "xyz"), Seq[String](null), null, Seq("1", 
null, "2")),
 "seq of seq of string")
 
-  encodeDecodeTest(Array(31, -123, 4), "array of int")
+  encodeDecodeTest(Array(31, -123, 4), "array of int", useFallback = true)
   encodeDecodeTest(Array("abc", "xyz"), "array of string")
   encodeDecodeTest(Array("a", null, "x"), "array of string with null")
-  encodeDecodeTest(Array.empty[Int], "empty array of int")
+  encodeDecodeTest(Array.empty[Int], "empty array of int", useFallback = true)
   encodeDecodeTest(Array.empty[String], "empty array of string")
 
-  encodeDecodeTest(Array(Array(31, -123), null, Array(4, 67)), "array of array 
of int")
+  encodeDecodeTest(Array(Array(31, -123), null, Array(4, 67)), "array of array 
of int",
+useFallback = true)
   encodeDecodeTest(Array(Array("abc", "xyz"), Array[String](null), null, 
Array("1", null, "2")),
-"array of array of string")
+"array of array of string", useFallback = true)
 
   encodeDecodeTest(Map(1 -> "a", 2 -> "b"), "map")
   encodeDecodeTest(Map(1 -> "a", 2 -> null), "map with null")
@@ -195,8 +196,9 @@ class ExpressionEncoderSuite extends 
CodegenInterpretedPlanTest with AnalysisTes
 encoderFor(Encoders.javaSerialization[JavaSerializable]))
 
   // test product encoders
-  private def productTest[T <: Product : ExpressionEncoder](input: T): Unit = {
-encodeDecodeTest(input, input.getClass.getSimpleName)
+  private def productTest[T <: Product : ExpressionEncoder](
+  input: T, useFallback: Boolean = 

[spark] branch branch-3.1 updated: [SPARK-34696][SQL][TESTS] Fix CodegenInterpretedPlanTest to generate correct test cases

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 2f88e77  [SPARK-34696][SQL][TESTS] Fix CodegenInterpretedPlanTest to 
generate correct test cases
2f88e77 is described below

commit 2f88e77031607d1a5d0cdda93ccc706ad45c78b5
Author: Dongjoon Hyun 
AuthorDate: Wed Mar 10 23:41:49 2021 -0800

[SPARK-34696][SQL][TESTS] Fix CodegenInterpretedPlanTest to generate 
correct test cases

### What changes were proposed in this pull request?

SPARK-23596 added `CodegenInterpretedPlanTest` in Apache Spark 2.4.0 in a
wrong way: `withSQLConf` depends on `SQLConf.get` at execution time, not at
`test` function declaration time. So, the following code executes the test
twice without actually controlling the `CodegenObjectFactoryMode`. This PR
aims to fix that and to introduce a new function, `testFallback`.

```scala
trait CodegenInterpretedPlanTest extends PlanTest {

  override protected def test(
      testName: String,
      testTags: Tag*)(testFun: => Any)(implicit pos: source.Position): Unit = {
    val codegenMode = CodegenObjectFactoryMode.CODEGEN_ONLY.toString
    val interpretedMode = CodegenObjectFactoryMode.NO_CODEGEN.toString

    withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenMode) {
      super.test(testName + " (codegen path)", testTags: _*)(testFun)(pos)
    }
    withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> interpretedMode) {
      super.test(testName + " (interpreted path)", testTags: _*)(testFun)(pos)
    }
  }
}
```

### Why are the changes needed?

1. We need to use it like the following:
```scala
super.test(testName + " (codegen path)", testTags: _*)(
  withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenMode) { testFun })(pos)
super.test(testName + " (interpreted path)", testTags: _*)(
  withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> interpretedMode) { testFun })(pos)
```

2. After we fix this behavior with the above code, several test cases,
including those for SPARK-34596 and SPARK-34607, fail because they do not
work in both `CODEGEN` and `INTERPRETED` modes; they only work in `FALLBACK`
mode. So, inevitably, we need to introduce `testFallback`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #31766 from dongjoon-hyun/SPARK-34596-SPARK-34607.

Lead-authored-by: Dongjoon Hyun 
Co-authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 5c4d8f95385ac97a66e5b491b5883ec770ae85bd)
Signed-off-by: Dongjoon Hyun 
---
 .../catalyst/encoders/ExpressionEncoderSuite.scala | 49 ++
 .../apache/spark/sql/catalyst/plans/PlanTest.scala | 18 +---
 2 files changed, 44 insertions(+), 23 deletions(-)

diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
index 095f6a9..6c2da4d3 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
@@ -161,15 +161,16 @@ class ExpressionEncoderSuite extends 
CodegenInterpretedPlanTest with AnalysisTes
   encodeDecodeTest(Seq(Seq("abc", "xyz"), Seq[String](null), null, Seq("1", 
null, "2")),
 "seq of seq of string")
 
-  encodeDecodeTest(Array(31, -123, 4), "array of int")
+  encodeDecodeTest(Array(31, -123, 4), "array of int", useFallback = true)
   encodeDecodeTest(Array("abc", "xyz"), "array of string")
   encodeDecodeTest(Array("a", null, "x"), "array of string with null")
-  encodeDecodeTest(Array.empty[Int], "empty array of int")
+  encodeDecodeTest(Array.empty[Int], "empty array of int", useFallback = true)
   encodeDecodeTest(Array.empty[String], "empty array of string")
 
-  encodeDecodeTest(Array(Array(31, -123), null, Array(4, 67)), "array of array 
of int")
+  encodeDecodeTest(Array(Array(31, -123), null, Array(4, 67)), "array of array 
of int",
+useFallback = true)
   encodeDecodeTest(Array(Array("abc", "xyz"), Array[String](null), null, 
Array("1", null, "2")),
-"array of array of string")
+"array of array of string", useFallback = true)
 
   encodeDecodeTest(Map(1 -> "a", 2 -> "b"), "map")
   encodeDecodeTest(Map(1 -> "a", 2 -> null), "map with null")
@@ -195,8 +196,9 @@ class ExpressionEncoderSuite extends 
CodegenInterpretedPlanTest with AnalysisTes
 encoderFor(Encoders.javaSerialization[JavaSerializable]))
 
   // test product encoders
-  private def productTest[T <: Product 

[spark] branch master updated: [SPARK-34696][SQL][TESTS] Fix CodegenInterpretedPlanTest to generate correct test cases

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
 new 5c4d8f9  [SPARK-34696][SQL][TESTS] Fix CodegenInterpretedPlanTest to 
generate correct test cases
5c4d8f9 is described below

commit 5c4d8f95385ac97a66e5b491b5883ec770ae85bd
Author: Dongjoon Hyun 
AuthorDate: Wed Mar 10 23:41:49 2021 -0800

[SPARK-34696][SQL][TESTS] Fix CodegenInterpretedPlanTest to generate 
correct test cases

### What changes were proposed in this pull request?

SPARK-23596 added `CodegenInterpretedPlanTest` in Apache Spark 2.4.0 in a
wrong way: `withSQLConf` depends on `SQLConf.get` at execution time, not at
`test` function declaration time. So, the following code executes the test
twice without actually controlling the `CodegenObjectFactoryMode`. This PR
aims to fix that and to introduce a new function, `testFallback`.

```scala
trait CodegenInterpretedPlanTest extends PlanTest {

  override protected def test(
      testName: String,
      testTags: Tag*)(testFun: => Any)(implicit pos: source.Position): Unit = {
    val codegenMode = CodegenObjectFactoryMode.CODEGEN_ONLY.toString
    val interpretedMode = CodegenObjectFactoryMode.NO_CODEGEN.toString

    withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenMode) {
      super.test(testName + " (codegen path)", testTags: _*)(testFun)(pos)
    }
    withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> interpretedMode) {
      super.test(testName + " (interpreted path)", testTags: _*)(testFun)(pos)
    }
  }
}
```

### Why are the changes needed?

1. We need to use it like the following:
```scala
super.test(testName + " (codegen path)", testTags: _*)(
  withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> codegenMode) { testFun })(pos)
super.test(testName + " (interpreted path)", testTags: _*)(
  withSQLConf(SQLConf.CODEGEN_FACTORY_MODE.key -> interpretedMode) { testFun })(pos)
```

2. After we fix this behavior with the above code, several test cases,
including those for SPARK-34596 and SPARK-34607, fail because they do not
work in both `CODEGEN` and `INTERPRETED` modes; they only work in `FALLBACK`
mode. So, inevitably, we need to introduce `testFallback`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #31766 from dongjoon-hyun/SPARK-34596-SPARK-34607.

Lead-authored-by: Dongjoon Hyun 
Co-authored-by: Dongjoon Hyun 
Signed-off-by: Dongjoon Hyun 
---
 .../catalyst/encoders/ExpressionEncoderSuite.scala | 49 ++
 .../apache/spark/sql/catalyst/plans/PlanTest.scala | 18 +---
 2 files changed, 44 insertions(+), 23 deletions(-)

diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
index 095f6a9..6c2da4d3 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala
@@ -161,15 +161,16 @@ class ExpressionEncoderSuite extends 
CodegenInterpretedPlanTest with AnalysisTes
   encodeDecodeTest(Seq(Seq("abc", "xyz"), Seq[String](null), null, Seq("1", 
null, "2")),
 "seq of seq of string")
 
-  encodeDecodeTest(Array(31, -123, 4), "array of int")
+  encodeDecodeTest(Array(31, -123, 4), "array of int", useFallback = true)
   encodeDecodeTest(Array("abc", "xyz"), "array of string")
   encodeDecodeTest(Array("a", null, "x"), "array of string with null")
-  encodeDecodeTest(Array.empty[Int], "empty array of int")
+  encodeDecodeTest(Array.empty[Int], "empty array of int", useFallback = true)
   encodeDecodeTest(Array.empty[String], "empty array of string")
 
-  encodeDecodeTest(Array(Array(31, -123), null, Array(4, 67)), "array of array 
of int")
+  encodeDecodeTest(Array(Array(31, -123), null, Array(4, 67)), "array of array 
of int",
+useFallback = true)
   encodeDecodeTest(Array(Array("abc", "xyz"), Array[String](null), null, 
Array("1", null, "2")),
-"array of array of string")
+"array of array of string", useFallback = true)
 
   encodeDecodeTest(Map(1 -> "a", 2 -> "b"), "map")
   encodeDecodeTest(Map(1 -> "a", 2 -> null), "map with null")
@@ -195,8 +196,9 @@ class ExpressionEncoderSuite extends 
CodegenInterpretedPlanTest with AnalysisTes
 encoderFor(Encoders.javaSerialization[JavaSerializable]))
 
   // test product encoders
-  private def productTest[T <: Product : ExpressionEncoder](input: T): Unit = {
-encodeDecodeTest(input, input.getClass.getSimpleName)
+  private def 

[spark] branch branch-3.1 updated: [MINOR][SQL] Remove unnecessary extend from BroadcastHashJoinExec

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 424fd2b  [MINOR][SQL] Remove unnecessary extend from 
BroadcastHashJoinExec
424fd2b is described below

commit 424fd2b6c61add08849b14e6b2a81737a8a013c5
Author: Cheng Su 
AuthorDate: Wed Mar 10 23:38:53 2021 -0800

[MINOR][SQL] Remove unnecessary extend from BroadcastHashJoinExec

### What changes were proposed in this pull request?

This is just a minor fix. `HashJoin` already extends `JoinCodegenSupport`, so
we don't need `CodegenSupport` here for `BroadcastHashJoinExec`. Submitted
separately as a PR per
https://github.com/apache/spark/pull/31802#discussion_r592066686 .
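
The redundancy, reduced to a hedged toy example (the trait names below only
mirror the real hierarchy; they are not the actual Spark definitions):

```scala
// Toy mirror of the inheritance chain: HashJoin -> JoinCodegenSupport -> CodegenSupport.
trait CodegenSupport
trait JoinCodegenSupport extends CodegenSupport
trait HashJoin extends JoinCodegenSupport

// Before: mixing CodegenSupport in again compiles, but is redundant.
class BroadcastHashJoinBefore extends HashJoin with CodegenSupport
// After: CodegenSupport is already inherited through HashJoin.
class BroadcastHashJoinAfter extends HashJoin
```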

### Why are the changes needed?

Clean up code.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing unit tests.

Closes #31805 from c21/bhj-minor.

Authored-by: Cheng Su 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 14ad7afa1aa0f3bfd75f1bf076a27af792721190)
Signed-off-by: Dongjoon Hyun 
---
 .../org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
index 2a9e158..cec1286 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
@@ -46,7 +46,7 @@ case class BroadcastHashJoinExec(
 left: SparkPlan,
 right: SparkPlan,
 isNullAwareAntiJoin: Boolean = false)
-  extends HashJoin with CodegenSupport {
+  extends HashJoin {
 
   if (isNullAwareAntiJoin) {
 require(leftKeys.length == 1, "leftKeys length should be 1")


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (9aa8f06 -> 14ad7af)

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 9aa8f06  [SPARK-34711][SQL][TESTS] Exercise code-gen enable/disable 
code paths for SHJ in join test suites
 add 14ad7af  [MINOR][SQL] Remove unnecessary extend from 
BroadcastHashJoinExec

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-34711][SQL][TESTS] Exercise code-gen enable/disable code paths for SHJ in join test suites

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new ff8b94e  [SPARK-34711][SQL][TESTS] Exercise code-gen enable/disable 
code paths for SHJ in join test suites
ff8b94e is described below

commit ff8b94e6655a71736870508d980a7d924b3149a7
Author: Cheng Su 
AuthorDate: Wed Mar 10 23:34:09 2021 -0800

[SPARK-34711][SQL][TESTS] Exercise code-gen enable/disable code paths for 
SHJ in join test suites

### What changes were proposed in this pull request?

Per the comment in
https://github.com/apache/spark/pull/31802#discussion_r592068440 , we would
like to exercise both the whole-stage code-gen enabled and disabled code paths
in the join unit test suites. This is for better test coverage of shuffled
hash join; a sketch of such a helper is shown below.
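
A hedged sketch of what a `testWithWholeStageCodegenOnAndOff` helper can look
like, assuming a `SQLTestUtils`-style `test`/`withSQLConf` environment (the
body below is illustrative, not necessarily the exact helper in the suite):

```scala
// Register each test twice, once per whole-stage code-gen setting, so both
// the generated and the interpreted code paths get exercised.
protected def testWithWholeStageCodegenOnAndOff(testName: String)
    (f: String => Unit): Unit = {
  Seq("false", "true").foreach { enabled =>
    val onOrOff = if (enabled == "true") "on" else "off"
    test(s"$testName (whole-stage-codegen $onOrOff)") {
      withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> enabled) {
        f(enabled)
      }
    }
  }
}
```

Note that `withSQLConf` here wraps the body inside `test`, so the
configuration applies at execution time; this is the same ordering that the
SPARK-34696 fix above establishes.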

### Why are the changes needed?

Better test coverage.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing and added unit tests here.

Closes #31806 from c21/test-minor.

Authored-by: Cheng Su 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit 9aa8f06313f53747fc30e7e6719e31693d2cd3f0)
Signed-off-by: Dongjoon Hyun 
---
 .../org/apache/spark/sql/execution/joins/ExistenceJoinSuite.scala | 2 +-
 .../scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala   | 4 ++--
 .../scala/org/apache/spark/sql/execution/joins/OuterJoinSuite.scala   | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/ExistenceJoinSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/ExistenceJoinSuite.scala
index fcbc0da..13848e5 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/ExistenceJoinSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/ExistenceJoinSuite.scala
@@ -103,7 +103,7 @@ class ExistenceJoinSuite extends SparkPlanTest with 
SharedSparkSession {
   ProjectExec(output, FilterExec(condition, join))
 }
 
-test(s"$testName using ShuffledHashJoin") {
+testWithWholeStageCodegenOnAndOff(s"$testName using ShuffledHashJoin") { _ 
=>
   extractJoinParts().foreach { case (_, leftKeys, rightKeys, 
boundCondition, _, _, _) =>
 withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> "1") {
   checkAnswer2(leftRows, rightRows, (left: SparkPlan, right: 
SparkPlan) =>
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala
index f476c15..cf05c6b 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala
@@ -153,7 +153,7 @@ class InnerJoinSuite extends SparkPlanTest with 
SharedSparkSession {
   }
 }
 
-test(s"$testName using ShuffledHashJoin (build=left)") {
+testWithWholeStageCodegenOnAndOff(s"$testName using ShuffledHashJoin 
(build=left)") { _ =>
   extractJoinParts().foreach { case (_, leftKeys, rightKeys, 
boundCondition, _, _, _) =>
 withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> "1") {
   checkAnswer2(leftRows, rightRows, (leftPlan: SparkPlan, rightPlan: 
SparkPlan) =>
@@ -165,7 +165,7 @@ class InnerJoinSuite extends SparkPlanTest with 
SharedSparkSession {
   }
 }
 
-test(s"$testName using ShuffledHashJoin (build=right)") {
+testWithWholeStageCodegenOnAndOff(s"$testName using ShuffledHashJoin 
(build=right)") { _ =>
   extractJoinParts().foreach { case (_, leftKeys, rightKeys, 
boundCondition, _, _, _) =>
 withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> "1") {
   checkAnswer2(leftRows, rightRows, (leftPlan: SparkPlan, rightPlan: 
SparkPlan) =>
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/OuterJoinSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/OuterJoinSuite.scala
index 238d37a..150d40d 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/OuterJoinSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/joins/OuterJoinSuite.scala
@@ -104,7 +104,7 @@ class OuterJoinSuite extends SparkPlanTest with 
SharedSparkSession {
   ExtractEquiJoinKeys.unapply(join)
 }
 
-test(s"$testName using ShuffledHashJoin") {
+testWithWholeStageCodegenOnAndOff(s"$testName using ShuffledHashJoin") { _ 
=>
   extractJoinParts().foreach { case (_, leftKeys, rightKeys, 
boundCondition, _, _, _) =>
 withSQLConf(SQLConf.SHUFFLE_PARTITIONS.key -> "1") {
   val buildSide = if (joinType == LeftOuter) BuildRight else BuildLeft



[spark] branch master updated (7226379 -> 9aa8f06)

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 7226379  [SPARK-34457][SQL] DataSource V2: Add default null ordering 
to SortDirection
 add 9aa8f06  [SPARK-34711][SQL][TESTS] Exercise code-gen enable/disable 
code paths for SHJ in join test suites

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/execution/joins/ExistenceJoinSuite.scala | 2 +-
 .../scala/org/apache/spark/sql/execution/joins/InnerJoinSuite.scala   | 4 ++--
 .../scala/org/apache/spark/sql/execution/joins/OuterJoinSuite.scala   | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (2a6e68e -> 7226379)

2021-03-10 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 2a6e68e  [SPARK-34546][SQL] AlterViewAs.query should be analyzed 
during the analysis phase, and AlterViewAs should invalidate the cache
 add 7226379  [SPARK-34457][SQL] DataSource V2: Add default null ordering 
to SortDirection

No new revisions were added by this update.

Summary of changes:
 .../sql/connector/expressions/Expressions.java  | 13 +
 .../sql/connector/expressions/SortDirection.java| 21 -
 2 files changed, 33 insertions(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (5e120e4 -> 2a6e68e)

2021-03-10 Thread wenchen
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 5e120e4  [SPARK-34507][BUILD] Update scala.version in parent POM when 
changing Scala version for cross-build
 add 2a6e68e  [SPARK-34546][SQL] AlterViewAs.query should be analyzed 
during the analysis phase, and AlterViewAs should invalidate the cache

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala |   2 +-
 .../analysis/UnsupportedOperationChecker.scala |   2 +-
 .../spark/sql/catalyst/catalog/interface.scala |  10 +-
 .../plans/logical/basicLogicalOperators.scala  |   8 +-
 .../sql/catalyst/plans/logical/v2Commands.scala|   2 +-
 .../spark/sql/catalyst/analysis/AnalysisTest.scala |   2 +-
 .../sql/catalyst/catalog/SessionCatalogSuite.scala |   2 +-
 .../catalyst/analysis/ResolveSessionCatalog.scala  |   2 +-
 .../apache/spark/sql/execution/command/views.scala | 177 -
 .../org/apache/spark/sql/CachedTableSuite.scala|  64 
 10 files changed, 183 insertions(+), 88 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-2.4 updated: [SPARK-34703][PYSPARK][2.4] Fix pyspark test when using sort_values on Pandas

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-2.4 by this push:
 new 906df15  [SPARK-34703][PYSPARK][2.4] Fix pyspark test when using 
sort_values on Pandas
906df15 is described below

commit 906df15f81c1e1c41a097f4230695da3a919227a
Author: Liang-Chi Hsieh 
AuthorDate: Wed Mar 10 18:42:11 2021 -0800

[SPARK-34703][PYSPARK][2.4] Fix pyspark test when using sort_values on 
Pandas

### What changes were proposed in this pull request?

This patch fixes a few PySpark test errors related to Pandas, in order to
restore the 2.4 Jenkins builds.

### Why are the changes needed?

Some Pandas APIs changed as of Pandas 0.24: if an index level and a column
label have the same name, `sort_values` throws an error.

Three PySpark tests currently fail in the Jenkins 2.4 build:
`test_column_order`, `test_complex_groupby`, and `test_udf_with_key`:

```
==
ERROR: test_column_order (pyspark.sql.tests.GroupedMapPandasUDFTests)
--
Traceback (most recent call last):
  File "/spark/python/pyspark/sql/tests.py", line 5996, in test_column_order
expected = pd_result.sort_values(['id', 'v']).reset_index(drop=True)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 
4711, in sort_values
for x in by]
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", 
line 1702, in _get_label_or_level_values
self._check_label_or_level_ambiguity(key, axis=axis)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", 
line 1656, in _check_label_or_level_ambiguity
raise ValueError(msg)
ValueError: 'id' is both an index level and a column label, which is 
ambiguous.

==
ERROR: test_complex_groupby (pyspark.sql.tests.GroupedMapPandasUDFTests)
--
Traceback (most recent call last):
  File "/spark/python/pyspark/sql/tests.py", line 5765, in 
test_complex_groupby
expected = expected.sort_values(['id', 'v']).reset_index(drop=True)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 
4711, in sort_values
for x in by]
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", 
line 1702, in _get_label_or_level_values
self._check_label_or_level_ambiguity(key, axis=axis)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", 
line 1656, in _check_label_or_level_ambiguity
raise ValueError(msg)
ValueError: 'id' is both an index level and a column label, which is 
ambiguous.

==
ERROR: test_udf_with_key (pyspark.sql.tests.GroupedMapPandasUDFTests)
--
Traceback (most recent call last):
  File "/spark/python/pyspark/sql/tests.py", line 5922, in test_udf_with_key
.sort_values(['id', 'v']).reset_index(drop=True)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 
4711, in sort_values
for x in by]
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", 
line 1702, in _get_label_or_level_values
self._check_label_or_level_ambiguity(key, axis=axis)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", 
line 1656, in _check_label_or_level_ambiguity
raise ValueError(msg)
ValueError: 'id' is both an index level and a column label, which is 
ambiguous.
```

### Does this PR introduce _any_ user-facing change?

No, dev only.

### How was this patch tested?

Verified by running the tests locally.

Closes #31803 from viirya/SPARK-34703.

Authored-by: Liang-Chi Hsieh 
Signed-off-by: Dongjoon Hyun 
---
 python/pyspark/sql/tests.py | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/python/pyspark/sql/tests.py b/python/pyspark/sql/tests.py
index 70f3882..e3b8e19 100644
--- a/python/pyspark/sql/tests.py
+++ b/python/pyspark/sql/tests.py
@@ -5761,7 +5761,7 @@ class GroupedMapPandasUDFTests(ReusedSQLTestCase):
 
 result = df.groupby(col('id') % 2 == 0).apply(normalize).sort('id', 
'v').toPandas()
 pdf = df.toPandas()
-expected = pdf.groupby(pdf['id'] % 2 == 0).apply(normalize.func)
+expected = pdf.groupby(pdf['id'] % 2 == 0, 
as_index=False).apply(normalize.func)
 expected = expected.sort_values(['id', 'v']).reset_index(drop=True)
 expected = 

[spark] branch branch-2.4 updated (191b24c -> 7985360)

2021-03-10 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.


from 191b24c  [SPARK-34672][BUILD][2.4] Fix docker file for creating release
 add 7985360  [SPARK-34507][BUILD] Update scala.version in parent POM when 
changing Scala version for cross-build

No new revisions were added by this update.

Summary of changes:
 dev/change-scala-version.sh | 6 ++
 1 file changed, 6 insertions(+)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-34507][BUILD] Update scala.version in parent POM when changing Scala version for cross-build

2021-03-10 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new e0aa67c  [SPARK-34507][BUILD] Update scala.version in parent POM when 
changing Scala version for cross-build
e0aa67c is described below

commit e0aa67c98424336bc6665142efbc892d02d65edb
Author: Sean Owen 
AuthorDate: Thu Mar 11 10:02:24 2021 +0900

[SPARK-34507][BUILD] Update scala.version in parent POM when changing Scala 
version for cross-build

### What changes were proposed in this pull request?

The `change-scala-version.sh` script updates Scala versions across the 
build for cross-build purposes. It manually changes `scala.binary.version` but 
not `scala.version`.

### Why are the changes needed?

It seems that this has always been an oversight, and the cross-built builds 
of Spark have an incorrect scala.version. See 2.4.5's 2.12 POM for example, 
which shows a Scala 2.11 version.
https://search.maven.org/artifact/org.apache.spark/spark-core_2.12/2.4.5/pom

More comments in the JIRA.

### Does this PR introduce _any_ user-facing change?

Should be a build-only bug fix.

### How was this patch tested?

Existing tests, but really N/A

Closes #31801 from srowen/SPARK-34507.

Authored-by: Sean Owen 
Signed-off-by: HyukjinKwon 
---
 dev/change-scala-version.sh | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/dev/change-scala-version.sh b/dev/change-scala-version.sh
index 06411b9..9cdc7d9 100755
--- a/dev/change-scala-version.sh
+++ b/dev/change-scala-version.sh
@@ -60,6 +60,12 @@ BASEDIR=$(dirname $0)/..
 find "$BASEDIR" -name 'pom.xml' -not -path '*target*' -print \
   -exec bash -c "sed_i 's/\(artifactId.*\)_'$FROM_VERSION'/\1_'$TO_VERSION'/g' 
{}" \;
 
+# Update <scala.version> in parent POM
+# First find the right full version from the profile's build
+SCALA_VERSION=`build/mvn help:evaluate -Pscala-${TO_VERSION} -Dexpression=scala.version -q -DforceStdout`
+sed_i '1,/<scala\.version>[0-9]*\.[0-9]*\.[0-9]*</s/<scala\.version>[0-9]*\.[0-9]*\.[0-9]*</<scala.version>'$SCALA_VERSION'</' "$BASEDIR/pom.xml"
+
+# Update <scala.binary.version> in parent POM
 # Match any scala binary version to ensure idempotency
 sed_i '1,/<scala\.binary\.version>[0-9]*\.[0-9]*</s/<scala\.binary\.version>[0-9]*\.[0-9]*</<scala.binary.version>'$TO_VERSION'</' "$BASEDIR/pom.xml"

[spark] branch branch-3.1 updated: [SPARK-34507][BUILD] Update scala.version in parent POM when changing Scala version for cross-build

2021-03-10 Thread srowen
This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 4b38935  [SPARK-34507][BUILD] Update scala.version in parent POM when 
changing Scala version for cross-build
4b38935 is described below

commit 4b3893519aa734e034847a4c3c11d4882c1a2c0d
Author: Sean Owen 
AuthorDate: Thu Mar 11 10:02:24 2021 +0900

[SPARK-34507][BUILD] Update scala.version in parent POM when changing Scala 
version for cross-build

### What changes were proposed in this pull request?

The `change-scala-version.sh` script updates Scala versions across the 
build for cross-build purposes. It manually changes `scala.binary.version` but 
not `scala.version`.

### Why are the changes needed?

It seems that this has always been an oversight, and the cross-built builds 
of Spark have an incorrect scala.version. See 2.4.5's 2.12 POM for example, 
which shows a Scala 2.11 version.
https://search.maven.org/artifact/org.apache.spark/spark-core_2.12/2.4.5/pom

More comments in the JIRA.

### Does this PR introduce _any_ user-facing change?

Should be a build-only bug fix.

### How was this patch tested?

Existing tests, but really N/A

Closes #31801 from srowen/SPARK-34507.

Authored-by: Sean Owen 
Signed-off-by: HyukjinKwon 
---
 dev/change-scala-version.sh | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/dev/change-scala-version.sh b/dev/change-scala-version.sh
index 06411b9..9cdc7d9 100755
--- a/dev/change-scala-version.sh
+++ b/dev/change-scala-version.sh
@@ -60,6 +60,12 @@ BASEDIR=$(dirname $0)/..
 find "$BASEDIR" -name 'pom.xml' -not -path '*target*' -print \
   -exec bash -c "sed_i 's/\(artifactId.*\)_'$FROM_VERSION'/\1_'$TO_VERSION'/g' 
{}" \;
 
+# Update <scala.version> in parent POM
+# First find the right full version from the profile's build
+SCALA_VERSION=`build/mvn help:evaluate -Pscala-${TO_VERSION} -Dexpression=scala.version -q -DforceStdout`
+sed_i '1,/<scala\.version>[0-9]*\.[0-9]*\.[0-9]*</s/<scala\.version>[0-9]*\.[0-9]*\.[0-9]*</<scala.version>'$SCALA_VERSION'</' "$BASEDIR/pom.xml"
+
+# Update <scala.binary.version> in parent POM
 # Match any scala binary version to ensure idempotency
 sed_i '1,/<scala\.binary\.version>[0-9]*\.[0-9]*</s/<scala\.binary\.version>[0-9]*\.[0-9]*</<scala.binary.version>'$TO_VERSION'</' "$BASEDIR/pom.xml"

[spark] branch master updated (fc182f7 -> 5e120e4)

2021-03-10 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from fc182f7  [SPARK-34682][SQL] Use PrivateMethodTester instead of 
reflection
 add 5e120e4  [SPARK-34507][BUILD] Update scala.version in parent POM when 
changing Scala version for cross-build

No new revisions were added by this update.

Summary of changes:
 dev/change-scala-version.sh | 6 ++
 1 file changed, 6 insertions(+)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-34682][SQL] Use PrivateMethodTester instead of reflection

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new f76f657  [SPARK-34682][SQL] Use PrivateMethodTester instead of 
reflection
f76f657 is described below

commit f76f657fd3335cba47e34bad680b71373b589ee3
Author: Andy Grove 
AuthorDate: Wed Mar 10 12:08:31 2021 -0800

[SPARK-34682][SQL] Use PrivateMethodTester instead of reflection

### Why are the changes needed?
SPARK-34682 was merged prematurely. This PR implements feedback from the 
review. I wasn't sure whether I should create a new JIRA or not.

### Does this PR introduce _any_ user-facing change?
No. Just improves the test.

### How was this patch tested?
Updated test.
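
For reference, a minimal self-contained sketch of the ScalaTest
`PrivateMethodTester` pattern the updated test switches to (`Greeter` and
`greet` here are made up for illustration):

```scala
import org.scalatest.PrivateMethodTester

class Greeter {
  private def greet(name: String): String = s"hello, $name"
}

object PrivateMethodDemo extends PrivateMethodTester {
  def main(args: Array[String]): Unit = {
    // Describe the private method by name and result type, then invoke it
    // reflectively. Unlike raw java.lang.reflect calls, exceptions thrown by
    // the target surface directly, not wrapped in InvocationTargetException.
    val greet = PrivateMethod[String](Symbol("greet"))
    println(new Greeter().invokePrivate(greet("spark"))) // prints "hello, spark"
  }
}
```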

Closes #31798 from andygrove/SPARK-34682-follow-up.

Authored-by: Andy Grove 
Signed-off-by: Dongjoon Hyun 
(cherry picked from commit fc182f7e7f9ff55a6a005044ae0968340cf6f30d)
Signed-off-by: Dongjoon Hyun 
---
 .../sql/execution/adaptive/AdaptiveQueryExecSuite.scala| 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
index cdd1901..f7570c0 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
@@ -18,10 +18,10 @@
 package org.apache.spark.sql.execution.adaptive
 
 import java.io.File
-import java.lang.reflect.InvocationTargetException
 import java.net.URI
 
 import org.apache.log4j.Level
+import org.scalatest.PrivateMethodTester
 
 import org.apache.spark.scheduler.{SparkListener, SparkListenerEvent, 
SparkListenerJobStart}
 import org.apache.spark.sql.{Dataset, QueryTest, Row, SparkSession, Strategy}
@@ -46,7 +46,8 @@ import org.apache.spark.util.Utils
 class AdaptiveQueryExecSuite
   extends QueryTest
   with SharedSparkSession
-  with AdaptiveSparkPlanHelper {
+  with AdaptiveSparkPlanHelper
+  with PrivateMethodTester {
 
   import testImplicits._
 
@@ -881,12 +882,11 @@ class AdaptiveQueryExecSuite
   val reader = readers.head
   val c = reader.canonicalized.asInstanceOf[CustomShuffleReaderExec]
   // we can't just call execute() because that has separate checks for 
canonicalized plans
-  val doExecute = c.getClass.getMethod("doExecute")
-  doExecute.setAccessible(true)
-  val ex = intercept[InvocationTargetException] {
-doExecute.invoke(c)
+  val ex = intercept[IllegalStateException] {
+val doExecute = PrivateMethod[Unit](Symbol("doExecute"))
+c.invokePrivate(doExecute())
   }
-  assert(ex.getCause.getMessage === "operating on canonicalized plan")
+  assert(ex.getMessage === "operating on canonicalized plan")
 }
   }
 


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (fd48438 -> fc182f7)

2021-03-10 Thread dongjoon
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from fd48438  [SPARK-34682][SQL] Fix regression in canonicalization error 
check in CustomShuffleReaderExec
 add fc182f7  [SPARK-34682][SQL] Use PrivateMethodTester instead of 
reflection

No new revisions were added by this update.

Summary of changes:
 .../sql/execution/adaptive/AdaptiveQueryExecSuite.scala| 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-34682][SQL] Fix regression in canonicalization error check in CustomShuffleReaderExec

2021-03-10 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new adac454  [SPARK-34682][SQL] Fix regression in canonicalization error 
check in CustomShuffleReaderExec
adac454 is described below

commit adac45400de87ceda91429c4ac857ab02b54e19d
Author: Andy Grove 
AuthorDate: Wed Mar 10 20:48:00 2021 +0900

[SPARK-34682][SQL] Fix regression in canonicalization error check in 
CustomShuffleReaderExec

### What changes were proposed in this pull request?
There is a regression in 3.1.1 compared to 3.0.2 in the canonicalized-plan
check that runs when executing CustomShuffleReaderExec.

The regression was caused by the call to `sendDriverMetrics`, which happens
before the check and always fails if the plan is canonicalized.

### Why are the changes needed?
This is a regression in a useful error check.

### Does this PR introduce _any_ user-facing change?
No. This is not an error that a user would typically see, as far as I know.

### How was this patch tested?
I tested this change locally by making a distribution from this PR branch. 
Before fixing the regression I saw:

```
java.util.NoSuchElementException: key not found: numPartitions
```

After fixing this regression I saw:

```
java.lang.IllegalStateException: operating on canonicalized plan
```

Closes #31793 from andygrove/SPARK-34682.

Lead-authored-by: Andy Grove 
Co-authored-by: Andy Grove 
Signed-off-by: HyukjinKwon 
(cherry picked from commit fd4843803c4670c656a94c1af652fb4b945bc82c)
Signed-off-by: HyukjinKwon 
---
 .../adaptive/CustomShuffleReaderExec.scala  | 12 ++--
 .../execution/adaptive/AdaptiveQueryExecSuite.scala | 21 +
 2 files changed, 27 insertions(+), 6 deletions(-)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala
index 49a4c25..2319c9e 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala
@@ -179,12 +179,12 @@ case class CustomShuffleReaderExec private(
   }
 
   private lazy val shuffleRDD: RDD[_] = {
-sendDriverMetrics()
-
-shuffleStage.map { stage =>
-  stage.shuffle.getShuffleRDD(partitionSpecs.toArray)
-}.getOrElse {
-  throw new IllegalStateException("operating on canonicalized plan")
+shuffleStage match {
+  case Some(stage) =>
+sendDriverMetrics()
+stage.shuffle.getShuffleRDD(partitionSpecs.toArray)
+  case _ =>
+throw new IllegalStateException("operating on canonicalized plan")
 }
   }
 
diff --git 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
index 92f7f40..cdd1901 100644
--- 
a/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
+++ 
b/sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala
@@ -18,6 +18,7 @@
 package org.apache.spark.sql.execution.adaptive
 
 import java.io.File
+import java.lang.reflect.InvocationTargetException
 import java.net.URI
 
 import org.apache.log4j.Level
@@ -869,6 +870,26 @@ class AdaptiveQueryExecSuite
 }
   }
 
+  test("SPARK-34682: CustomShuffleReaderExec operating on canonicalized plan") 
{
+withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true") {
+  val (_, adaptivePlan) = runAdaptiveAndVerifyResult(
+"SELECT key FROM testData GROUP BY key")
+  val readers = collect(adaptivePlan) {
+case r: CustomShuffleReaderExec => r
+  }
+  assert(readers.length == 1)
+  val reader = readers.head
+  val c = reader.canonicalized.asInstanceOf[CustomShuffleReaderExec]
+  // we can't just call execute() because that has separate checks for 
canonicalized plans
+  val doExecute = c.getClass.getMethod("doExecute")
+  doExecute.setAccessible(true)
+  val ex = intercept[InvocationTargetException] {
+doExecute.invoke(c)
+  }
+  assert(ex.getCause.getMessage === "operating on canonicalized plan")
+}
+  }
+
   test("metrics of the shuffle reader") {
 withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true") {
   val (_, adaptivePlan) = runAdaptiveAndVerifyResult(


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch master updated (a916690 -> fd48438)

2021-03-10 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.


from a916690  [SPARK-34681][SQL] Fix bug for full outer shuffled hash join 
when building left side with non-equal condition
 add fd48438  [SPARK-34682][SQL] Fix regression in canonicalization error 
check in CustomShuffleReaderExec

No new revisions were added by this update.

Summary of changes:
 .../adaptive/CustomShuffleReaderExec.scala  | 12 ++--
 .../execution/adaptive/AdaptiveQueryExecSuite.scala | 21 +
 2 files changed, 27 insertions(+), 6 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.0 updated: [SPARK-34683][SQL][DOCS][3.0] Update the documents to explain the usage of LIST FILE and LIST JAR in case they take multiple file names

2021-03-10 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.0 by this push:
 new d5579c8  [SPARK-34683][SQL][DOCS][3.0] Update the documents to explain 
the usage of LIST FILE and LIST JAR in case they take multiple file names
d5579c8 is described below

commit d5579c89c4de61217b68f003d3c7f8f24fd60292
Author: Kousuke Saruta 
AuthorDate: Wed Mar 10 20:12:43 2021 +0900

[SPARK-34683][SQL][DOCS][3.0] Update the documents to explain the usage of 
LIST FILE and LIST JAR in case they take multiple file names

### What changes were proposed in this pull request?

This PR partially backports the change of #31721 (SPARK-34603).
It improves the documents to explain that the `LIST FILE` and `LIST JAR`
commands can take multiple file names.

### Why are the changes needed?

To explain the usage more correctly.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Built documents and confirmed.

Closes #31794 from sarutak/fix-add-file-docs-3.0.

Authored-by: Kousuke Saruta 
Signed-off-by: HyukjinKwon 
---
 docs/sql-ref-syntax-aux-resource-mgmt-list-file.md | 2 +-
 docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md 
b/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md
index 9b9a7df..5300232 100644
--- a/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md
+++ b/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md
@@ -26,7 +26,7 @@ license: |
 ### Syntax
 
 ```sql
-LIST FILE
+LIST { FILE | FILES } file_name [ ... ]
 ```
 
 ### Examples
diff --git a/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md 
b/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md
index 04aa52c..cfe8def 100644
--- a/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md
+++ b/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md
@@ -26,7 +26,7 @@ license: |
 ### Syntax
 
 ```sql
-LIST JAR
+LIST { JAR | JARS } file_name [ ... ]
 ```
 
 ### Examples


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org



[spark] branch branch-3.1 updated: [SPARK-34683][SQL][DOCS][3.1] Update the documents to explain the usage of LIST FILE and LIST JAR in case they take multiple file names

2021-03-10 Thread gurwls223
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.1 by this push:
 new 61e0fed  [SPARK-34683][SQL][DOCS][3.1] Update the documents to explain 
the usage of LIST FILE and LIST JAR in case they take multiple file names
61e0fed is described below

commit 61e0fedd8896e5c054e79fb61d6d8e6b933cfe7e
Author: Kousuke Saruta 
AuthorDate: Wed Mar 10 20:12:00 2021 +0900

[SPARK-34683][SQL][DOCS][3.1] Update the documents to explain the usage of 
LIST FILE and LIST JAR in case they take multiple file names

### What changes were proposed in this pull request?

This PR partially backports the change of #31721 (SPARK-34603).
It improves the documents to explain that the `LIST FILE` and `LIST JAR`
commands can take multiple file names.

### Why are the changes needed?

To explain the usage more correctly.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Built documents and confirmed.

Closes #31795 from sarutak/fix-add-file-docs-3.1.

Authored-by: Kousuke Saruta 
Signed-off-by: HyukjinKwon 
---
 docs/sql-ref-syntax-aux-resource-mgmt-list-file.md | 2 +-
 docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md 
b/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md
index 9b9a7df..5300232 100644
--- a/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md
+++ b/docs/sql-ref-syntax-aux-resource-mgmt-list-file.md
@@ -26,7 +26,7 @@ license: |
 ### Syntax
 
 ```sql
-LIST FILE
+LIST { FILE | FILES } file_name [ ... ]
 ```
 
 ### Examples
diff --git a/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md 
b/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md
index 04aa52c..cfe8def 100644
--- a/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md
+++ b/docs/sql-ref-syntax-aux-resource-mgmt-list-jar.md
@@ -26,7 +26,7 @@ license: |
 ### Syntax
 
 ```sql
-LIST JAR
+LIST { JAR | JARS } file_name [ ... ]
 ```
 
 ### Examples


-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org