[spark] branch master updated (4b36797 -> 8760032)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 4b36797  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
     add 8760032  [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/analysis/Analyzer.scala  |  9 -
 .../spark/sql/catalyst/plans/logical/v2Commands.scala  |  3 ++-
 .../catalyst/analysis/DataSourceV2AnalysisSuite.scala  | 17 -
 3 files changed, 18 insertions(+), 11 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new fece4a3  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

fece4a3 is described below

commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR fixes the behaviour of query filters in `TPCDSQueryBenchmark`. The option
    `--query-filter` selects which TPC-DS queries to run, e.g., `--query-filter q6,q8,q13`,
    but the current master handles it incorrectly: if we pass `--query-filter q6` to run
    only the TPC-DS q6, `TPCDSQueryBenchmark` runs both `q6` and `q6-v2.7` because the
    `filterQueries` method does not respect the name suffix. As a result, there is
    currently no way to run only the TPC-DS q6.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
    Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index fccee97..1f8b057 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
       origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }
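Editor's note: the behaviour of the patched `filterQueries` above can be demonstrated in isolation. The sketch below copies the method from the diff into a hypothetical standalone object (`FilterQueriesSketch` is not part of the Spark source) and shows that a filter entry like `q6` now matches only the v1.4 query, while the v2.7 variant must be selected explicitly as `q6-v2.7`.

```scala
// Standalone sketch of the suffix-aware query filter from the patch above.
// `FilterQueriesSketch` is an illustrative name, not a Spark class.
object FilterQueriesSketch {
  def filterQueries(
      origQueries: Seq[String],
      queryFilter: Set[String],
      nameSuffix: String = ""): Seq[String] = {
    if (queryFilter.nonEmpty) {
      if (nameSuffix.nonEmpty) {
        // Compare the filter against the suffixed name, e.g. "q6" + "-v2.7"
        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
      } else {
        origQueries.filter(queryFilter.contains)
      }
    } else {
      origQueries  // no filter: run everything
    }
  }

  def main(args: Array[String]): Unit = {
    val v14Queries = Seq("q6", "q8", "q13")
    val v27Queries = Seq("q6", "q14a")
    // "--query-filter q6" selects only the v1.4 q6 ...
    assert(filterQueries(v14Queries, Set("q6")) == Seq("q6"))
    assert(filterQueries(v27Queries, Set("q6"), nameSuffix = "-v2.7").isEmpty)
    // ... and "q6-v2.7" selects only the v2.7 variant.
    assert(filterQueries(v27Queries, Set("q6-v2.7"), nameSuffix = "-v2.7") == Seq("q6"))
    println("ok")
  }
}
```

Before the patch, `filterQueries` compared the filter against the bare query name for both lists, so `q6` matched in both and the v2.7 variant could never be excluded.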
[spark] branch branch-3.0 updated (4a1c143 -> 577dbb9)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 4a1c143  [SPARK-9][PYTHON] Pyspark application will hang due to non Exception error
     add 577dbb9  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 577dbb9  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

577dbb9 is described below

commit 577dbb96835f13f4cd92ea4caab9e6dece00be50
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR fixes the behaviour of query filters in `TPCDSQueryBenchmark`. The option
    `--query-filter` selects which TPC-DS queries to run, e.g., `--query-filter q6,q8,q13`,
    but the current master handles it incorrectly: if we pass `--query-filter q6` to run
    only the TPC-DS q6, `TPCDSQueryBenchmark` runs both `q6` and `q6-v2.7` because the
    `filterQueries` method does not respect the name suffix. As a result, there is
    currently no way to run only the TPC-DS q6.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
    Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index 7bbf079..43bc7c1 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
      origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6d5d030  [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier
     add 4b36797  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
[spark] branch master updated (4b36797 -> 8760032)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 4b36797 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark add 8760032 [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query No new revisions were added by this update. Summary of changes: .../apache/spark/sql/catalyst/analysis/Analyzer.scala | 9 - .../spark/sql/catalyst/plans/logical/v2Commands.scala | 3 ++- .../catalyst/analysis/DataSourceV2AnalysisSuite.scala | 17 - 3 files changed, 18 insertions(+), 11 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new fece4a3 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark fece4a3 is described below commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index fccee97..1f8b057 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs) +val 
queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 577dbb9 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark 577dbb9 is described below commit 577dbb96835f13f4cd92ea4caab9e6dece00be50 Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index 7bbf079..43bc7c1 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, 
benchmarkArgs) +val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d5d030 [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier add 4b36797 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark No new revisions were added by this update. Summary of changes: .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new fece4a3 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark fece4a3 is described below commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index fccee97..1f8b057 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs) +val 
queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 577dbb9  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
577dbb9 is described below

commit 577dbb96835f13f4cd92ea4caab9e6dece00be50
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
    Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index 7bbf079..43bc7c1 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
       origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
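Editor's note: the suffix-aware filtering that this patch introduces can be sketched outside Spark. The snippet below is an illustrative Python translation of the patched Scala `filterQueries` method (the real implementation is the Scala code in the diff above); the function and variable names are chosen for illustration only.

```python
def filter_queries(orig_queries, query_filter, name_suffix=""):
    """Mimic the patched filterQueries: a query is selected only if its
    full benchmark name (base name plus suffix) appears in the filter set."""
    if not query_filter:
        # No --query-filter given: run everything.
        return list(orig_queries)
    if name_suffix:
        # v2.7 queries are registered as e.g. "q6-v2.7", so match on that name.
        return [q for q in orig_queries if f"{q}{name_suffix}" in query_filter]
    return [q for q in orig_queries if q in query_filter]

# With a filter of {"q6"}, only the v1.4 q6 is selected now;
# the v2.7 variant requires the explicit name "q6-v2.7".
print(filter_queries(["q5", "q6", "q7"], {"q6"}))                          # ['q6']
print(filter_queries(["q6", "q10a"], {"q6"}, name_suffix="-v2.7"))         # []
print(filter_queries(["q6", "q10a"], {"q6-v2.7"}, name_suffix="-v2.7"))    # ['q6']
```

Before the fix, both lists were filtered by base name alone, so `--query-filter q6` matched the v2.7 list's `q6` entry as well and ran `q6-v2.7` too.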
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6d5d030  [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier
     add 4b36797  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new fece4a3  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
fece4a3 is described below

commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
    Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index fccee97..1f8b057 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
      origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging {
     "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (1e2eeda -> 6d5d030)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 1e2eeda  [SPARK-33382][SQL][TESTS] Unify datasource v1 and v2 SHOW TABLES tests
     add 6d5d030  [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/parser/AstBuilder.scala   |  8 ++--
 .../spark/sql/catalyst/plans/logical/statements.scala       |  7 ---
 .../spark/sql/catalyst/plans/logical/v2Commands.scala       |  7 +++
 .../apache/spark/sql/catalyst/parser/DDLParserSuite.scala   |  8 +++-
 .../spark/sql/catalyst/analysis/ResolveSessionCatalog.scala | 13 ++---
 .../sql/execution/datasources/v2/DataSourceV2Strategy.scala |  3 +++
 .../scala/org/apache/spark/sql/ShowCreateTableSuite.scala   |  7 ---
 .../apache/spark/sql/connector/DataSourceV2SQLSuite.scala   |  3 ++-
 .../scala/org/apache/spark/sql/execution/SQLViewSuite.scala |  2 +-
 9 files changed, 36 insertions(+), 22 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (5197c5d -> 1e2eeda)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 5197c5d  [SPARK-33390][SQL] Make Literal support char array
     add 1e2eeda  [SPARK-33382][SQL][TESTS] Unify datasource v1 and v2 SHOW TABLES tests

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/parser/DDLParserSuite.scala |  49 ---
 .../spark/sql/connector/DataSourceV2SQLSuite.scala | 150 +
 .../execution/command/ShowTablesParserSuite.scala  |  76 +++
 .../sql/execution/command/ShowTablesSuite.scala    | 122 +
 .../sql/execution/command/v1/ShowTablesSuite.scala |  95 +
 .../sql/execution/command/v2/ShowTablesSuite.scala | 115
 6 files changed, 409 insertions(+), 198 deletions(-)
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowTablesParserSuite.scala
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowTablesSuite.scala
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/ShowTablesSuite.scala

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (4634694 -> 5197c5d)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 4634694  [SPARK-33404][SQL] Fix incorrect results in `date_trunc` expression
     add 5197c5d  [SPARK-33390][SQL] Make Literal support char array

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/CatalystTypeConverters.scala  | 1 +
 .../org/apache/spark/sql/catalyst/expressions/literals.scala    | 4 ++++
 .../apache/spark/sql/catalyst/CatalystTypeConvertersSuite.scala | 7 +++++++
 .../spark/sql/catalyst/expressions/LiteralExpressionSuite.scala | 9 +++++++++
 sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala | 8 ++++++++
 5 files changed, 29 insertions(+)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6fa80ed -> 4634694)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6fa80ed  [SPARK-7][SQL] Support subexpression elimination in branches of conditional expressions
     add 4634694  [SPARK-33404][SQL] Fix incorrect results in `date_trunc` expression

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/util/DateTimeUtils.scala    |  6 ++--
 .../sql/catalyst/util/DateTimeUtilsSuite.scala     | 34 +++---
 2 files changed, 28 insertions(+), 12 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (122c899 -> 6fa80ed)
This is an automated email from the ASF dual-hosted git repository. viirya pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 122c899 [SPARK-33251][FOLLOWUP][PYTHON][DOCS][MINOR] Adjusts returns PrefixSpan.findFrequentSequentialPatterns add 6fa80ed [SPARK-7][SQL] Support subexpression elimination in branches of conditional expressions No new revisions were added by this update. Summary of changes: .../expressions/EquivalentExpressions.scala| 96 ++ .../expressions/codegen/CodeGenerator.scala| 2 +- .../SubexpressionEliminationSuite.scala| 111 +++-- 3 files changed, 177 insertions(+), 32 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (3165ca7 -> 122c899)
This is an automated email from the ASF dual-hosted git repository. huaxingao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 3165ca7 [SPARK-33376][SQL] Remove the option of "sharesHadoopClasses" in Hive IsolatedClientLoader add 122c899 [SPARK-33251][FOLLOWUP][PYTHON][DOCS][MINOR] Adjusts returns PrefixSpan.findFrequentSequentialPatterns No new revisions were added by this update. Summary of changes: python/pyspark/ml/fpm.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (34f5e7c -> 3165ca7)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 34f5e7c [SPARK-33302][SQL] Push down filters through Expand add 3165ca7 [SPARK-33376][SQL] Remove the option of "sharesHadoopClasses" in Hive IsolatedClientLoader No new revisions were added by this update. Summary of changes: .../spark/sql/hive/client/IsolatedClientLoader.scala | 16 .../spark/sql/hive/client/HadoopVersionInfoSuite.scala | 3 +-- .../apache/spark/sql/hive/client/HiveClientBuilder.scala | 6 ++ .../sql/hive/client/HivePartitionFilteringSuite.scala| 4 .../apache/spark/sql/hive/client/HiveVersionSuite.scala | 7 ++- 5 files changed, 9 insertions(+), 27 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (4934da56 -> 34f5e7c)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 4934da56 [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache add 34f5e7c [SPARK-33302][SQL] Push down filters through Expand No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/optimizer/Optimizer.scala | 1 + .../catalyst/optimizer/FilterPushdownSuite.scala | 24 +- .../optimizer/LeftSemiAntiJoinPushDownSuite.scala | 15 ++ 3 files changed, 39 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (27bb40b -> 4934da56)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 27bb40b [SPARK-9][PYTHON] Pyspark application will hang due to non Exception error add 4934da56 [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache No new revisions were added by this update. Summary of changes: .../execution/datasources/v2/DataSourceV2Strategy.scala | 2 +- .../sql/execution/datasources/v2/DropTableExec.scala | 7 ++- .../spark/sql/connector/DataSourceV2SQLSuite.scala | 16 3 files changed, 23 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 4934da56 [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache 4934da56 is described below commit 4934da56bcc13fc61afc8e8cc44fb5290b5e7b32 Author: Chao Sun AuthorDate: Tue Nov 10 14:37:42 2020 + [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache ### What changes were proposed in this pull request? This changes `DropTableExec` to also invalidate caches referencing the table to be dropped, in a cascading manner. ### Why are the changes needed? In DSv1, the `DROP TABLE` command also invalidates caches, as described in [SPARK-19765](https://issues.apache.org/jira/browse/SPARK-19765). In DSv2, however, the same command only drops the table and does not touch the caches, which could lead to correctness issues. ### Does this PR introduce _any_ user-facing change? Yes. The DSv2 `DROP TABLE` command now also invalidates caches. ### How was this patch tested? Added a new UT. Closes #30211 from sunchao/SPARK-33305. 
Authored-by: Chao Sun Signed-off-by: Wenchen Fan --- .../execution/datasources/v2/DataSourceV2Strategy.scala | 2 +- .../sql/execution/datasources/v2/DropTableExec.scala | 7 ++- .../spark/sql/connector/DataSourceV2SQLSuite.scala | 16 3 files changed, 23 insertions(+), 2 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala index 817b3ce..5695d23 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala @@ -229,7 +229,7 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat throw new AnalysisException("Describing columns is not supported for v2 tables.") case DropTable(r: ResolvedTable, ifExists, purge) => - DropTableExec(r.catalog, r.identifier, ifExists, purge) :: Nil + DropTableExec(session, r.catalog, r.table, r.identifier, ifExists, purge) :: Nil case _: NoopDropTable => LocalTableScanExec(Nil, Nil) :: Nil diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropTableExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropTableExec.scala index 1fd0cd1..068475f 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropTableExec.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropTableExec.scala @@ -17,22 +17,27 @@ package org.apache.spark.sql.execution.datasources.v2 +import org.apache.spark.sql.SparkSession import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.catalyst.analysis.NoSuchTableException import org.apache.spark.sql.catalyst.expressions.Attribute -import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog} +import 
org.apache.spark.sql.connector.catalog.{Identifier, Table, TableCatalog} /** * Physical plan node for dropping a table. */ case class DropTableExec( +session: SparkSession, catalog: TableCatalog, +table: Table, ident: Identifier, ifExists: Boolean, purge: Boolean) extends V2CommandExec { override def run(): Seq[InternalRow] = { if (catalog.tableExists(ident)) { + val v2Relation = DataSourceV2Relation.create(table, Some(catalog), Some(ident)) + session.sharedState.cacheManager.uncacheQuery(session, v2Relation, cascade = true) catalog.dropTable(ident, purge) } else if (!ifExists) { throw new NoSuchTableException(ident) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala index ee3f7be..dfa32b9 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala @@ -784,6 +784,22 @@ class DataSourceV2SQLSuite } } + test("SPARK-33305: DROP TABLE should also invalidate cache") { +val t = "testcat.ns.t" +val view = "view" +withTable(t) { + withTempView(view) { +sql(s"CREATE TABLE $t USING foo AS SELECT id, data FROM source") +sql(s"CACHE TABLE $view AS SELECT id FROM $t") +checkAnswer(sql(s"SELECT * FROM $t"), spark.table("source")) +checkAnswer(sql(s"SELECT * FROM $view"), spark.table("source").select("id")) + +sql(s"DROP
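The diff above shows `DropTableExec` calling `cacheManager.uncacheQuery(session, v2Relation, cascade = true)` before `catalog.dropTable(ident, purge)`. The following is a minimal, self-contained sketch of that uncache-then-drop ordering; `ToyCacheManager`, `CachedQuery`, and `uncacheReferencing` are hypothetical names invented for illustration, not Spark APIs.

```scala
// Toy model of the cascading cache invalidation SPARK-33305 adds to
// DSv2 DropTableExec. Only the uncache-before-drop behaviour mirrors
// the real change; everything else is simplified.
object DropTableCacheSketch {
  // A cached query plan, tagged with the tables it reads.
  final case class CachedQuery(name: String, referencedTables: Set[String])

  final class ToyCacheManager {
    private var cached: List[CachedQuery] = Nil

    def cache(q: CachedQuery): Unit = cached = q :: cached

    def cachedNames: Set[String] = cached.map(_.name).toSet

    // Analogue of uncacheQuery(..., cascade = true): drop every cached
    // entry that references the table being dropped.
    def uncacheReferencing(table: String): Unit =
      cached = cached.filterNot(_.referencedTables.contains(table))
  }

  // Mirrors the order in DropTableExec.run(): invalidate dependent cache
  // entries first, then remove the table from the catalog.
  def dropTable(
      cm: ToyCacheManager,
      catalog: scala.collection.mutable.Set[String],
      table: String): Unit = {
    cm.uncacheReferencing(table) // behaviour added by this commit
    catalog -= table             // stands in for catalog.dropTable(ident, purge)
  }
}
```

In the test added by the PR, a temp view cached over `testcat.ns.t` must disappear from the cache once the table is dropped; in this toy model, dropping `testcat.ns.t` removes any `CachedQuery` referencing it while unrelated entries survive.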
[spark] branch branch-2.4 updated (bfeaef1 -> efceeee)
This is an automated email from the ASF dual-hosted git repository. yumwang pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from bfeaef1 [SPARK-33405][BUILD][2.4] Upgrade commons-compress to 1.20 add efceeee [SPARK-33372][SQL][2.4] Fix InSet bucket pruning No new revisions were added by this update. Summary of changes: .../apache/spark/sql/execution/datasources/FileSourceStrategy.scala | 5 ++--- .../test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala | 2 +- 2 files changed, 3 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33372][SQL][2.4] Fix InSet bucket pruning
This is an automated email from the ASF dual-hosted git repository. yumwang pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new efceeee [SPARK-33372][SQL][2.4] Fix InSet bucket pruning efceeee is described below commit efcd7ebfd498b3010ef35d2c1388c2319c53 Author: Yuming Wang AuthorDate: Tue Nov 10 20:30:53 2020 +0800 [SPARK-33372][SQL][2.4] Fix InSet bucket pruning ### What changes were proposed in this pull request? This is a backport of #30279. This PR fixes `InSet` bucket pruning; its values are raw values, not `Literal`s: https://github.com/apache/spark/blob/cbd3fdea62dab73fc4a96702de8fd1f07722da66/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala#L253-L255 ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Unit test. Closes #30308 from wangyum/SPARK-33372-2.4. 
Authored-by: Yuming Wang Signed-off-by: Yuming Wang --- .../apache/spark/sql/execution/datasources/FileSourceStrategy.scala | 5 ++--- .../test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala | 2 +- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala index fe27b78..9467293 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala @@ -89,9 +89,8 @@ object FileSourceStrategy extends Strategy with Logging { case expressions.In(a: Attribute, list) if list.forall(_.isInstanceOf[Literal]) && a.name == bucketColumnName => getBucketSetFromIterable(a, list.map(e => e.eval(EmptyRow))) - case expressions.InSet(a: Attribute, hset) -if hset.forall(_.isInstanceOf[Literal]) && a.name == bucketColumnName => -getBucketSetFromIterable(a, hset.map(e => expressions.Literal(e).eval(EmptyRow))) + case expressions.InSet(a: Attribute, hset) if a.name == bucketColumnName => +getBucketSetFromIterable(a, hset) case expressions.IsNull(a: Attribute) if a.name == bucketColumnName => getBucketSetFromValue(a, null) case expressions.And(left, right) => diff --git a/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala index 42443b0..c01b7db 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala @@ -173,7 +173,7 @@ abstract class BucketedReadSuite extends QueryTest with SQLTestUtils { df) // Case 4: InSet -val inSetExpr = expressions.InSet($"j".expr, Set(j, j + 1, j + 2, j + 3).map(lit(_).expr)) +val inSetExpr = 
expressions.InSet($"j".expr, Set(j, j + 1, j + 2, j + 3)) checkPrunedAnswers( bucketSpec, bucketValues = Seq(j, j + 1, j + 2, j + 3), - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
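The diff above removes the `hset.forall(_.isInstanceOf[Literal])` guard: `InSet` holds raw Scala values, never `Literal` expressions, so the old guard could never match and bucket pruning for `InSet` never fired. A rough sketch of what `getBucketSetFromIterable` then does with those raw values follows; the `hashCode`-modulo bucket scheme here is a stand-in assumption, since Spark's real bucket id comes from Murmur3 hashing via `HashPartitioning`, and `bucketSet`/`nonNegativeMod` are illustrative names.

```scala
// Sketch: mapping a set of raw InSet values to the set of bucket files
// worth scanning. Hedged simplification: hashCode stands in for Spark's
// Murmur3-based bucket hashing.
object InSetPruningSketch {
  // Keep the modulus non-negative, like Spark's Utils.nonNegativeMod.
  private def nonNegativeMod(x: Int, mod: Int): Int = {
    val rawMod = x % mod
    rawMod + (if (rawMod < 0) mod else 0)
  }

  // Analogue of getBucketSetFromIterable over InSet's value set:
  // each raw value (no Literal wrapping, no eval needed) selects
  // exactly one bucket to keep; all other buckets can be skipped.
  def bucketSet(hset: Set[Any], numBuckets: Int): Set[Int] =
    hset.map(v => nonNegativeMod(v.hashCode, numBuckets))
}
```

With 8 buckets and a predicate like `j IN (13, 14, 15, 16)`, this maps the four values to at most four bucket ids, so at least half the bucket files are skipped instead of scanning all of them.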
[spark] branch branch-2.4 updated: [SPARK-33372][SQL][2.4] Fix InSet bucket pruning
This is an automated email from the ASF dual-hosted git repository.

yumwang pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new efc  [SPARK-33372][SQL][2.4] Fix InSet bucket pruning
efc is described below

commit efcd7ebfd498b3010ef35d2c1388c2319c53
Author: Yuming Wang
AuthorDate: Tue Nov 10 20:30:53 2020 +0800

    [SPARK-33372][SQL][2.4] Fix InSet bucket pruning

    ### What changes were proposed in this pull request?

    This is a backport of #30279. This PR fixes `InSet` bucket pruning: the values in an `InSet`'s `hset` are raw values, not `Literal` expressions, so the old guard `hset.forall(_.isInstanceOf[Literal])` never matched and bucket pruning was skipped:
    https://github.com/apache/spark/blob/cbd3fdea62dab73fc4a96702de8fd1f07722da66/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala#L253-L255

    ### Why are the changes needed?

    Bug fix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Unit test.

    Closes #30308 from wangyum/SPARK-33372-2.4.

    Authored-by: Yuming Wang
    Signed-off-by: Yuming Wang
---
 .../apache/spark/sql/execution/datasources/FileSourceStrategy.scala | 5 ++---
 .../test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
index fe27b78..9467293 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
@@ -89,9 +89,8 @@ object FileSourceStrategy extends Strategy with Logging {
       case expressions.In(a: Attribute, list)
         if list.forall(_.isInstanceOf[Literal]) && a.name == bucketColumnName =>
         getBucketSetFromIterable(a, list.map(e => e.eval(EmptyRow)))
-      case expressions.InSet(a: Attribute, hset)
-        if hset.forall(_.isInstanceOf[Literal]) && a.name == bucketColumnName =>
-        getBucketSetFromIterable(a, hset.map(e => expressions.Literal(e).eval(EmptyRow)))
+      case expressions.InSet(a: Attribute, hset) if a.name == bucketColumnName =>
+        getBucketSetFromIterable(a, hset)
       case expressions.IsNull(a: Attribute) if a.name == bucketColumnName =>
         getBucketSetFromValue(a, null)
       case expressions.And(left, right) =>
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala
index 42443b0..c01b7db 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala
@@ -173,7 +173,7 @@ abstract class BucketedReadSuite extends QueryTest with SQLTestUtils {
           df)
         // Case 4: InSet
-        val inSetExpr = expressions.InSet($"j".expr, Set(j, j + 1, j + 2, j + 3).map(lit(_).expr))
+        val inSetExpr = expressions.InSet($"j".expr, Set(j, j + 1, j + 2, j + 3))
         checkPrunedAnswers(
           bucketSpec,
           bucketValues = Seq(j, j + 1, j + 2, j + 3),

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
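After the fix, the `hset` of an `InSet` predicate is handed to `getBucketSetFromIterable` as raw values. The idea behind bucket pruning can be sketched in Python (an illustrative model only: Spark's real bucketing hashes with Murmur3, and the `hash()` call and function names here are stand-ins, not Spark APIs):

```python
def bucket_ids_for_values(values, num_buckets):
    """Model of getBucketSetFromIterable: each literal value in the
    predicate maps to exactly one bucket id, so the scan can be limited
    to the union of those buckets."""
    return {hash(v) % num_buckets for v in values}

# A predicate like `j IN (10, 11, 12, 13)` over 8 buckets touches at
# most 4 of the 8 buckets, so the other files are skipped entirely.
pruned = bucket_ids_for_values({10, 11, 12, 13}, num_buckets=8)
assert len(pruned) <= 4
assert pruned <= set(range(8))
```

Before the fix, the guard rejected every `InSet`, so `pruned` was effectively "all buckets" and no files were skipped.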
[spark] branch branch-3.0 updated: [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 4a1c143  [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error
4a1c143 is described below

commit 4a1c143f1a042a9a23d00929670eadbdb1afca11
Author: lrz
AuthorDate: Tue Nov 10 19:39:18 2020 +0900

    [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error

    ### What changes were proposed in this pull request?

    When a SystemExit exception is raised during processing, the Python worker exits abnormally while the executor task keeps waiting to read from the worker's socket, so the application hangs. The SystemExit may come from the user's own code, but Spark should at least surface an error to remind the user, not get stuck. A simple test reproduces the case:

    ```
    from pyspark.sql import SparkSession

    def err(line):
        raise SystemExit

    spark = SparkSession.builder.appName("test").getOrCreate()
    spark.sparkContext.parallelize(range(1, 2), 2).map(err).collect()
    spark.stop()
    ```

    ### Why are the changes needed?

    To make sure a PySpark application won't hang when a non-Exception error is raised in a Python worker.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Added a new test and also manually tested the case above.

    Closes #30248 from li36909/pyspark.

    Lead-authored-by: lrz
    Co-authored-by: Hyukjin Kwon
    Signed-off-by: HyukjinKwon

(cherry picked from commit 27bb40b6297361985e3590687f0332a72b71bc85)
Signed-off-by: HyukjinKwon
---
 python/pyspark/tests/test_worker.py | 9 +++++++++
 python/pyspark/worker.py            | 4 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/python/pyspark/tests/test_worker.py b/python/pyspark/tests/test_worker.py
index bfcbc43..f51d4b2 100644
--- a/python/pyspark/tests/test_worker.py
+++ b/python/pyspark/tests/test_worker.py
@@ -98,6 +98,15 @@ class WorkerTests(ReusedPySparkTestCase):
             self.assertRaises(Exception, lambda: rdd.foreach(raise_exception))
         self.assertEqual(100, rdd.map(str).count())

+    def test_after_non_exception_error(self):
+        # SPARK-33339: Pyspark application will hang due to non Exception
+        def raise_system_exit(_):
+            raise SystemExit()
+        rdd = self.sc.parallelize(range(100), 1)
+        with QuietTest(self.sc):
+            self.assertRaises(Exception, lambda: rdd.foreach(raise_system_exit))
+        self.assertEqual(100, rdd.map(str).count())
+
     def test_after_jvm_exception(self):
         tempFile = tempfile.NamedTemporaryFile(delete=False)
         tempFile.write(b"Hello World!")
diff --git a/python/pyspark/worker.py b/python/pyspark/worker.py
index 814f796..0bce87d 100644
--- a/python/pyspark/worker.py
+++ b/python/pyspark/worker.py
@@ -608,7 +608,7 @@ def main(infile, outfile):
         # reuse.
         TaskContext._setTaskContext(None)
         BarrierTaskContext._setTaskContext(None)
-    except Exception:
+    except BaseException:
         try:
             exc_info = traceback.format_exc()
             if isinstance(exc_info, bytes):
@@ -622,7 +622,7 @@ def main(infile, outfile):
             except IOError:
                 # JVM close the socket
                 pass
-        except Exception:
+        except BaseException:
             # Write the error to stderr if it happened while serializing
             print("PySpark worker failed with exception:", file=sys.stderr)
             print(traceback.format_exc(), file=sys.stderr)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
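The reason the patch widens `except Exception` to `except BaseException` is Python's exception hierarchy: `SystemExit` (like `KeyboardInterrupt` and `GeneratorExit`) derives directly from `BaseException`, not from `Exception`, so the worker's old handler never saw it and silently died without reporting back to the executor. A minimal standalone demonstration:

```python
def catches(exc, catch_type):
    """Return True if `raise exc` would be handled by `except catch_type`."""
    try:
        raise exc
    except catch_type:
        return True
    except BaseException:
        return False

# An ordinary error is caught by `except Exception`:
assert catches(ValueError(), Exception) is True

# SystemExit slips past `except Exception` -- this is why the old
# worker handler missed it and the executor hung on the socket:
assert catches(SystemExit(), Exception) is False

# The new `except BaseException` handler catches it:
assert catches(SystemExit(), BaseException) is True
```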
[spark] branch master updated (e3a768d -> 27bb40b)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from e3a768d  [SPARK-33391][SQL] element_at with CreateArray not respect one based index
 add 27bb40b  [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error

No new revisions were added by this update.

Summary of changes:
 python/pyspark/tests/test_worker.py | 9 +++++++++
 python/pyspark/worker.py            | 4 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (1aa8f4f -> b905d65)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 1aa8f4f  [SPARK-33405][BUILD][3.0] Upgrade commons-compress to 1.20
 add b905d65  [SPARK-33391][SQL] element_at with CreateArray not respect one based index

No new revisions were added by this update.

Summary of changes:
 .../expressions/collectionOperations.scala       | 30 +
 .../expressions/CollectionExpressionsSuite.scala | 38 +-
 2 files changed, 60 insertions(+), 8 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
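For context on SPARK-33391: Spark SQL's `element_at` is documented to use one-based indexing (index 1 is the first element, negative indexes count from the end), which the `CreateArray` code path did not respect. The documented semantics can be modeled as follows (an illustrative Python sketch of the contract, not the Catalyst implementation the commit patches; out-of-range handling here assumes the non-ANSI NULL-returning behavior):

```python
def element_at(arr, index):
    """Model of Spark SQL element_at: one-based positive index,
    negative index counts from the end, NULL (None) when out of range."""
    if index == 0:
        raise ValueError("SQL array indexes start at 1")
    pos = index - 1 if index > 0 else len(arr) + index
    return arr[pos] if 0 <= pos < len(arr) else None

assert element_at(["a", "b", "c"], 1) == "a"   # one-based, not zero-based
assert element_at(["a", "b", "c"], -1) == "c"  # counts from the end
assert element_at(["a", "b", "c"], 4) is None  # out of range -> NULL
```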