[spark] branch master updated (4b36797 -> 8760032)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 4b36797  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
     add 8760032  [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query

No new revisions were added by this update.

Summary of changes:
 .../apache/spark/sql/catalyst/analysis/Analyzer.scala  |  9 -
 .../spark/sql/catalyst/plans/logical/v2Commands.scala  |  3 ++-
 .../catalyst/analysis/DataSourceV2AnalysisSuite.scala  | 17 -
 3 files changed, 18 insertions(+), 11 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new fece4a3  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

fece4a3 is described below

commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR fixes the behaviour of query filters in `TPCDSQueryBenchmark`. The option
    `--query-filter` selects which TPC-DS queries to run, e.g., `--query-filter q6,q8,q13`,
    but the current master handles it incorrectly: if we pass `--query-filter q6` to run
    only the TPC-DS q6, `TPCDSQueryBenchmark` runs both `q6` and `q6-v2.7` because the
    `filterQueries` method does not respect the name suffix. As a result, there is
    currently no way to run only the TPC-DS q6.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
    Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index fccee97..1f8b057 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
       origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }
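Editor's note: the behaviour of the patched `filterQueries` above can be demonstrated in isolation. The sketch below copies the method from the diff into a hypothetical standalone object (`FilterQueriesSketch` is not part of the Spark source) and shows that a filter entry like `q6` now matches only the v1.4 query, while the v2.7 variant must be selected explicitly as `q6-v2.7`.

```scala
// Standalone sketch of the suffix-aware query filter from the patch above.
// `FilterQueriesSketch` is an illustrative name, not a Spark class.
object FilterQueriesSketch {
  def filterQueries(
      origQueries: Seq[String],
      queryFilter: Set[String],
      nameSuffix: String = ""): Seq[String] = {
    if (queryFilter.nonEmpty) {
      if (nameSuffix.nonEmpty) {
        // Compare the filter against the suffixed name, e.g. "q6" + "-v2.7"
        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
      } else {
        origQueries.filter(queryFilter.contains)
      }
    } else {
      origQueries  // no filter: run everything
    }
  }

  def main(args: Array[String]): Unit = {
    val v14Queries = Seq("q6", "q8", "q13")
    val v27Queries = Seq("q6", "q14a")
    // "--query-filter q6" selects only the v1.4 q6 ...
    assert(filterQueries(v14Queries, Set("q6")) == Seq("q6"))
    assert(filterQueries(v27Queries, Set("q6"), nameSuffix = "-v2.7").isEmpty)
    // ... and "q6-v2.7" selects only the v2.7 variant.
    assert(filterQueries(v27Queries, Set("q6-v2.7"), nameSuffix = "-v2.7") == Seq("q6"))
    println("ok")
  }
}
```

Before the patch, `filterQueries` compared the filter against the bare query name for both lists, so `q6` matched in both and the v2.7 variant could never be excluded.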
[spark] branch branch-3.0 updated (4a1c143 -> 577dbb9)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 4a1c143  [SPARK-9][PYTHON] Pyspark application will hang due to non Exception error
     add 577dbb9  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 577dbb9  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

577dbb9 is described below

commit 577dbb96835f13f4cd92ea4caab9e6dece00be50
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR fixes the behaviour of query filters in `TPCDSQueryBenchmark`. The option
    `--query-filter` selects which TPC-DS queries to run, e.g., `--query-filter q6,q8,q13`,
    but the current master handles it incorrectly: if we pass `--query-filter q6` to run
    only the TPC-DS q6, `TPCDSQueryBenchmark` runs both `q6` and `q6-v2.7` because the
    `filterQueries` method does not respect the name suffix. As a result, there is
    currently no way to run only the TPC-DS q6.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
    Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index 7bbf079..43bc7c1 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
      origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6d5d030  [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier
     add 4b36797  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)
[spark] branch master updated (4b36797 -> 8760032)
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 4b36797 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark add 8760032 [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query No new revisions were added by this update. Summary of changes: .../apache/spark/sql/catalyst/analysis/Analyzer.scala | 9 - .../spark/sql/catalyst/plans/logical/v2Commands.scala | 3 ++- .../catalyst/analysis/DataSourceV2AnalysisSuite.scala | 17 - 3 files changed, 18 insertions(+), 11 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new fece4a3 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark fece4a3 is described below commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index fccee97..1f8b057 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs) +val 
queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 577dbb9 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark 577dbb9 is described below commit 577dbb96835f13f4cd92ea4caab9e6dece00be50 Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index 7bbf079..43bc7c1 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, 
benchmarkArgs) +val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d5d030 [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier add 4b36797 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark No new revisions were added by this update. Summary of changes: .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new fece4a3 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark fece4a3 is described below commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index fccee97..1f8b057 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs) +val 
queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 577dbb9  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
577dbb9 is described below

commit 577dbb96835f13f4cd92ea4caab9e6dece00be50
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
    Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index 7bbf079..43bc7c1 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
       origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
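Editor's note: the suffix-aware filtering that this patch introduces can be sketched outside Spark. The snippet below is an illustrative Python translation of the patched Scala `filterQueries` method (the real implementation is the Scala code in the diff above); the function and variable names are chosen for illustration only.

```python
def filter_queries(orig_queries, query_filter, name_suffix=""):
    """Mimic the patched filterQueries: a query is selected only if its
    full benchmark name (base name plus suffix) appears in the filter set."""
    if not query_filter:
        # No --query-filter given: run everything.
        return list(orig_queries)
    if name_suffix:
        # v2.7 queries are registered as e.g. "q6-v2.7", so match on that name.
        return [q for q in orig_queries if f"{q}{name_suffix}" in query_filter]
    return [q for q in orig_queries if q in query_filter]

# With a filter of {"q6"}, only the v1.4 q6 is selected now;
# the v2.7 variant requires the explicit name "q6-v2.7".
print(filter_queries(["q5", "q6", "q7"], {"q6"}))                          # ['q6']
print(filter_queries(["q6", "q10a"], {"q6"}, name_suffix="-v2.7"))         # []
print(filter_queries(["q6", "q10a"], {"q6-v2.7"}, name_suffix="-v2.7"))    # ['q6']
```

Before the fix, both lists were filtered by base name alone, so `--query-filter q6` matched the v2.7 list's `q6` entry as well and ran `q6-v2.7` too.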
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6d5d030  [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier
     add 4b36797  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new fece4a3  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
fece4a3 is described below

commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
    Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index fccee97..1f8b057 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
      origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging {
     "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (1e2eeda -> 6d5d030)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 1e2eeda  [SPARK-33382][SQL][TESTS] Unify datasource v1 and v2 SHOW TABLES tests
     add 6d5d030  [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/parser/AstBuilder.scala   |  8 ++--
 .../spark/sql/catalyst/plans/logical/statements.scala       |  7 ---
 .../spark/sql/catalyst/plans/logical/v2Commands.scala       |  7 +++
 .../apache/spark/sql/catalyst/parser/DDLParserSuite.scala   |  8 +++-
 .../spark/sql/catalyst/analysis/ResolveSessionCatalog.scala | 13 ++---
 .../sql/execution/datasources/v2/DataSourceV2Strategy.scala |  3 +++
 .../scala/org/apache/spark/sql/ShowCreateTableSuite.scala   |  7 ---
 .../apache/spark/sql/connector/DataSourceV2SQLSuite.scala   |  3 ++-
 .../scala/org/apache/spark/sql/execution/SQLViewSuite.scala |  2 +-
 9 files changed, 36 insertions(+), 22 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (5197c5d -> 1e2eeda)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 5197c5d  [SPARK-33390][SQL] Make Literal support char array
     add 1e2eeda  [SPARK-33382][SQL][TESTS] Unify datasource v1 and v2 SHOW TABLES tests

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/parser/DDLParserSuite.scala |  49 ---
 .../spark/sql/connector/DataSourceV2SQLSuite.scala | 150 +
 .../execution/command/ShowTablesParserSuite.scala  |  76 +++
 .../sql/execution/command/ShowTablesSuite.scala    | 122 +
 .../sql/execution/command/v1/ShowTablesSuite.scala |  95 +
 .../sql/execution/command/v2/ShowTablesSuite.scala | 115
 6 files changed, 409 insertions(+), 198 deletions(-)
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowTablesParserSuite.scala
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/command/ShowTablesSuite.scala
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowTablesSuite.scala
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/ShowTablesSuite.scala

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (4634694 -> 5197c5d)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 4634694  [SPARK-33404][SQL] Fix incorrect results in `date_trunc` expression
     add 5197c5d  [SPARK-33390][SQL] Make Literal support char array

No new revisions were added by this update.

Summary of changes:
 .../org/apache/spark/sql/catalyst/CatalystTypeConverters.scala  | 1 +
 .../org/apache/spark/sql/catalyst/expressions/literals.scala    | 4 ++++
 .../apache/spark/sql/catalyst/CatalystTypeConvertersSuite.scala | 7 +++++++
 .../spark/sql/catalyst/expressions/LiteralExpressionSuite.scala | 9 +++++++++
 sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala | 8 ++++++++
 5 files changed, 29 insertions(+)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6fa80ed -> 4634694)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

    from 6fa80ed  [SPARK-7][SQL] Support subexpression elimination in branches of conditional expressions
     add 4634694  [SPARK-33404][SQL] Fix incorrect results in `date_trunc` expression

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/util/DateTimeUtils.scala    |  6 ++--
 .../sql/catalyst/util/DateTimeUtilsSuite.scala     | 34 +++---
 2 files changed, 28 insertions(+), 12 deletions(-)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (122c899 -> 6fa80ed)
This is an automated email from the ASF dual-hosted git repository. viirya pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 122c899 [SPARK-33251][FOLLOWUP][PYTHON][DOCS][MINOR] Adjusts returns PrefixSpan.findFrequentSequentialPatterns add 6fa80ed [SPARK-7][SQL] Support subexpression elimination in branches of conditional expressions No new revisions were added by this update. Summary of changes: .../expressions/EquivalentExpressions.scala| 96 ++ .../expressions/codegen/CodeGenerator.scala| 2 +- .../SubexpressionEliminationSuite.scala| 111 +++-- 3 files changed, 177 insertions(+), 32 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (3165ca7 -> 122c899)
This is an automated email from the ASF dual-hosted git repository. huaxingao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 3165ca7 [SPARK-33376][SQL] Remove the option of "sharesHadoopClasses" in Hive IsolatedClientLoader add 122c899 [SPARK-33251][FOLLOWUP][PYTHON][DOCS][MINOR] Adjusts returns PrefixSpan.findFrequentSequentialPatterns No new revisions were added by this update. Summary of changes: python/pyspark/ml/fpm.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (34f5e7c -> 3165ca7)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 34f5e7c [SPARK-33302][SQL] Push down filters through Expand add 3165ca7 [SPARK-33376][SQL] Remove the option of "sharesHadoopClasses" in Hive IsolatedClientLoader No new revisions were added by this update. Summary of changes: .../spark/sql/hive/client/IsolatedClientLoader.scala | 16 .../spark/sql/hive/client/HadoopVersionInfoSuite.scala | 3 +-- .../apache/spark/sql/hive/client/HiveClientBuilder.scala | 6 ++ .../sql/hive/client/HivePartitionFilteringSuite.scala| 4 .../apache/spark/sql/hive/client/HiveVersionSuite.scala | 7 ++- 5 files changed, 9 insertions(+), 27 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (4934da56 -> 34f5e7c)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 4934da56 [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache add 34f5e7c [SPARK-33302][SQL] Push down filters through Expand No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/optimizer/Optimizer.scala | 1 + .../catalyst/optimizer/FilterPushdownSuite.scala | 24 +- .../optimizer/LeftSemiAntiJoinPushDownSuite.scala | 15 ++ 3 files changed, 39 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (27bb40b -> 4934da56)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 27bb40b [SPARK-9][PYTHON] Pyspark application will hang due to non Exception error add 4934da56 [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache No new revisions were added by this update. Summary of changes: .../execution/datasources/v2/DataSourceV2Strategy.scala | 2 +- .../sql/execution/datasources/v2/DropTableExec.scala | 7 ++- .../spark/sql/connector/DataSourceV2SQLSuite.scala | 16 3 files changed, 23 insertions(+), 2 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 4934da56 [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache 4934da56 is described below commit 4934da56bcc13fc61afc8e8cc44fb5290b5e7b32 Author: Chao Sun AuthorDate: Tue Nov 10 14:37:42 2020 + [SPARK-33305][SQL] DSv2: DROP TABLE command should also invalidate cache ### What changes were proposed in this pull request? This changes `DropTableExec` to also invalidate caches referencing the table to be dropped, in a cascading manner. ### Why are the changes needed? In DSv1, the `DROP TABLE` command also invalidates caches, as described in [SPARK-19765](https://issues.apache.org/jira/browse/SPARK-19765). In DSv2, however, the same command only drops the table and does not touch the caches, which could lead to correctness issues. ### Does this PR introduce _any_ user-facing change? Yes. The DSv2 `DROP TABLE` command now also invalidates caches. ### How was this patch tested? Added a new UT. Closes #30211 from sunchao/SPARK-33305. 
Authored-by: Chao Sun Signed-off-by: Wenchen Fan --- .../execution/datasources/v2/DataSourceV2Strategy.scala | 2 +- .../sql/execution/datasources/v2/DropTableExec.scala | 7 ++- .../spark/sql/connector/DataSourceV2SQLSuite.scala | 16 3 files changed, 23 insertions(+), 2 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala index 817b3ce..5695d23 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Strategy.scala @@ -229,7 +229,7 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat throw new AnalysisException("Describing columns is not supported for v2 tables.") case DropTable(r: ResolvedTable, ifExists, purge) => - DropTableExec(r.catalog, r.identifier, ifExists, purge) :: Nil + DropTableExec(session, r.catalog, r.table, r.identifier, ifExists, purge) :: Nil case _: NoopDropTable => LocalTableScanExec(Nil, Nil) :: Nil diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropTableExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropTableExec.scala index 1fd0cd1..068475f 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropTableExec.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DropTableExec.scala @@ -17,22 +17,27 @@ package org.apache.spark.sql.execution.datasources.v2 +import org.apache.spark.sql.SparkSession import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.catalyst.analysis.NoSuchTableException import org.apache.spark.sql.catalyst.expressions.Attribute -import org.apache.spark.sql.connector.catalog.{Identifier, TableCatalog} +import 
org.apache.spark.sql.connector.catalog.{Identifier, Table, TableCatalog} /** * Physical plan node for dropping a table. */ case class DropTableExec( +session: SparkSession, catalog: TableCatalog, +table: Table, ident: Identifier, ifExists: Boolean, purge: Boolean) extends V2CommandExec { override def run(): Seq[InternalRow] = { if (catalog.tableExists(ident)) { + val v2Relation = DataSourceV2Relation.create(table, Some(catalog), Some(ident)) + session.sharedState.cacheManager.uncacheQuery(session, v2Relation, cascade = true) catalog.dropTable(ident, purge) } else if (!ifExists) { throw new NoSuchTableException(ident) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala index ee3f7be..dfa32b9 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala @@ -784,6 +784,22 @@ class DataSourceV2SQLSuite } } + test("SPARK-33305: DROP TABLE should also invalidate cache") { +val t = "testcat.ns.t" +val view = "view" +withTable(t) { + withTempView(view) { +sql(s"CREATE TABLE $t USING foo AS SELECT id, data FROM source") +sql(s"CACHE TABLE $view AS SELECT id FROM $t") +checkAnswer(sql(s"SELECT * FROM $t"), spark.table("source")) +checkAnswer(sql(s"SELECT * FROM $view"), spark.table("source").select("id")) + +sql(s"DROP
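The diff above shows `DropTableExec` calling `cacheManager.uncacheQuery(session, v2Relation, cascade = true)` before `catalog.dropTable(ident, purge)`. The following is a minimal, self-contained sketch of that uncache-then-drop ordering; `ToyCacheManager`, `CachedQuery`, and `uncacheReferencing` are hypothetical names invented for illustration, not Spark APIs.

```scala
// Toy model of the cascading cache invalidation SPARK-33305 adds to
// DSv2 DropTableExec. Only the uncache-before-drop behaviour mirrors
// the real change; everything else is simplified.
object DropTableCacheSketch {
  // A cached query plan, tagged with the tables it reads.
  final case class CachedQuery(name: String, referencedTables: Set[String])

  final class ToyCacheManager {
    private var cached: List[CachedQuery] = Nil

    def cache(q: CachedQuery): Unit = cached = q :: cached

    def cachedNames: Set[String] = cached.map(_.name).toSet

    // Analogue of uncacheQuery(..., cascade = true): drop every cached
    // entry that references the table being dropped.
    def uncacheReferencing(table: String): Unit =
      cached = cached.filterNot(_.referencedTables.contains(table))
  }

  // Mirrors the order in DropTableExec.run(): invalidate dependent cache
  // entries first, then remove the table from the catalog.
  def dropTable(
      cm: ToyCacheManager,
      catalog: scala.collection.mutable.Set[String],
      table: String): Unit = {
    cm.uncacheReferencing(table) // behaviour added by this commit
    catalog -= table             // stands in for catalog.dropTable(ident, purge)
  }
}
```

In the test added by the PR, a temp view cached over `testcat.ns.t` must disappear from the cache once the table is dropped; in this toy model, dropping `testcat.ns.t` removes any `CachedQuery` referencing it while unrelated entries survive.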
[spark] branch branch-2.4 updated (bfeaef1 -> efceeee)
This is an automated email from the ASF dual-hosted git repository. yumwang pushed a change to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git. from bfeaef1 [SPARK-33405][BUILD][2.4] Upgrade commons-compress to 1.20 add efceeee [SPARK-33372][SQL][2.4] Fix InSet bucket pruning No new revisions were added by this update. Summary of changes: .../apache/spark/sql/execution/datasources/FileSourceStrategy.scala | 5 ++--- .../test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala | 2 +- 2 files changed, 3 insertions(+), 4 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33372][SQL][2.4] Fix InSet bucket pruning
This is an automated email from the ASF dual-hosted git repository. yumwang pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new efceeee [SPARK-33372][SQL][2.4] Fix InSet bucket pruning efceeee is described below commit efcd7ebfd498b3010ef35d2c1388c2319c53 Author: Yuming Wang AuthorDate: Tue Nov 10 20:30:53 2020 +0800 [SPARK-33372][SQL][2.4] Fix InSet bucket pruning ### What changes were proposed in this pull request? This is a backport of #30279. This PR fixes `InSet` bucket pruning; its values are raw values, not `Literal`s: https://github.com/apache/spark/blob/cbd3fdea62dab73fc4a96702de8fd1f07722da66/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala#L253-L255 ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Unit test. Closes #30308 from wangyum/SPARK-33372-2.4. 
Authored-by: Yuming Wang Signed-off-by: Yuming Wang --- .../apache/spark/sql/execution/datasources/FileSourceStrategy.scala | 5 ++--- .../test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala | 2 +- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala index fe27b78..9467293 100644 --- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala +++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala @@ -89,9 +89,8 @@ object FileSourceStrategy extends Strategy with Logging { case expressions.In(a: Attribute, list) if list.forall(_.isInstanceOf[Literal]) && a.name == bucketColumnName => getBucketSetFromIterable(a, list.map(e => e.eval(EmptyRow))) - case expressions.InSet(a: Attribute, hset) -if hset.forall(_.isInstanceOf[Literal]) && a.name == bucketColumnName => -getBucketSetFromIterable(a, hset.map(e => expressions.Literal(e).eval(EmptyRow))) + case expressions.InSet(a: Attribute, hset) if a.name == bucketColumnName => +getBucketSetFromIterable(a, hset) case expressions.IsNull(a: Attribute) if a.name == bucketColumnName => getBucketSetFromValue(a, null) case expressions.And(left, right) => diff --git a/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala index 42443b0..c01b7db 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala @@ -173,7 +173,7 @@ abstract class BucketedReadSuite extends QueryTest with SQLTestUtils { df) // Case 4: InSet -val inSetExpr = expressions.InSet($"j".expr, Set(j, j + 1, j + 2, j + 3).map(lit(_).expr)) +val inSetExpr = 
expressions.InSet($"j".expr, Set(j, j + 1, j + 2, j + 3)) checkPrunedAnswers( bucketSpec, bucketValues = Seq(j, j + 1, j + 2, j + 3), - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
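The diff above removes the `hset.forall(_.isInstanceOf[Literal])` guard: `InSet` holds raw Scala values, never `Literal` expressions, so the old guard could never match and bucket pruning for `InSet` never fired. A rough sketch of what `getBucketSetFromIterable` then does with those raw values follows; the `hashCode`-modulo bucket scheme here is a stand-in assumption, since Spark's real bucket id comes from Murmur3 hashing via `HashPartitioning`, and `bucketSet`/`nonNegativeMod` are illustrative names.

```scala
// Sketch: mapping a set of raw InSet values to the set of bucket files
// worth scanning. Hedged simplification: hashCode stands in for Spark's
// Murmur3-based bucket hashing.
object InSetPruningSketch {
  // Keep the modulus non-negative, like Spark's Utils.nonNegativeMod.
  private def nonNegativeMod(x: Int, mod: Int): Int = {
    val rawMod = x % mod
    rawMod + (if (rawMod < 0) mod else 0)
  }

  // Analogue of getBucketSetFromIterable over InSet's value set:
  // each raw value (no Literal wrapping, no eval needed) selects
  // exactly one bucket to keep; all other buckets can be skipped.
  def bucketSet(hset: Set[Any], numBuckets: Int): Set[Int] =
    hset.map(v => nonNegativeMod(v.hashCode, numBuckets))
}
```

With 8 buckets and a predicate like `j IN (13, 14, 15, 16)`, this maps the four values to at most four bucket ids, so at least half the bucket files are skipped instead of scanning all of them.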
[spark] branch branch-2.4 updated: [SPARK-33372][SQL][2.4] Fix InSet bucket pruning
This is an automated email from the ASF dual-hosted git repository.

yumwang pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new efc  [SPARK-33372][SQL][2.4] Fix InSet bucket pruning
efc is described below

commit efcd7ebfd498b3010ef35d2c1388c2319c53
Author: Yuming Wang
AuthorDate: Tue Nov 10 20:30:53 2020 +0800

    [SPARK-33372][SQL][2.4] Fix InSet bucket pruning

    ### What changes were proposed in this pull request?

    This is a backport of #30279. This PR fixes `InSet` bucket pruning: the values in an `InSet`'s `hset` are raw values, not `Literal` expressions, so the old guard `hset.forall(_.isInstanceOf[Literal])` never matched and bucket pruning was skipped:
    https://github.com/apache/spark/blob/cbd3fdea62dab73fc4a96702de8fd1f07722da66/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala#L253-L255

    ### Why are the changes needed?

    Bug fix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Unit test.

    Closes #30308 from wangyum/SPARK-33372-2.4.

    Authored-by: Yuming Wang
    Signed-off-by: Yuming Wang
---
 .../apache/spark/sql/execution/datasources/FileSourceStrategy.scala | 5 ++---
 .../test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala | 2 +-
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
index fe27b78..9467293 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala
@@ -89,9 +89,8 @@ object FileSourceStrategy extends Strategy with Logging {
       case expressions.In(a: Attribute, list)
         if list.forall(_.isInstanceOf[Literal]) && a.name == bucketColumnName =>
         getBucketSetFromIterable(a, list.map(e => e.eval(EmptyRow)))
-      case expressions.InSet(a: Attribute, hset)
-        if hset.forall(_.isInstanceOf[Literal]) && a.name == bucketColumnName =>
-        getBucketSetFromIterable(a, hset.map(e => expressions.Literal(e).eval(EmptyRow)))
+      case expressions.InSet(a: Attribute, hset) if a.name == bucketColumnName =>
+        getBucketSetFromIterable(a, hset)
       case expressions.IsNull(a: Attribute) if a.name == bucketColumnName =>
         getBucketSetFromValue(a, null)
       case expressions.And(left, right) =>
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala
index 42443b0..c01b7db 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala
@@ -173,7 +173,7 @@ abstract class BucketedReadSuite extends QueryTest with SQLTestUtils {
           df)
         // Case 4: InSet
-        val inSetExpr = expressions.InSet($"j".expr, Set(j, j + 1, j + 2, j + 3).map(lit(_).expr))
+        val inSetExpr = expressions.InSet($"j".expr, Set(j, j + 1, j + 2, j + 3))
         checkPrunedAnswers(
           bucketSpec,
           bucketValues = Seq(j, j + 1, j + 2, j + 3),

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
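After the fix, the `hset` of an `InSet` predicate is handed to `getBucketSetFromIterable` as raw values. The idea behind bucket pruning can be sketched in Python (an illustrative model only: Spark's real bucketing hashes with Murmur3, and the `hash()` call and function names here are stand-ins, not Spark APIs):

```python
def bucket_ids_for_values(values, num_buckets):
    """Model of getBucketSetFromIterable: each literal value in the
    predicate maps to exactly one bucket id, so the scan can be limited
    to the union of those buckets."""
    return {hash(v) % num_buckets for v in values}

# A predicate like `j IN (10, 11, 12, 13)` over 8 buckets touches at
# most 4 of the 8 buckets, so the other files are skipped entirely.
pruned = bucket_ids_for_values({10, 11, 12, 13}, num_buckets=8)
assert len(pruned) <= 4
assert pruned <= set(range(8))
```

Before the fix, the guard rejected every `InSet`, so `pruned` was effectively "all buckets" and no files were skipped.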
[spark] branch branch-3.0 updated: [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 4a1c143  [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error
4a1c143 is described below

commit 4a1c143f1a042a9a23d00929670eadbdb1afca11
Author: lrz
AuthorDate: Tue Nov 10 19:39:18 2020 +0900

    [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error

    ### What changes were proposed in this pull request?

    When a SystemExit exception is raised during processing, the Python worker exits abnormally while the executor task keeps waiting to read from the worker's socket, so the application hangs. The SystemExit may come from the user's own code, but Spark should at least surface an error to remind the user, not get stuck. A simple test reproduces the case:

    ```
    from pyspark.sql import SparkSession

    def err(line):
        raise SystemExit

    spark = SparkSession.builder.appName("test").getOrCreate()
    spark.sparkContext.parallelize(range(1, 2), 2).map(err).collect()
    spark.stop()
    ```

    ### Why are the changes needed?

    To make sure a PySpark application won't hang when a non-Exception error is raised in a Python worker.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Added a new test and also manually tested the case above.

    Closes #30248 from li36909/pyspark.

    Lead-authored-by: lrz
    Co-authored-by: Hyukjin Kwon
    Signed-off-by: HyukjinKwon

(cherry picked from commit 27bb40b6297361985e3590687f0332a72b71bc85)
Signed-off-by: HyukjinKwon
---
 python/pyspark/tests/test_worker.py | 9 +++++++++
 python/pyspark/worker.py            | 4 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/python/pyspark/tests/test_worker.py b/python/pyspark/tests/test_worker.py
index bfcbc43..f51d4b2 100644
--- a/python/pyspark/tests/test_worker.py
+++ b/python/pyspark/tests/test_worker.py
@@ -98,6 +98,15 @@ class WorkerTests(ReusedPySparkTestCase):
             self.assertRaises(Exception, lambda: rdd.foreach(raise_exception))
         self.assertEqual(100, rdd.map(str).count())

+    def test_after_non_exception_error(self):
+        # SPARK-33339: Pyspark application will hang due to non Exception
+        def raise_system_exit(_):
+            raise SystemExit()
+        rdd = self.sc.parallelize(range(100), 1)
+        with QuietTest(self.sc):
+            self.assertRaises(Exception, lambda: rdd.foreach(raise_system_exit))
+        self.assertEqual(100, rdd.map(str).count())
+
     def test_after_jvm_exception(self):
         tempFile = tempfile.NamedTemporaryFile(delete=False)
         tempFile.write(b"Hello World!")
diff --git a/python/pyspark/worker.py b/python/pyspark/worker.py
index 814f796..0bce87d 100644
--- a/python/pyspark/worker.py
+++ b/python/pyspark/worker.py
@@ -608,7 +608,7 @@ def main(infile, outfile):
         # reuse.
         TaskContext._setTaskContext(None)
         BarrierTaskContext._setTaskContext(None)
-    except Exception:
+    except BaseException:
         try:
             exc_info = traceback.format_exc()
             if isinstance(exc_info, bytes):
@@ -622,7 +622,7 @@ def main(infile, outfile):
             except IOError:
                 # JVM close the socket
                 pass
-        except Exception:
+        except BaseException:
             # Write the error to stderr if it happened while serializing
             print("PySpark worker failed with exception:", file=sys.stderr)
             print(traceback.format_exc(), file=sys.stderr)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
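The reason the patch widens `except Exception` to `except BaseException` is Python's exception hierarchy: `SystemExit` (like `KeyboardInterrupt` and `GeneratorExit`) derives directly from `BaseException`, not from `Exception`, so the worker's old handler never saw it and silently died without reporting back to the executor. A minimal standalone demonstration:

```python
def catches(exc, catch_type):
    """Return True if `raise exc` would be handled by `except catch_type`."""
    try:
        raise exc
    except catch_type:
        return True
    except BaseException:
        return False

# An ordinary error is caught by `except Exception`:
assert catches(ValueError(), Exception) is True

# SystemExit slips past `except Exception` -- this is why the old
# worker handler missed it and the executor hung on the socket:
assert catches(SystemExit(), Exception) is False

# The new `except BaseException` handler catches it:
assert catches(SystemExit(), BaseException) is True
```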
[spark] branch master updated (e3a768d -> 27bb40b)
This is an automated email from the ASF dual-hosted git repository.

gurwls223 pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from e3a768d  [SPARK-33391][SQL] element_at with CreateArray not respect one based index
 add 27bb40b  [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error

No new revisions were added by this update.

Summary of changes:
 python/pyspark/tests/test_worker.py | 9 +++++++++
 python/pyspark/worker.py            | 4 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (1aa8f4f -> b905d65)
This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 1aa8f4f  [SPARK-33405][BUILD][3.0] Upgrade commons-compress to 1.20
 add b905d65  [SPARK-33391][SQL] element_at with CreateArray not respect one based index

No new revisions were added by this update.

Summary of changes:
 .../expressions/collectionOperations.scala       | 30 +
 .../expressions/CollectionExpressionsSuite.scala | 38 +-
 2 files changed, 60 insertions(+), 8 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
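For context on SPARK-33391: Spark SQL's `element_at` is documented to use one-based indexing (index 1 is the first element, negative indexes count from the end), which the `CreateArray` code path did not respect. The documented semantics can be modeled as follows (an illustrative Python sketch of the contract, not the Catalyst implementation the commit patches; out-of-range handling here assumes the non-ANSI NULL-returning behavior):

```python
def element_at(arr, index):
    """Model of Spark SQL element_at: one-based positive index,
    negative index counts from the end, NULL (None) when out of range."""
    if index == 0:
        raise ValueError("SQL array indexes start at 1")
    pos = index - 1 if index > 0 else len(arr) + index
    return arr[pos] if 0 <= pos < len(arr) else None

assert element_at(["a", "b", "c"], 1) == "a"   # one-based, not zero-based
assert element_at(["a", "b", "c"], -1) == "c"  # counts from the end
assert element_at(["a", "b", "c"], 4) is None  # out of range -> NULL
```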