[spark] branch master updated: [SPARK-33570][SQL][TESTS] Set the proper version of gssapi plugin automatically for MariaDBKrbIntegrationSuite
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new cf98a76  [SPARK-33570][SQL][TESTS] Set the proper version of gssapi plugin automatically for MariaDBKrbIntegrationSuite
cf98a76 is described below

commit cf98a761de677c733f3c33230e1c63ddb785d5c5
Author: Kousuke Saruta
AuthorDate: Sat Nov 28 23:38:11 2020 +0900

    [SPARK-33570][SQL][TESTS] Set the proper version of gssapi plugin automatically for MariaDBKrbIntegrationSuite

    ### What changes were proposed in this pull request?

    This PR changes mariadb_docker_entrypoint.sh to set the proper version of mariadb-plugin-gssapi-server automatically, based on the installed version of mariadb-server. This PR also makes it possible to use an arbitrary Docker image by setting the environment variable `MARIADB_DOCKER_IMAGE_NAME`.

    ### Why are the changes needed?

    For `MariaDBKrbIntegrationSuite`, the version of `mariadb-plugin-gssapi-server` is currently pinned to `10.5.5` in `mariadb_docker_entrypoint.sh`, but that version is no longer available in the official apt repository, so `MariaDBKrbIntegrationSuite` does not pass at the moment. It seems that only the three most recent versions are available for each major version, currently `10.5.6`, `10.5.7`, and `10.5.8`. Further, the release cycle of MariaDB appears to be very rapid (1-2 months), so pinning `mariadb-plugin-gssapi-server` to a specific version is not a good idea.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Confirmed that `MariaDBKrbIntegrationSuite` passes with the following commands.

    ```
    $ build/sbt -Pdocker-integration-tests -Phive -Phive-thriftserver package "testOnly org.apache.spark.sql.jdbc.MariaDBKrbIntegrationSuite"
    ```

    In this case, the version of `mariadb-plugin-gssapi-server` that is going to be installed appears in the container log:

    ```
    Installing mariadb-plugin-gssapi-server=1:10.5.8+maria~focal
    ```

    Alternatively, we can set `MARIADB_DOCKER_IMAGE_NAME` to pick a specific version of MariaDB:

    ```
    $ MARIADB_DOCKER_IMAGE_NAME=mariadb:10.5.6 build/sbt -Pdocker-integration-tests -Phive -Phive-thriftserver package "testOnly org.apache.spark.sql.jdbc.MariaDBKrbIntegrationSuite"
    ```

    ```
    Installing mariadb-plugin-gssapi-server=1:10.5.6+maria~focal
    ```

    Closes #30515 from sarutak/fix-MariaDBKrbIntegrationSuite.

    Authored-by: Kousuke Saruta
    Signed-off-by: Takeshi Yamamuro
---
 .../src/test/resources/mariadb_docker_entrypoint.sh        |  4 +++-
 .../apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala | 12 +---
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/external/docker-integration-tests/src/test/resources/mariadb_docker_entrypoint.sh b/external/docker-integration-tests/src/test/resources/mariadb_docker_entrypoint.sh
index 97c00a9..ab7d967 100755
--- a/external/docker-integration-tests/src/test/resources/mariadb_docker_entrypoint.sh
+++ b/external/docker-integration-tests/src/test/resources/mariadb_docker_entrypoint.sh
@@ -18,7 +18,9 @@
 dpkg-divert --add /bin/systemctl && ln -sT /bin/true /bin/systemctl
 apt update
-apt install -y mariadb-plugin-gssapi-server=1:10.5.5+maria~focal
+GSSAPI_PLUGIN=mariadb-plugin-gssapi-server=$(dpkg -s mariadb-server | sed -n "s/^Version: \(.*\)/\1/p")
+echo "Installing $GSSAPI_PLUGIN"
+apt install -y "$GSSAPI_PLUGIN"
 echo "gssapi_keytab_path=/docker-entrypoint-initdb.d/mariadb.keytab" >> /etc/mysql/mariadb.conf.d/auth_gssapi.cnf
 echo "gssapi_principal_name=mariadb/__ip_address_replace_m...@example.com" >> /etc/mysql/mariadb.conf.d/auth_gssapi.cnf
 docker-entrypoint.sh mysqld
diff --git a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala
index adee2be..59a6f53 100644
--- a/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala
+++ b/external/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/MariaDBKrbIntegrationSuite.scala
@@ -24,15 +24,21 @@
 import com.spotify.docker.client.messages.{ContainerConfig, HostConfig}
 import org.apache.spark.sql.execution.datasources.jdbc.connection.SecureConnectionProvider
 import org.apache.spark.tags.DockerTest
+/**
+ * To run this test suite for a specific version (e.g., mariadb:10.5.8):
+ * {{{
+ *   MARIADB_DOCKER_IMAGE_NAME=mariadb:10.5.8
+ *     ./build/sbt -Pdocker-integration-tests
+ *     "testOnly org.apache.spark.s
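The version-detection step added to mariadb_docker_entrypoint.sh can be sketched in isolation. This is a minimal stand-alone version of the same pipeline; the `dpkg -s` output below is sample data (an assumption, since no real package database is queried here):

```shell
#!/bin/sh
# Derive the gssapi plugin version from the installed mariadb-server package
# instead of hard-coding it. In the real entrypoint the first command is
# `dpkg -s mariadb-server`; here we substitute representative sample output.
dpkg_output='Package: mariadb-server
Version: 1:10.5.8+maria~focal
Architecture: all'

# Extract the "Version:" field, exactly as the patched entrypoint does.
version=$(printf '%s\n' "$dpkg_output" | sed -n "s/^Version: \(.*\)/\1/p")
GSSAPI_PLUGIN="mariadb-plugin-gssapi-server=$version"
echo "Installing $GSSAPI_PLUGIN"
# prints: Installing mariadb-plugin-gssapi-server=1:10.5.8+maria~focal
```

Because the pinned package version always tracks whatever `mariadb-server` the base image ships, the `apt install` line no longer breaks when old plugin versions drop out of the apt repository.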
[spark] branch master updated (9273d42 -> cf4ad21)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 9273d42  [SPARK-33045][SQL][FOLLOWUP] Support built-in function like_any and fix StackOverflowError issue
  add cf4ad21  [SPARK-33503][SQL] Refactor SortOrder class to allow multiple childrens

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/analysis/Analyzer.scala     |  2 +-
 .../apache/spark/sql/catalyst/dsl/package.scala    |  4 ++--
 .../spark/sql/catalyst/expressions/SortOrder.scala | 10 +
 .../spark/sql/catalyst/parser/AstBuilder.scala     |  2 +-
 .../main/scala/org/apache/spark/sql/Column.scala   |  8 +++
 .../sql/execution/AliasAwareOutputExpression.scala |  6 +
 .../sql/execution/joins/SortMergeJoinExec.scala    |  9
 .../apache/spark/sql/execution/PlannerSuite.scala  | 26 ++
 8 files changed, 46 insertions(+), 21 deletions(-)

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (f62e957 -> 7466031)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from f62e957  [SPARK-33873][CORE][TESTS] Test all compression codecs with encrypted spilling
  add 7466031  [SPARK-32106][SQL] Implement script transform in sql/core

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/parser/AstBuilder.scala     |  52 ++-
 .../sql/catalyst/parser/PlanParserSuite.scala      | 113 ++-
 .../apache/spark/sql/execution/SparkPlanner.scala  |   1 +
 .../execution/SparkScriptTransformationExec.scala  |  91 ++
 .../spark/sql/execution/SparkSqlParser.scala       | 115 ---
 .../spark/sql/execution/SparkStrategies.scala      |  14 +
 .../test/resources/sql-tests/inputs/transform.sql  | 195 +++
 .../resources/sql-tests/results/transform.sql.out  | 357 +
 .../org/apache/spark/sql/SQLQueryTestSuite.scala   |   5 +-
 .../execution/SparkScriptTransformationSuite.scala | 102 ++
 .../execution/HiveScriptTransformationExec.scala   |   2 +
 11 files changed, 982 insertions(+), 65 deletions(-)
 create mode 100644 sql/core/src/main/scala/org/apache/spark/sql/execution/SparkScriptTransformationExec.scala
 create mode 100644 sql/core/src/test/resources/sql-tests/inputs/transform.sql
 create mode 100644 sql/core/src/test/resources/sql-tests/results/transform.sql.out
 create mode 100644 sql/core/src/test/scala/org/apache/spark/sql/execution/SparkScriptTransformationSuite.scala
[spark] branch master updated (65a9ac2 -> 10b6466)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 65a9ac2  [SPARK-30027][SQL] Support codegen for aggregate filters in HashAggregateExec
  add 10b6466  [SPARK-33084][CORE][SQL] Add jar support ivy path

No new revisions were added by this update.

Summary of changes:
 .../main/scala/org/apache/spark/SparkContext.scala |  45 ---
 .../org/apache/spark/deploy/SparkSubmit.scala      |   8 +-
 .../apache/spark/deploy/worker/DriverWrapper.scala |  16 +--
 .../spark/{deploy => util}/DependencyUtils.scala   | 137 -
 .../scala/org/apache/spark/SparkContextSuite.scala | 116 +
 .../org/apache/spark/deploy/SparkSubmitSuite.scala |   2 +-
 .../spark/deploy/SparkSubmitUtilsSuite.scala       |  14 ++-
 .../org/apache/spark/util/DependencyUtils.scala    |  60 +
 docs/sql-ref-syntax-aux-resource-mgmt-add-jar.md   |  16 ++-
 .../apache/spark/sql/internal/SessionState.scala   |  30 +++--
 sql/core/src/test/resources/SPARK-33084.jar        | Bin 0 -> 6322 bytes
 .../scala/org/apache/spark/sql/SQLQuerySuite.scala |  54
 .../spark/sql/hive/HiveSessionStateBuilder.scala   |   9 +-
 .../sql/hive/client/IsolatedClientLoader.scala     |   1 +
 .../spark/sql/hive/execution/HiveQuerySuite.scala  |  17 +++
 15 files changed, 475 insertions(+), 50 deletions(-)
 rename core/src/main/scala/org/apache/spark/{deploy => util}/DependencyUtils.scala (54%)
 create mode 100644 core/src/test/scala/org/apache/spark/util/DependencyUtils.scala
 create mode 100644 sql/core/src/test/resources/SPARK-33084.jar
[spark] branch master updated (1339168 -> 3c8be39)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 1339168  [SPARK-33756][SQL] Make BytesToBytesMap's MapIterator idempotent
  add 3c8be39  [SPARK-33850][SQL][FOLLOWUP] Improve and cleanup the test code

No new revisions were added by this update.

Summary of changes:
 .../scala/org/apache/spark/sql/ExplainSuite.scala | 25 --
 1 file changed, 9 insertions(+), 16 deletions(-)
[spark] branch master updated (090962c -> 036c11b)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 090962c  [SPARK-33251][PYTHON][DOCS] Migration to NumPy documentation style in ML (pyspark.ml.*)
  add 036c11b  [SPARK-33397][YARN][DOC] Fix generating md to html for available-patterns-for-shs-custom-executor-log-url

No new revisions were added by this update.

Summary of changes:
 docs/running-on-yarn.md | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)
[spark] branch branch-3.0 updated (c157fa3 -> a418495)
yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from c157fa3  [SPARK-33372][SQL] Fix InSet bucket pruning
  add a418495  [SPARK-33397][YARN][DOC] Fix generating md to html for available-patterns-for-shs-custom-executor-log-url

No new revisions were added by this update.

Summary of changes:
 docs/running-on-yarn.md | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)
[spark] branch master updated (6fa80ed -> 4634694)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 6fa80ed  [SPARK-7][SQL] Support subexpression elimination in branches of conditional expressions
  add 4634694  [SPARK-33404][SQL] Fix incorrect results in `date_trunc` expression

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/util/DateTimeUtils.scala    |  6 ++--
 .../sql/catalyst/util/DateTimeUtilsSuite.scala     | 34 +++---
 2 files changed, 28 insertions(+), 12 deletions(-)
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new fece4a3  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
fece4a3 is described below

commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

    [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

    ### What changes were proposed in this pull request?

    This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use the option `--query-filter` to select the TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But the current master has odd behaviour for this option: if we pass `--query-filter q6` so as to run only the TPCDS q6 query, `TPCDSQueryBenchmark` runs both `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So there is currently no way to run only the TPCDS q6 query.

    ### Why are the changes needed?

    Bugfix.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    Manually checked.

    Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

    Authored-by: Takeshi Yamamuro
    Signed-off-by: Takeshi Yamamuro
    (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
    Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index fccee97..1f8b057 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging {
     }
   }
 
-  def filterQueries(
+  private def filterQueries(
       origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")
 
     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging {
       "q80a", "q86a", "q98")
 
     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)
 
     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }
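The behaviour change in `filterQueries` can be illustrated with a small, self-contained sketch. This is plain shell rather than the Scala code above, and `filter_queries` is a hypothetical helper that mirrors the patched method: a filter entry must match the query name including its suffix, so `q6` no longer also selects `q6-v2.7`.

```shell
#!/bin/sh
# filter_queries NAMES FILTER SUFFIX
# Prints every name in NAMES whose suffixed form "$name$SUFFIX"
# appears in FILTER (both are space-separated lists).
filter_queries() {
  for name in $1; do
    for f in $2; do
      [ "$name$3" = "$f" ] && printf '%s\n' "$name"
    done
  done
}

# v1.4 queries have no suffix, so "q6" selects q6 as before.
v14=$(filter_queries "q6 q8 q13" "q6" "")
# v2.7 queries carry the "-v2.7" suffix: "q6" does NOT match "q6-v2.7",
# so passing --query-filter q6 now runs only the v1.4 query.
v27=$(filter_queries "q5a q6 q10a" "q6" "-v2.7")
# To run the v2.7 variant, the filter must name it explicitly.
v27sel=$(filter_queries "q5a q6 q10a" "q6-v2.7" "-v2.7")

echo "v1.4 matches: $v14"      # prints: v1.4 matches: q6
echo "v2.7 matches: $v27"      # empty: q6 alone no longer selects q6-v2.7
echo "v2.7 explicit: $v27sel"  # prints: v2.7 explicit: q6
```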
[spark] branch master updated (6d5d030 -> 4b36797)
yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 6d5d030  [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier
  add 4b36797  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

No new revisions were added by this update.

Summary of changes:
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 577dbb9  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

577dbb9 is described below

commit 577dbb96835f13f4cd92ea4caab9e6dece00be50
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

[SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

### What changes were proposed in this pull request?

This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only.

### Why are the changes needed?

Bugfix.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually checked.

Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

Authored-by: Takeshi Yamamuro
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index 7bbf079..43bc7c1 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     }
   }

-  def filterQueries(
+  private def filterQueries(
       origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")

     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
       "q80a", "q86a", "q98")

     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)

     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
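The suffix-aware `filterQueries` logic in the patch above can be sketched as follows (plain Python is used here purely for illustration; the real implementation is the Scala `filterQueries` method in `TPCDSQueryBenchmark.scala`, and the function and variable names below are illustrative only):

```python
def filter_queries(orig_queries, query_filter, name_suffix=""):
    """Keep only the queries selected by --query-filter.

    When a name suffix is given (e.g. "-v2.7"), match against the
    suffixed name, so "q6" no longer accidentally selects "q6-v2.7".
    """
    if not query_filter:
        # No filter given: run everything.
        return list(orig_queries)
    if name_suffix:
        # Match the displayed name, i.e. base name plus suffix.
        return [q for q in orig_queries if q + name_suffix in query_filter]
    return [q for q in orig_queries if q in query_filter]

# With the fix, "q6" selects only the v1.4 query...
print(filter_queries(["q5", "q6", "q8"], {"q6"}))              # ['q6']
print(filter_queries(["q6", "q14a"], {"q6"}, "-v2.7"))         # []
# ...and "q6-v2.7" selects only the v2.7 variant.
print(filter_queries(["q6", "q14a"], {"q6-v2.7"}, "-v2.7"))    # ['q6']
```

Before the fix, both lists were filtered with plain name containment, which is why `--query-filter q6` ran both `q6` and `q6-v2.7`.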
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d5d030 [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier add 4b36797 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark No new revisions were added by this update. Summary of changes: .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new fece4a3  [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

fece4a3 is described below

commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f
Author: Takeshi Yamamuro
AuthorDate: Wed Nov 11 15:24:05 2020 +0900

[SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

### What changes were proposed in this pull request?

This PR intends to fix the behaviour of query filters in `TPCDSQueryBenchmark`. We can use an option `--query-filter` for selecting TPCDS queries to run, e.g., `--query-filter q6,q8,q13`. But, the current master has a weird behaviour about the option. For example, if we pass `--query-filter q6` so as to run the TPCDS q6 only, `TPCDSQueryBenchmark` runs `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So, there is no way now to run the TPCDS q6 only.

### Why are the changes needed?

Bugfix.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually checked.

Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark.

Authored-by: Takeshi Yamamuro
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731)
Signed-off-by: Takeshi Yamamuro
---
 .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
index fccee97..1f8b057 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala
@@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging {
     }
   }

-  def filterQueries(
+  private def filterQueries(
       origQueries: Seq[String],
-      args: TPCDSQueryBenchmarkArguments): Seq[String] = {
-    if (args.queryFilter.nonEmpty) {
-      origQueries.filter(args.queryFilter.contains)
+      queryFilter: Set[String],
+      nameSuffix: String = ""): Seq[String] = {
+    if (queryFilter.nonEmpty) {
+      if (nameSuffix.nonEmpty) {
+        origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") }
+      } else {
+        origQueries.filter(queryFilter.contains)
+      }
     } else {
       origQueries
     }
@@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging {
       "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99")

     // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones
+    val nameSuffixForQueriesV2_7 = "-v2.7"
     val tpcdsQueriesV2_7 = Seq(
       "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20",
       "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49",
@@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging {
       "q80a", "q86a", "q98")

     // If `--query-filter` defined, filters the queries that this option selects
-    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs)
-    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs)
+    val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter)
+    val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter,
+      nameSuffix = nameSuffixForQueriesV2_7)

     if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) {
       throw new RuntimeException(
@@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging {
     val tableSizes = setupTables(benchmarkArgs.dataLocation)
     runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes)
     runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes,
-      nameSuffix = "-v2.7")
+      nameSuffix = nameSuffixForQueriesV2_7)
   }
 }

-
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated (4a1c143 -> 577dbb9)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git. from 4a1c143 [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error add 577dbb9 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark No new revisions were added by this update. Summary of changes: .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (318a173 -> 9d58a2f)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 318a173 [SPARK-33402][CORE] Jobs launched in same second have duplicate MapReduce JobIDs add 9d58a2f [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples No new revisions were added by this update. Summary of changes: .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +- .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +- .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +- .../spark/examples/streaming/JavaSqlNetworkWordCount.java| 2 +- examples/src/main/python/ml/train_validation_split.py| 2 +- examples/src/main/python/sql/arrow.py| 4 ++-- .../main/python/streaming/recoverable_network_wordcount.py | 2 +- examples/src/main/python/streaming/sql_network_wordcount.py | 2 +- .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +- .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +- .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +- .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +- .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +- .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +- .../test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala | 2 +- .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++-- .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++-- .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++-- .../org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala | 2 +- .../spark/examples/streaming/JavaKinesisWordCountASL.java| 2 +- .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +- .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++--- .../spark/streaming/kinesis/KinesisUtilsPythonHelper.scala | 2 +- 
.../scala/org/apache/spark/graphx/lib/PageRankSuite.scala| 6 +++--- 25 files changed, 38 insertions(+), 38 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 9d58a2f  [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples

9d58a2f is described below

commit 9d58a2f0f0f308a03830bf183959a4743a77b78a
Author: Josh Soref
AuthorDate: Thu Nov 12 08:29:22 2020 +0900

[MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples

### What changes were proposed in this pull request?

This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710

NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356

### Why are the changes needed?

Misspelled words make it harder to read / understand content.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

No testing was performed

Closes #30326 from jsoref/spelling-graphx.

Authored-by: Josh Soref
Signed-off-by: Takeshi Yamamuro
---
 .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +-
 .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +-
 .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +-
 .../spark/examples/streaming/JavaSqlNetworkWordCount.java | 2 +-
 examples/src/main/python/ml/train_validation_split.py | 2 +-
 examples/src/main/python/sql/arrow.py | 4 ++--
 .../main/python/streaming/recoverable_network_wordcount.py | 2 +-
 examples/src/main/python/streaming/sql_network_wordcount.py | 2 +-
 .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +-
 .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +-
 .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +-
 .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +-
 .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +-
 .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +-
 .../test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala | 2 +-
 .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++--
 .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++--
 .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++--
 .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++--
 .../org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala | 2 +-
 .../spark/examples/streaming/JavaKinesisWordCountASL.java | 2 +-
 .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +-
 .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++---
 .../spark/streaming/kinesis/KinesisUtilsPythonHelper.scala | 2 +-
 .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala | 6 +++---
 25 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
index 47692ec..f84a197 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
@@ -67,7 +67,7 @@ public class JavaCustomReceiver extends Receiver {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000));

     // Create an input stream with the custom receiver on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
     JavaReceiverInputDStream lines = ssc.receiverStream(
       new JavaCustomReceiver(args[0], Integer.parseInt(args[1])));
     JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator());
diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
index b217672..d56134b 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
@@ -57,7 +57,7 @@ public final class JavaNetworkWordCount {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));

     // Create a JavaReceiverInputDStream on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
     // N
[spark] branch master updated (a70a2b0 -> 82a21d2)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a70a2b0 [SPARK-33439][INFRA] Use SERIAL_SBT_TESTS=1 for SQL modules add 82a21d2 [SPARK-33433][SQL] Change Aggregate max rows to 1 if grouping is empty No new revisions were added by this update. Summary of changes: .../catalyst/plans/logical/basicLogicalOperators.scala | 8 +++- .../sql/catalyst/optimizer/LimitPushdownSuite.scala| 18 ++ 2 files changed, 25 insertions(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
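The SPARK-33433 change above rests on a standard SQL fact: an aggregate with an empty grouping list always produces exactly one row, regardless of how many input rows there are (or even if there are none), so the optimizer can safely cap that node's max rows at 1. A minimal illustration of the underlying SQL semantics, using sqlite3 as a stand-in engine (this does not exercise Spark's optimizer itself):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k TEXT, v INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [("a", 1), ("a", 2), ("b", 3)])

# No GROUP BY: a global aggregate yields exactly one row, even over many input rows.
global_agg = conn.execute("SELECT max(v), count(*) FROM t").fetchall()
print(global_agg)        # [(3, 3)] -- one row

# Even when the input is empty, the global aggregate is still exactly one row.
empty_agg = conn.execute("SELECT max(v) FROM t WHERE v > 99").fetchall()
print(empty_agg)         # [(None,)] -- still one row

# With grouping keys, the row count depends on the number of distinct groups.
grouped = conn.execute("SELECT k, max(v) FROM t GROUP BY k").fetchall()
print(sorted(grouped))   # [('a', 2), ('b', 3)]
```

Because the max-rows bound is 1, rules such as limit pushdown (covered by the `LimitPushdownSuite` changes listed above) can treat a global aggregate like a single-row relation.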
[spark] branch master updated (9ab0f82 -> f5e3302)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9ab0f82 [SPARK-23499][MESOS] Support for priority queues in Mesos scheduler add f5e3302 [SPARK-33399][SQL] Normalize output partitioning and sortorder with respect to aliases to avoid unneeded exchange/sort nodes No new revisions were added by this update. Summary of changes: .../sql/execution/AliasAwareOutputExpression.scala | 32 +- .../approved-plans-v1_4/q2.sf100/explain.txt | 169 ++- .../approved-plans-v1_4/q2.sf100/simplified.txt| 97 +- .../approved-plans-v1_4/q23a.sf100/explain.txt | 782 +++--- .../approved-plans-v1_4/q23a.sf100/simplified.txt | 155 +-- .../approved-plans-v1_4/q23b.sf100/explain.txt | 1132 ++-- .../approved-plans-v1_4/q23b.sf100/simplified.txt | 241 +++-- .../approved-plans-v1_4/q95.sf100/explain.txt | 350 +++--- .../approved-plans-v1_4/q95.sf100/simplified.txt | 82 +- .../apache/spark/sql/execution/PlannerSuite.scala | 164 +++ 10 files changed, 1718 insertions(+), 1486 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (fbfc0bf -> 9a4c790)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from fbfc0bf  [SPARK-33464][INFRA] Add/remove (un)necessary cache and restructure GitHub Actions yaml
 add 9a4c790  [SPARK-33354][SQL] New explicit cast syntax rules in ANSI mode

No new revisions were added by this update.

Summary of changes:
 docs/sql-ref-ansi-compliance.md                    |  21 +
 .../spark/sql/catalyst/expressions/Cast.scala      | 118 ++-
 .../spark/sql/catalyst/expressions/CastSuite.scala | 850 +++--
 .../org/apache/spark/sql/sources/InsertSuite.scala |  41 +
 4 files changed, 635 insertions(+), 395 deletions(-)
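SPARK-33354 restricts which explicit CASTs are legal when ANSI mode is enabled, following the SQL standard's type-conversion table (the exact matrix lives in `docs/sql-ref-ansi-compliance.md`). The sketch below shows the general shape of such a rule table; the specific disallowed pairs here are illustrative examples, not Spark's authoritative list.

```python
# Toy rule table in the spirit of ANSI-mode explicit cast checking.
# The pairs below are illustrative assumptions; consult
# docs/sql-ref-ansi-compliance.md for Spark's actual matrix.

DISALLOWED_ANSI_CASTS = {
    ("date", "boolean"),
    ("timestamp", "boolean"),
    ("numeric", "binary"),
    ("binary", "numeric"),
}

def can_explicit_cast(from_type, to_type):
    """Return True when an explicit CAST is permitted under the toy rules."""
    if from_type == to_type:
        return True
    return (from_type, to_type) not in DISALLOWED_ANSI_CASTS

# string -> int stays castable (though it may still fail at runtime
# on malformed input); date -> boolean is rejected at analysis time.
assert can_explicit_cast("string", "int")
assert not can_explicit_cast("date", "boolean")
```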
[spark] branch master updated (56a8510 -> 4267ca9)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 56a8510  [SPARK-33304][R][SQL] Add from_avro and to_avro functions to SparkR
 add 4267ca9  [SPARK-33479][DOC] Make the API Key of DocSearch configurable

No new revisions were added by this update.

Summary of changes:
 docs/_config.yml          | 12
 docs/_layouts/global.html |  8 +---
 2 files changed, 13 insertions(+), 7 deletions(-)
[spark] branch branch-3.0 updated: [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.0 by this push:
     new 26c0404  [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples
26c0404 is described below

commit 26c0404214563bb558662e68ea73357c4f4021ed
Author: Josh Soref
AuthorDate: Tue Nov 17 15:25:42 2020 +0900

    [MINOR][GRAPHX][3.0] Correct typos in the sub-modules: graphx, external, and examples

    ### What changes were proposed in this pull request?

    This PR intends to fix typos in the sub-modules: graphx, external, and examples.
    Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710

    NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356

    Backport of #30326

    ### Why are the changes needed?

    Misspelled words make it harder to read / understand content.

    ### Does this PR introduce _any_ user-facing change?

    No

    ### How was this patch tested?

    No testing was performed

    Closes #30342 from jsoref/branch-3.0-30326.

    Authored-by: Josh Soref
    Signed-off-by: Takeshi Yamamuro
---
 .../apache/spark/examples/streaming/JavaCustomReceiver.java  |  2 +-
 .../spark/examples/streaming/JavaNetworkWordCount.java       |  2 +-
 .../examples/streaming/JavaRecoverableNetworkWordCount.java  |  2 +-
 .../spark/examples/streaming/JavaSqlNetworkWordCount.java    |  2 +-
 examples/src/main/python/ml/train_validation_split.py        |  2 +-
 examples/src/main/python/sql/arrow.py                        |  4 ++--
 .../main/python/streaming/recoverable_network_wordcount.py   |  2 +-
 examples/src/main/python/streaming/sql_network_wordcount.py  |  2 +-
 .../org/apache/spark/examples/streaming/CustomReceiver.scala |  2 +-
 .../apache/spark/examples/streaming/NetworkWordCount.scala   |  2 +-
 .../examples/streaming/RecoverableNetworkWordCount.scala     |  2 +-
 .../spark/examples/streaming/SqlNetworkWordCount.scala       |  2 +-
 .../spark/examples/streaming/StatefulNetworkWordCount.scala  |  2 +-
 .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala   |  2 +-
 .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala      |  4 ++--
 .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala      | 12 ++--
 .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala   |  4 ++--
 .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala |  4 ++--
 .../org/apache/spark/streaming/kafka010/KafkaRDDSuite.scala  |  2 +-
 .../spark/examples/streaming/JavaKinesisWordCountASL.java    |  2 +-
 .../main/python/examples/streaming/kinesis_wordcount_asl.py  |  2 +-
 .../spark/examples/streaming/KinesisWordCountASL.scala       |  6 +++---
 .../spark/streaming/kinesis/KinesisUtilsPythonHelper.scala   |  2 +-
 .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala    |  6 +++---
 24 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
index 47692ec..f84a197 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java
@@ -67,7 +67,7 @@ public class JavaCustomReceiver extends Receiver {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000));

     // Create an input stream with the custom receiver on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
     JavaReceiverInputDStream lines = ssc.receiverStream(
       new JavaCustomReceiver(args[0], Integer.parseInt(args[1])));
     JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator());

diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
index b217672..d56134b 100644
--- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
+++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
@@ -57,7 +57,7 @@ public final class JavaNetworkWordCount {
     JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1));

     // Create a JavaReceiverInputDStream on target ip:port and count the
-    // words in input stream of \n delimited text (eg. generated by 'nc')
+    // words in input stream of \n delimited text (e.g. generated by 'nc')
     // Note that no duplication in stor
[spark] branch branch-3.0 updated (2eadedc -> 5ee76e6)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git.

from 2eadedc  [SPARK-33408][K8S][R][3.0] Use R 3.6.3 in K8s R image
 add 5ee76e6  [MINOR][DOC] spark.executor.memoryOverhead is not cluster-mode only

No new revisions were added by this update.

Summary of changes:
 core/src/main/scala/org/apache/spark/internal/config/package.scala | 4 ++--
 docs/configuration.md                                              | 7 +++
 2 files changed, 5 insertions(+), 6 deletions(-)
[spark] branch master updated (6d31dae -> 4335af0)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d31dae [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound add 4335af0 [MINOR][DOC] spark.executor.memoryOverhead is not cluster-mode only No new revisions were added by this update. Summary of changes: core/src/main/scala/org/apache/spark/internal/config/package.scala | 4 ++-- docs/configuration.md | 7 +++ 2 files changed, 5 insertions(+), 6 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (6d5d030 -> 4b36797)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 6d5d030 [SPARK-33414][SQL] Migrate SHOW CREATE TABLE command to use UnresolvedTableOrView to resolve the identifier add 4b36797 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark No new revisions were added by this update. Summary of changes: .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new 577dbb9 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark 577dbb9 is described below commit 577dbb96835f13f4cd92ea4caab9e6dece00be50 Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR fixes the behaviour of query filters in `TPCDSQueryBenchmark`. The `--query-filter` option selects which TPCDS queries to run, e.g., `--query-filter q6,q8,q13`, but the current master handles this option incorrectly. For example, if we pass `--query-filter q6` to run only TPCDS q6, `TPCDSQueryBenchmark` runs both `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So there is currently no way to run only TPCDS q6. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index 7bbf079..43bc7c1 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -98,11 +98,16 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -125,6 +130,7 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -132,8 +138,9 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, 
benchmarkArgs) +val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -143,6 +150,6 @@ object TPCDSQueryBenchmark extends SqlBasedBenchmark { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
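The patched `filterQueries` logic above can be sketched as a small self-contained example (re-expressed in Python purely for illustration; the function and parameter names mirror the Scala patch): when a name suffix is given, a query is selected only if the filter contains its suffixed name, so `--query-filter q6` no longer also selects `q6-v2.7`.

```python
def filter_queries(orig_queries, query_filter, name_suffix=""):
    """Illustrative mirror of the patched Scala filterQueries.

    An empty filter selects everything; a non-empty filter matches either
    the plain query name or, when a suffix is given, the suffixed name.
    """
    if not query_filter:
        return list(orig_queries)
    if name_suffix:
        return [q for q in orig_queries if f"{q}{name_suffix}" in query_filter]
    return [q for q in orig_queries if q in query_filter]

# v1.4 queries match plain names; v2.7 queries only match suffixed names.
print(filter_queries(["q5", "q6", "q8"], {"q6"}))              # ['q6']
print(filter_queries(["q6", "q10a"], {"q6"}, "-v2.7"))         # []
print(filter_queries(["q6", "q10a"], {"q6-v2.7"}, "-v2.7"))    # ['q6']
```

With a filter like `q6,q6-v2.7`, both variants would still run, so the old behaviour remains reachable by naming each variant explicitly.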
[spark] branch branch-2.4 updated: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new fece4a3 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark fece4a3 is described below commit fece4a3a36e23c7b99d6cb64e0c4484c9e17235f Author: Takeshi Yamamuro AuthorDate: Wed Nov 11 15:24:05 2020 +0900 [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark ### What changes were proposed in this pull request? This PR fixes the behaviour of query filters in `TPCDSQueryBenchmark`. The `--query-filter` option selects which TPCDS queries to run, e.g., `--query-filter q6,q8,q13`, but the current master handles this option incorrectly. For example, if we pass `--query-filter q6` to run only TPCDS q6, `TPCDSQueryBenchmark` runs both `q6` and `q6-v2.7` because the `filterQueries` method does not respect the name suffix. So there is currently no way to run only TPCDS q6. ### Why are the changes needed? Bugfix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manually checked. Closes #30324 from maropu/FilterBugInTPCDSQueryBenchmark. 
Authored-by: Takeshi Yamamuro Signed-off-by: Takeshi Yamamuro (cherry picked from commit 4b367976a877adb981f65d546e1522fdf30d0731) Signed-off-by: Takeshi Yamamuro --- .../execution/benchmark/TPCDSQueryBenchmark.scala | 21 ++--- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala index fccee97..1f8b057 100644 --- a/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmark.scala @@ -90,11 +90,16 @@ object TPCDSQueryBenchmark extends Logging { } } - def filterQueries( + private def filterQueries( origQueries: Seq[String], - args: TPCDSQueryBenchmarkArguments): Seq[String] = { -if (args.queryFilter.nonEmpty) { - origQueries.filter(args.queryFilter.contains) + queryFilter: Set[String], + nameSuffix: String = ""): Seq[String] = { +if (queryFilter.nonEmpty) { + if (nameSuffix.nonEmpty) { +origQueries.filter { name => queryFilter.contains(s"$name$nameSuffix") } + } else { +origQueries.filter(queryFilter.contains) + } } else { origQueries } @@ -117,6 +122,7 @@ object TPCDSQueryBenchmark extends Logging { "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") // This list only includes TPC-DS v2.7 queries that are different from v1.4 ones +val nameSuffixForQueriesV2_7 = "-v2.7" val tpcdsQueriesV2_7 = Seq( "q5a", "q6", "q10a", "q11", "q12", "q14", "q14a", "q18a", "q20", "q22", "q22a", "q24", "q27a", "q34", "q35", "q35a", "q36a", "q47", "q49", @@ -124,8 +130,9 @@ object TPCDSQueryBenchmark extends Logging { "q80a", "q86a", "q98") // If `--query-filter` defined, filters the queries that this option selects -val queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs) -val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs) +val 
queriesV1_4ToRun = filterQueries(tpcdsQueries, benchmarkArgs.queryFilter) +val queriesV2_7ToRun = filterQueries(tpcdsQueriesV2_7, benchmarkArgs.queryFilter, + nameSuffix = nameSuffixForQueriesV2_7) if ((queriesV1_4ToRun ++ queriesV2_7ToRun).isEmpty) { throw new RuntimeException( @@ -135,6 +142,6 @@ object TPCDSQueryBenchmark extends Logging { val tableSizes = setupTables(benchmarkArgs.dataLocation) runTpcdsQueries(queryLocation = "tpcds", queries = queriesV1_4ToRun, tableSizes) runTpcdsQueries(queryLocation = "tpcds-v2.7.0", queries = queriesV2_7ToRun, tableSizes, - nameSuffix = "-v2.7") + nameSuffix = nameSuffixForQueriesV2_7) } } - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-2.4 updated: [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 1e177c7 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples 1e177c7 is described below commit 1e177c73a26967b1effc1c8ba59c2fd57b52951f Author: Josh Soref AuthorDate: Thu Nov 12 21:02:27 2020 +0900 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples ### What changes were proposed in this pull request? This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710 NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356 Backport of #30326 ### Why are the changes needed? Misspelled words make it harder to read / understand content. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No testing was performed Closes #30343 from jsoref/branch-2.4-30326. 
Authored-by: Josh Soref Signed-off-by: Takeshi Yamamuro --- .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +- .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +- .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +- .../spark/examples/streaming/JavaSqlNetworkWordCount.java| 2 +- examples/src/main/python/ml/train_validation_split.py| 2 +- .../main/python/streaming/recoverable_network_wordcount.py | 2 +- examples/src/main/python/streaming/sql_network_wordcount.py | 2 +- .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +- .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +- .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +- .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +- .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +- .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +- .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++-- .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++-- .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++-- .../spark/examples/streaming/JavaKinesisWordCountASL.java| 2 +- .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +- .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++--- .../org/apache/spark/streaming/kinesis/KinesisUtils.scala| 2 +- .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala| 6 +++--- 22 files changed, 34 insertions(+), 34 deletions(-) diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java index 47692ec..f84a197 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java @@ -67,7 +67,7 @@ public class JavaCustomReceiver 
extends Receiver { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000)); // Create an input stream with the custom receiver on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') JavaReceiverInputDStream lines = ssc.receiverStream( new JavaCustomReceiver(args[0], Integer.parseInt(args[1]))); JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator()); diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java index b217672..d56134b 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java @@ -57,7 +57,7 @@ public final class JavaNetworkWordCount { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1)); // Create a JavaReceiverInputDStream on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') // Note that no duplication in storage level only for running locally. // Replication necessary in distributed scenario for fault tolerance. JavaReceiverInputDStream li
[spark] branch branch-2.4 updated: [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-2.4 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-2.4 by this push: new 1e177c7 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples 1e177c7 is described below commit 1e177c73a26967b1effc1c8ba59c2fd57b52951f Author: Josh Soref AuthorDate: Thu Nov 12 21:02:27 2020 +0900 [MINOR][GRAPHX][2.4] Correct typos in the sub-modules: graphx, external, and examples ### What changes were proposed in this pull request? This PR intends to fix typos in the sub-modules: graphx, external, and examples. Split per holdenk https://github.com/apache/spark/pull/30323#issuecomment-725159710 NOTE: The misspellings have been reported at https://github.com/jsoref/spark/commit/706a726f87a0bbf5e31467fae9015218773db85b#commitcomment-44064356 Backport of #30326 ### Why are the changes needed? Misspelled words make it harder to read / understand content. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? No testing was performed Closes #30343 from jsoref/branch-2.4-30326. 
Authored-by: Josh Soref Signed-off-by: Takeshi Yamamuro --- .../apache/spark/examples/streaming/JavaCustomReceiver.java | 2 +- .../spark/examples/streaming/JavaNetworkWordCount.java | 2 +- .../examples/streaming/JavaRecoverableNetworkWordCount.java | 2 +- .../spark/examples/streaming/JavaSqlNetworkWordCount.java| 2 +- examples/src/main/python/ml/train_validation_split.py| 2 +- .../main/python/streaming/recoverable_network_wordcount.py | 2 +- examples/src/main/python/streaming/sql_network_wordcount.py | 2 +- .../org/apache/spark/examples/streaming/CustomReceiver.scala | 2 +- .../apache/spark/examples/streaming/NetworkWordCount.scala | 2 +- .../examples/streaming/RecoverableNetworkWordCount.scala | 2 +- .../spark/examples/streaming/SqlNetworkWordCount.scala | 2 +- .../spark/examples/streaming/StatefulNetworkWordCount.scala | 2 +- .../apache/spark/sql/jdbc/DockerJDBCIntegrationSuite.scala | 2 +- .../spark/sql/kafka010/KafkaContinuousSourceSuite.scala | 4 ++-- .../spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala | 12 ++-- .../org/apache/spark/sql/kafka010/KafkaRelationSuite.scala | 4 ++-- .../scala/org/apache/spark/sql/kafka010/KafkaTestUtils.scala | 4 ++-- .../spark/examples/streaming/JavaKinesisWordCountASL.java| 2 +- .../main/python/examples/streaming/kinesis_wordcount_asl.py | 2 +- .../spark/examples/streaming/KinesisWordCountASL.scala | 6 +++--- .../org/apache/spark/streaming/kinesis/KinesisUtils.scala| 2 +- .../scala/org/apache/spark/graphx/lib/PageRankSuite.scala| 6 +++--- 22 files changed, 34 insertions(+), 34 deletions(-) diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java index 47692ec..f84a197 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaCustomReceiver.java @@ -67,7 +67,7 @@ public class JavaCustomReceiver 
extends Receiver { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, new Duration(1000)); // Create an input stream with the custom receiver on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') JavaReceiverInputDStream lines = ssc.receiverStream( new JavaCustomReceiver(args[0], Integer.parseInt(args[1]))); JavaDStream words = lines.flatMap(x -> Arrays.asList(SPACE.split(x)).iterator()); diff --git a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java index b217672..d56134b 100644 --- a/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java +++ b/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java @@ -57,7 +57,7 @@ public final class JavaNetworkWordCount { JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(1)); // Create a JavaReceiverInputDStream on target ip:port and count the -// words in input stream of \n delimited text (eg. generated by 'nc') +// words in input stream of \n delimited text (e.g. generated by 'nc') // Note that no duplication in storage level only for running locally. // Replication necessary in distributed scenario for fault tolerance. JavaReceiverInputDStream li
[spark] branch master updated (a744fea -> 2639ad4)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from a744fea [SPARK-33267][SQL] Fix NPE issue on 'In' filter when one of values contains null add 2639ad4 [SPARK-33272][SQL] prune the attributes mapping in QueryPlan.transformUpWithNewOutput No new revisions were added by this update. Summary of changes: .../org/apache/spark/sql/catalyst/plans/QueryPlan.scala | 17 +++-- .../sql/catalyst/plans/logical/AnalysisHelper.scala | 2 +- 2 files changed, 16 insertions(+), 3 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated (47a6568 -> dcb08204)
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 47a6568 [SPARK-33189][PYTHON][TESTS] Add env var to tests for legacy nested timestamps in pyarrow add dcb08204 [SPARK-32785][SQL][DOCS][FOLLOWUP] Update migaration guide for incomplete interval literals No new revisions were added by this update. Summary of changes: docs/sql-migration-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.0 updated: [SPARK-32785][SQL][DOCS][FOLLOWUP][3.0] Update migration guide for incomplete interval literals
This is an automated email from the ASF dual-hosted git repository. yamamuro pushed a commit to branch branch-3.0 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.0 by this push: new a36b3c4 [SPARK-32785][SQL][DOCS][FOLLOWUP][3.0] Update migration guide for incomplete interval literals a36b3c4 is described below commit a36b3c438607f922d944ee8e773eefbe76aae7fb Author: Kent Yao AuthorDate: Wed Oct 21 17:31:19 2020 +0900 [SPARK-32785][SQL][DOCS][FOLLOWUP][3.0] Update migration guide for incomplete interval literals ### What changes were proposed in this pull request? Address comments https://github.com/apache/spark/pull/29635#discussion_r507241899 to improve migration guide ### Why are the changes needed? improve migration guide ### Does this PR introduce _any_ user-facing change? NO,only doc update ### How was this patch tested? passing GitHub action Closes #30117 from yaooqinn/SPARK-32785-F30. Authored-by: Kent Yao Signed-off-by: Takeshi Yamamuro --- docs/sql-migration-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md index e64c037..85d7073 100644 --- a/docs/sql-migration-guide.md +++ b/docs/sql-migration-guide.md @@ -24,7 +24,7 @@ license: | ## Upgrading from Spark SQL 3.0.1 to 3.0.2 - - In Spark 3.0.2, incomplete interval literals, e.g. `INTERVAL '1'`, `INTERVAL '1 DAY 2'` will fail with IllegalArgumentException. In Spark 3.0.1 and earlier, they result `NULL`s. + - In Spark 3.0.2, `IllegalArgumentException` is returned for the incomplete interval literals, e.g. `INTERVAL '1'`, `INTERVAL '1 DAY 2'`, which are invalid. In Spark 3.0.1, these literals result in `NULL`s. ## Upgrading from Spark SQL 3.0 to 3.0.1 - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch branch-3.1 updated: [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new f702a95  [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
f702a95 is described below

commit f702a95e81e4b3318dec701d5a8eb2898bbd8ff6
Author: fwang12
AuthorDate: Tue Jan 5 15:55:30 2021 +0900

    [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql

    ### What changes were proposed in this pull request?
    Currently, spark-sql cannot parse SQL statements that contain bracketed comments. For the statements:
    ```
    /* SELECT 'test'; */
    SELECT 'test';
    ```
    the input is split into two statements:
    the first one, `/* SELECT 'test'`, and
    the second one, `*/ SELECT 'test'`.
    An exception is then thrown because the first statement is illegal. In this PR, we ignore the content inside bracketed comments while splitting the SQL statements. We also ignore comments without any content.

    ### Why are the changes needed?
    spark-sql may split statements inside bracketed comments, which is incorrect.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Added UT.

    Closes #29982 from turboFei/SPARK-33110.
Lead-authored-by: fwang12
Co-authored-by: turbofei
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit a071826f72cd717a58bf37b877f805490f7a147f)
Signed-off-by: Takeshi Yamamuro
---
 .../sql/hive/thriftserver/SparkSQLCLIDriver.scala  | 40 +-
 .../spark/sql/hive/thriftserver/CliSuite.scala     | 23 +
 2 files changed, 55 insertions(+), 8 deletions(-)

diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
index f2fd373..9155eac 100644
--- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
+++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
@@ -522,14 +522,22 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging {
   // Note: [SPARK-31595] if there is a `'` in a double quoted string, or a `"` in a single quoted
   // string, the origin implementation from Hive will not drop the trailing semicolon as expected,
   // hence we refined this function a little bit.
+  // Note: [SPARK-33100] Ignore a semicolon inside a bracketed comment in spark-sql.
   private def splitSemiColon(line: String): JList[String] = {
     var insideSingleQuote = false
     var insideDoubleQuote = false
-    var insideComment = false
+    var insideSimpleComment = false
+    var bracketedCommentLevel = 0
     var escape = false
     var beginIndex = 0
+    var includingStatement = false
     val ret = new JArrayList[String]
+
+    def insideBracketedComment: Boolean = bracketedCommentLevel > 0
+
+    def insideComment: Boolean = insideSimpleComment || insideBracketedComment
+
+    def statementBegin(index: Int): Boolean = includingStatement || (!insideComment &&
+      index > beginIndex && !s"${line.charAt(index)}".trim.isEmpty)
+
     for (index <- 0 until line.length) {
       if (line.charAt(index) == '\'' && !insideComment) {
         // take a look to see if it is escaped
@@ -553,21 +561,33 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging {
           // Sample query: select "quoted value --"
           //^^ avoids starting a comment if it's inside quotes.
         } else if (hasNext && line.charAt(index + 1) == '-') {
-          // ignore quotes and ;
-          insideComment = true
+          // ignore quotes and ; in simple comment
+          insideSimpleComment = true
         }
       } else if (line.charAt(index) == ';') {
         if (insideSingleQuote || insideDoubleQuote || insideComment) {
           // do not split
         } else {
-          // split, do not include ; itself
-          ret.add(line.substring(beginIndex, index))
+          if (includingStatement) {
+            // split, do not include ; itself
+            ret.add(line.substring(beginIndex, index))
+          }
           beginIndex = index + 1
+          includingStatement = false
         }
       } else if (line.charAt(index) == '\n') {
-        // with a new line the inline comment should end.
+        // with a new line the inline simple comment should end.
         if (!escape) {
-          insideComment = false
+          insideSimpleComment = false
         }
+      } else if (line.charAt(index) == '/' && !insideSimpleComment) {
+
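The splitting rule this patch implements — a semicolon only terminates a statement when it sits outside quotes and outside both simple (`--`) and bracketed (`/* */`) comments, with bracketed comments allowed to nest — can be sketched with a small Python model. This is a simplified, hypothetical sketch of the rule, not the Scala implementation above; escape handling and Hive-specific details are omitted:

```python
def split_semicolon(line: str) -> list[str]:
    """Split a SQL string on ';', ignoring ';' inside quotes and comments."""
    statements = []
    in_single = in_double = in_simple = False  # quote state / '--' comment
    bracket_level = 0                          # /* ... */ nesting depth
    begin = 0
    i = 0
    while i < len(line):
        c = line[i]
        in_comment = in_simple or bracket_level > 0
        if c == "'" and not in_comment and not in_double:
            in_single = not in_single
        elif c == '"' and not in_comment and not in_single:
            in_double = not in_double
        elif not (in_single or in_double):
            two = line[i:i + 2]
            if two == '--' and bracket_level == 0:
                in_simple = True            # simple comment runs to newline
            elif two == '/*' and not in_simple:
                bracket_level += 1          # bracketed comments may nest
                i += 1
            elif two == '*/' and bracket_level > 0:
                bracket_level -= 1
                i += 1
            elif c == '\n':
                in_simple = False           # newline ends a simple comment
            elif c == ';' and not in_comment:
                piece = line[begin:i]
                if piece.strip():           # skip empty pieces
                    statements.append(piece)
                begin = i + 1
        i += 1
    if line[begin:].strip():
        statements.append(line[begin:])
    return statements
```

With this rule, `/* SELECT 'test'; */ SELECT 'test';` yields a single statement instead of being split at the semicolon inside the comment.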
[spark] branch master updated (a7d3fcd -> a071826)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from a7d3fcd  [SPARK-34000][CORE] Fix stageAttemptToNumSpeculativeTasks java.util.NoSuchElementException
  add a071826  [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql

No new revisions were added by this update.

Summary of changes:
 .../sql/hive/thriftserver/SparkSQLCLIDriver.scala  | 40 +-
 .../spark/sql/hive/thriftserver/CliSuite.scala     | 23 +
 2 files changed, 55 insertions(+), 8 deletions(-)
[spark] branch master updated (a071826 -> f252a93)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from a071826  [SPARK-33100][SQL] Ignore a semicolon inside a bracketed comment in spark-sql
  add f252a93  [SPARK-33935][SQL] Fix CBO cost function

No new revisions were added by this update.

Summary of changes:
 .../catalyst/optimizer/CostBasedJoinReorder.scala  |  13 +-
 .../optimizer/joinReorder/JoinReorderSuite.scala   |  15 +
 .../StarJoinCostBasedReorderSuite.scala            |   8 +-
 .../approved-plans-v1_4/q13.sf100/explain.txt      | 132 ++---
 .../approved-plans-v1_4/q13.sf100/simplified.txt   |  34 +-
 .../approved-plans-v1_4/q17.sf100/explain.txt      | 194 +++
 .../approved-plans-v1_4/q17.sf100/simplified.txt   | 130 ++---
 .../approved-plans-v1_4/q18.sf100/explain.txt      | 158 +++---
 .../approved-plans-v1_4/q18.sf100/simplified.txt   |  50 +-
 .../approved-plans-v1_4/q19.sf100/explain.txt      | 368 ++---
 .../approved-plans-v1_4/q19.sf100/simplified.txt   | 116 ++---
 .../approved-plans-v1_4/q24a.sf100/explain.txt     | 118 ++---
 .../approved-plans-v1_4/q24a.sf100/simplified.txt  |  34 +-
 .../approved-plans-v1_4/q24b.sf100/explain.txt     | 118 ++---
 .../approved-plans-v1_4/q24b.sf100/simplified.txt  |  34 +-
 .../approved-plans-v1_4/q25.sf100/explain.txt      | 194 +++
 .../approved-plans-v1_4/q25.sf100/simplified.txt   | 130 ++---
 .../approved-plans-v1_4/q33.sf100/explain.txt      | 264 +-
 .../approved-plans-v1_4/q33.sf100/simplified.txt   |  58 +--
 .../approved-plans-v1_4/q52.sf100/explain.txt      | 138 ++---
 .../approved-plans-v1_4/q52.sf100/simplified.txt   |  26 +-
 .../approved-plans-v1_4/q55.sf100/explain.txt      | 134 ++---
 .../approved-plans-v1_4/q55.sf100/simplified.txt   |  26 +-
 .../approved-plans-v1_4/q72.sf100/explain.txt      | 264 +-
 .../approved-plans-v1_4/q72.sf100/simplified.txt   | 150 +++---
 .../approved-plans-v1_4/q81.sf100/explain.txt      | 570 ++---
 .../approved-plans-v1_4/q81.sf100/simplified.txt   | 142 ++---
 .../approved-plans-v1_4/q91.sf100/explain.txt      | 306
 .../approved-plans-v1_4/q91.sf100/simplified.txt   |  62 +--
 .../approved-plans-v2_7/q18a.sf100/explain.txt     | 306 +--
 .../approved-plans-v2_7/q18a.sf100/simplified.txt  |  54 +-
 .../approved-plans-v2_7/q72.sf100/explain.txt      | 264 +-
 .../approved-plans-v2_7/q72.sf100/simplified.txt   | 150 +++---
 33 files changed, 2386 insertions(+), 2374 deletions(-)
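For context on what the CBO cost function compares: Spark's SQL tuning documentation describes the join-reorder plan cost as `weight * cardinality + (1 - weight) * size`, with the weight taken from `spark.sql.cbo.joinReorder.card.weight` (default 0.7). A minimal Python sketch of that comparison follows — the plan values are hypothetical and this is an illustration of the documented formula, not the patched Scala code in `CostBasedJoinReorder`:

```python
CARD_WEIGHT = 0.7  # spark.sql.cbo.joinReorder.card.weight (default 0.7)

def cost(card: float, size: float) -> float:
    # Documented combined cost: weight * cardinality + (1 - weight) * size.
    return CARD_WEIGHT * card + (1.0 - CARD_WEIGHT) * size

def better_than(plan_a: tuple, plan_b: tuple) -> bool:
    """True if plan_a (card, size) is strictly cheaper than plan_b."""
    return cost(*plan_a) < cost(*plan_b)
```

A plan with much lower cardinality at equal size wins the comparison, which is why the TPC-DS `explain.txt` golden files above change when the cost comparison is fixed.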
[spark] branch branch-3.1 updated: [SPARK-34012][SQL] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-3.1 by this push:
     new d729158  [SPARK-34012][SQL] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide
d729158 is described below

commit d7291582ebaf815f89474c76d8a35b49172b1ecf
Author: angerszhu
AuthorDate: Wed Jan 6 08:48:24 2021 +0900

    [SPARK-34012][SQL] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide

    ### What changes were proposed in this pull request?
    In https://github.com/apache/spark/pull/22696 we made HAVING without GROUP BY mean a global aggregate. However, since HAVING was previously treated as a Filter, that change caused many analysis errors. After https://github.com/apache/spark/pull/28294 we use `UnresolvedHaving` instead of `Filter` to solve that problem, but it broke the original behavior of treating `SELECT 1 FROM range(10) HAVING true` as `SELECT 1 FROM range(10) WHERE true`. This PR fixes the issue and adds a UT.

    ### Why are the changes needed?
    Keep the behavior consistent with the migration guide.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Added UT.

    Closes #31039 from AngersZh/SPARK-25780-Follow-up.
Authored-by: angerszhu
Signed-off-by: Takeshi Yamamuro
(cherry picked from commit e279ed304475a6d5a9fbf739fe9ed32ef58171cb)
Signed-off-by: Takeshi Yamamuro
---
 .../spark/sql/catalyst/parser/AstBuilder.scala     |  6 ++-
 .../test/resources/sql-tests/inputs/group-by.sql   | 10
 .../resources/sql-tests/results/group-by.sql.out   | 63 +-
 3 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
index a22383c..9d74ac9 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
@@ -714,7 +714,11 @@ class AstBuilder extends SqlBaseBaseVisitor[AnyRef] with SQLConfHelper with Logg
     val withProject = if (aggregationClause == null && havingClause != null) {
       if (conf.getConf(SQLConf.LEGACY_HAVING_WITHOUT_GROUP_BY_AS_WHERE)) {
         // If the legacy conf is set, treat HAVING without GROUP BY as WHERE.
-        withHavingClause(havingClause, createProject())
+        val predicate = expression(havingClause.booleanExpression) match {
+          case p: Predicate => p
+          case e => Cast(e, BooleanType)
+        }
+        Filter(predicate, createProject())
       } else {
         // According to SQL standard, HAVING without GROUP BY means global aggregate.
         withHavingClause(havingClause, Aggregate(Nil, namedExpressions, withFilter))

diff --git a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql
index 81e2204..6ee1014 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql
@@ -86,6 +86,16 @@ SELECT 1 FROM range(10) HAVING MAX(id) > 0;
 
 SELECT id FROM range(10) HAVING id > 0;
 
+SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true;
+
+SELECT 1 FROM range(10) HAVING true;
+
+SELECT 1 FROM range(10) HAVING MAX(id) > 0;
+
+SELECT id FROM range(10) HAVING id > 0;
+
+SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=false;
+
 -- Test data
 CREATE OR REPLACE TEMPORARY VIEW test_agg AS SELECT * FROM VALUES
   (1, true), (1, false),

diff --git a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out
index 75bda87..cc07cd6 100644
--- a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out
@@ -1,5 +1,5 @@
 -- Automatically generated by SQLQueryTestSuite
--- Number of queries: 57
+-- Number of queries: 62
@@ -278,6 +278,67 @@ grouping expressions sequence is empty, and '`id`' is not an aggregate function.
 
 
 -- !query
+SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true
+-- !query schema
+struct
+-- !query output
+spark.sql.legacy.parser.havingWithoutGroupByAsWhere	true
+
+
+-- !query
+SELECT 1 FROM range(10) HAVING true
+-- !query schema
+struct<1:int>
+-- !query output
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+
+
+-- !query
+SELECT 1 FROM range(10) HAVING MAX(id) > 0
+-- !query schema
+struct<>
+-- !query output
+org.apache.spark.sql.AnalysisException
+
+Aggregate/Window/Generate expressions ar
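The behavioral difference that the legacy conf controls can be illustrated with a tiny, self-contained Python model. This only models the semantics visible in the golden-file results above — the names here are hypothetical, and the real change lives in `AstBuilder`: with the conf on, `SELECT 1 FROM range(10) HAVING true` filters rows like a WHERE clause; with it off, HAVING without GROUP BY implies a global aggregate that collapses the input to one row.

```python
rows = list(range(10))  # stands in for range(10)

def select_one_having_true(legacy_conf: bool) -> list[int]:
    """Model of `SELECT 1 FROM range(10) HAVING true` under both modes."""
    if legacy_conf:
        # spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true:
        # HAVING is planned as Filter(predicate, Project), i.e. a per-row
        # WHERE -- the always-true predicate keeps all 10 rows.
        return [1 for _ in rows if True]
    # Default: HAVING without GROUP BY means a global aggregate,
    # so the whole input collapses into a single row.
    return [1]
```

This mirrors the ten `1` rows in the expected `group-by.sql.out` output when the conf is set; aggregate predicates such as `MAX(id) > 0` still raise an `AnalysisException` in the legacy mode, as the golden file shows.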
[spark] branch master updated (171db85 -> e279ed3)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 171db85  [SPARK-33874][K8S][FOLLOWUP] Handle long lived sidecars - clean up logging
  add e279ed3  [SPARK-34012][SQL] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide

No new revisions were added by this update.

Summary of changes:
 .../spark/sql/catalyst/parser/AstBuilder.scala     |  6 ++-
 .../test/resources/sql-tests/inputs/group-by.sql   | 10
 .../resources/sql-tests/results/group-by.sql.out   | 63 +-
 3 files changed, 77 insertions(+), 2 deletions(-)
[spark] branch branch-2.4 updated (45e19bb -> 3e6a6b7)
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a change to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git.

 from 45e19bb  [SPARK-33911][SQL][DOCS][2.4] Update the SQL migration guide about changes in `HiveClientImpl`
  add 3e6a6b7  [SPARK-33935][SQL][2.4] Fix CBO cost function

No new revisions were added by this update.

Summary of changes:
 .../sql/catalyst/optimizer/CostBasedJoinReorder.scala   | 13 +
 .../spark/sql/catalyst/optimizer/JoinReorderSuite.scala | 15 +++
 .../optimizer/StarJoinCostBasedReorderSuite.scala       |  8
 3 files changed, 24 insertions(+), 12 deletions(-)
[spark] branch branch-2.4 updated: [SPARK-34012][SQL][2.4] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide
This is an automated email from the ASF dual-hosted git repository.

yamamuro pushed a commit to branch branch-2.4
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/branch-2.4 by this push:
     new d442146  [SPARK-34012][SQL][2.4] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide
d442146 is described below

commit d442146964a981dd7f074c4954f7fed2752124e8
Author: angerszhu
AuthorDate: Wed Jan 6 20:54:47 2021 +0900

    [SPARK-34012][SQL][2.4] Keep behavior consistent when conf `spark.sql.legacy.parser.havingWithoutGroupByAsWhere` is true with migration guide

    ### What changes were proposed in this pull request?
    In https://github.com/apache/spark/pull/22696 we made HAVING without GROUP BY mean a global aggregate. However, since HAVING was previously treated as a Filter, that change caused many analysis errors. After https://github.com/apache/spark/pull/28294 we use `UnresolvedHaving` instead of `Filter` to solve that problem, but it broke the original behavior of treating `SELECT 1 FROM range(10) HAVING true` as `SELECT 1 FROM range(10) WHERE true`. This PR fixes the issue and adds a UT.

    NOTE: This backport comes from #31039

    ### Why are the changes needed?
    Keep the behavior consistent with the migration guide.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Added UT.

    Closes #31050 from AngersZh/SPARK-34012-2.4.
Authored-by: angerszhu
Signed-off-by: Takeshi Yamamuro
---
 .../spark/sql/catalyst/parser/AstBuilder.scala     |  6 ++-
 .../test/resources/sql-tests/inputs/group-by.sql   | 10
 .../resources/sql-tests/results/group-by.sql.out   | 60 +-
 3 files changed, 74 insertions(+), 2 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
index 90e7d1c..4c4e4f1 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
@@ -467,7 +467,11 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
     val withProject = if (aggregation == null && having != null) {
       if (conf.getConf(SQLConf.LEGACY_HAVING_WITHOUT_GROUP_BY_AS_WHERE)) {
         // If the legacy conf is set, treat HAVING without GROUP BY as WHERE.
-        withHaving(having, createProject())
+        val predicate = expression(having) match {
+          case p: Predicate => p
+          case e => Cast(e, BooleanType)
+        }
+        Filter(predicate, createProject())
       } else {
         // According to SQL standard, HAVING without GROUP BY means global aggregate.
         withHaving(having, Aggregate(Nil, namedExpressions, withFilter))

diff --git a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql
index 433db71..0c40a8c 100644
--- a/sql/core/src/test/resources/sql-tests/inputs/group-by.sql
+++ b/sql/core/src/test/resources/sql-tests/inputs/group-by.sql
@@ -80,3 +80,13 @@ SELECT 1 FROM range(10) HAVING true;
 
 SELECT 1 FROM range(10) HAVING MAX(id) > 0;
 
 SELECT id FROM range(10) HAVING id > 0;
+
+SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true;
+
+SELECT 1 FROM range(10) HAVING true;
+
+SELECT 1 FROM range(10) HAVING MAX(id) > 0;
+
+SELECT id FROM range(10) HAVING id > 0;
+
+SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=false;

diff --git a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out
index f9d1ee8..d23a58a 100644
--- a/sql/core/src/test/resources/sql-tests/results/group-by.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/group-by.sql.out
@@ -1,5 +1,5 @@
 -- Automatically generated by SQLQueryTestSuite
--- Number of queries: 30
+-- Number of queries: 35
 
 
 -- !query 0
@@ -275,3 +275,61 @@ struct<>
 -- !query 29 output
 org.apache.spark.sql.AnalysisException
 grouping expressions sequence is empty, and '`id`' is not an aggregate function. Wrap '()' in windowing function(s) or wrap '`id`' in first() (or first_value) if you don't care which value you get.;
+
+
+-- !query 30
+SET spark.sql.legacy.parser.havingWithoutGroupByAsWhere=true
+-- !query 30 schema
+struct
+-- !query 30 output
+spark.sql.legacy.parser.havingWithoutGroupByAsWhere	true
+
+
+-- !query 31
+SELECT 1 FROM range(10) HAVING true
+-- !query 31 schema
+struct<1:int>
+-- !query 31 output
+1
+1
+1
+1
+1
+1
+1
+1
+1
+1
+
+
+-- !query 32
+SELECT 1 FROM range(10) HAVING MAX(id) > 0
+-- !query 32 schema
+struct<>
+-- !query 32 output
+java.lang