[GitHub] [spark-website] gengliangwang commented on pull request #400: [SPARK-39512] Document docker image release steps
gengliangwang commented on PR #400:
URL: https://github.com/apache/spark-website/pull/400#issuecomment-1166192744

FYI, I just published Docker images for the Spark 3.3 release:
https://hub.docker.com/r/apache/spark
https://hub.docker.com/r/apache/spark-py
https://hub.docker.com/r/apache/spark-r

I will send an email to the dev/user lists if no issues are found over the weekend. cc @holdenk

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
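For anyone following along, a smoke test of the published images might look like the sketch below. The `v3.3.0` tag and the in-container `spark-submit` path are assumptions, not confirmed in this thread, so the actual `docker` commands are left commented out:

```shell
#!/bin/sh
# Hypothetical smoke test for the published Spark 3.3 images.
# Real docker commands are commented out (they need a Docker daemon and
# network access); tag names and paths are illustrative only.
SPARK_VERSION=3.3.0
REPOS="apache/spark apache/spark-py apache/spark-r"

for repo in $REPOS; do
  # docker pull "${repo}:v${SPARK_VERSION}"
  # docker run --rm "${repo}:v${SPARK_VERSION}" /opt/spark/bin/spark-submit --version
  echo "would verify ${repo}:v${SPARK_VERSION}"
done
```

This only enumerates the three repositories mentioned above; a real check would also compare image digests against the signed release artifacts.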
[GitHub] [spark-website] gengliangwang commented on pull request #400: [SPARK-39512] Document docker image release steps
gengliangwang commented on PR #400:
URL: https://github.com/apache/spark-website/pull/400#issuecomment-1166190151

> Maybe, unlike maven repos though we don't have a staging location set up, I think we could ask ASF Infra to make us a staging location?

We can publish RC images with a different tag, e.g. v3.4.0-rc1. After the release, the images can be deleted.
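The RC-tagging idea above could be sketched as follows. The tag format follows the `v3.4.0-rc1` example from the comment; the build/push commands themselves are illustrative and commented out, since the actual release scripts are not shown in this thread:

```shell
#!/bin/sh
# Sketch of publishing vote candidates under an rc-suffixed tag, then
# retagging after a successful vote. Commands are illustrative only.
SPARK_VERSION=3.4.0
RC=1
RC_TAG="v${SPARK_VERSION}-rc${RC}"

# Build and push the RC image for the community to test during the vote:
# docker build -t "apache/spark:${RC_TAG}" .
# docker push "apache/spark:${RC_TAG}"

# After a successful vote, retag and push the final image; the RC tag
# can then be deleted on Docker Hub:
# docker tag "apache/spark:${RC_TAG}" "apache/spark:v${SPARK_VERSION}"
# docker push "apache/spark:v${SPARK_VERSION}"

echo "RC image tag: apache/spark:${RC_TAG}"
```

Note that, unlike a Maven staging repository, Docker Hub tags are mutable and publicly visible as soon as they are pushed, which is why the RC suffix matters.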
[GitHub] [spark-website] holdenk commented on pull request #400: [SPARK-39512] Document docker image release steps
holdenk commented on PR #400:
URL: https://github.com/apache/spark-website/pull/400#issuecomment-1166189409

Maybe. Unlike Maven repos, though, we don't have a staging location set up; I think we could ask ASF Infra to make us a staging location?
[GitHub] [spark-website] gengliangwang commented on pull request #400: [SPARK-39512] Document docker image release steps
gengliangwang commented on PR #400:
URL: https://github.com/apache/spark-website/pull/400#issuecomment-116616

BTW, I think we should add the docker images to the RC vote email and let the community test them as well.
[GitHub] [spark-website] gengliangwang commented on pull request #400: [SPARK-39512] Document docker image release steps
gengliangwang commented on PR #400:
URL: https://github.com/apache/spark-website/pull/400#issuecomment-1166188788

@holdenk I followed the steps and they work! I have built docker images at https://hub.docker.com/u/gengliangwang. If @MaxGekk doesn't have permission to publish them, I can do it for him this time.
[GitHub] [spark-website] gengliangwang commented on a diff in pull request #400: [SPARK-39512] Document docker image release steps
gengliangwang commented on code in PR #400:
URL: https://github.com/apache/spark-website/pull/400#discussion_r906444239

## site/sitemap.xml:

@@ -941,27 +941,27 @@
   weekly
-  https://spark.apache.org/graphx/
+  https://spark.apache.org/news/

Review Comment:
+1 @srowen. The changes to this file seem unnecessary.
[spark] branch master updated: [SPARK-39576][INFRA] Support GitHub Actions generate benchmark results using Scala 2.13
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 4bc6e19dde0 [SPARK-39576][INFRA] Support GitHub Actions generate benchmark results using Scala 2.13

4bc6e19dde0 is described below

commit 4bc6e19dde0eae5d100b7bfdfcf22e719fd59cb5
Author: yangjie01
AuthorDate: Fri Jun 24 09:09:34 2022 -0700

    [SPARK-39576][INFRA] Support GitHub Actions generate benchmark results using Scala 2.13

    ### What changes were proposed in this pull request?
    This PR lets the `benchmark` GitHub Actions workflow accept a specified Scala version, so it can produce benchmark results using Scala 2.13.

    ### Why are the changes needed?
    To help us check the microbenchmark results using Scala 2.13 and ensure they are not slower than with Scala 2.12.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    Pass GitHub Actions.

    Closes #36975 from LuciferYang/213-bench.
Authored-by: yangjie01
Signed-off-by: Dongjoon Hyun
---
 .github/workflows/benchmark.yml | 17 +++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/.github/workflows/benchmark.yml b/.github/workflows/benchmark.yml
index 91e168210fb..a322fe065b5 100644
--- a/.github/workflows/benchmark.yml
+++ b/.github/workflows/benchmark.yml
@@ -30,6 +30,10 @@ on:
       description: 'JDK version: 8, 11 or 17'
       required: true
       default: '8'
+    scala:
+      description: 'Scala version: 2.12 or 2.13'
+      required: true
+      default: '2.12'
     failfast:
       description: 'Failfast: true or false'
       required: true
@@ -53,7 +57,7 @@ jobs:
       run: echo "::set-output name=matrix::["`seq -s, 1 $SPARK_BENCHMARK_NUM_SPLITS`"]"
   benchmark:
-    name: "Run benchmarks: ${{ github.event.inputs.class }} (JDK ${{ github.event.inputs.jdk }}, ${{ matrix.split }} out of ${{ github.event.inputs.num-splits }} splits)"
+    name: "Run benchmarks: ${{ github.event.inputs.class }} (JDK ${{ github.event.inputs.jdk }}, Scala ${{ github.event.inputs.scala }}, ${{ matrix.split }} out of ${{ github.event.inputs.num-splits }} splits)"
     needs: matrix-gen
     # Ubuntu 20.04 is the latest LTS. The next LTS is 22.04.
     runs-on: ubuntu-20.04
@@ -99,7 +103,8 @@ jobs:
         java-version: ${{ github.event.inputs.jdk }}
     - name: Run benchmarks
       run: |
-        ./build/sbt -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Pspark-ganglia-lgpl test:package
+        dev/change-scala-version.sh ${{ github.event.inputs.scala }}
+        ./build/sbt -Pscala-${{ github.event.inputs.scala }} -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Pspark-ganglia-lgpl test:package
         # Make less noisy
         cp conf/log4j2.properties.template conf/log4j2.properties
         sed -i 's/rootLogger.level = info/rootLogger.level = warn/g' conf/log4j2.properties
@@ -109,13 +114,15 @@ jobs:
           --jars "`find . -name '*-SNAPSHOT-tests.jar' -o -name '*avro*-SNAPSHOT.jar' | paste -sd ',' -`" \
           "`find . -name 'spark-core*-SNAPSHOT-tests.jar'`" \
           "${{ github.event.inputs.class }}"
+        # Revert to default Scala version to clean up unnecessary git diff
+        dev/change-scala-version.sh 2.12
         # To keep the directory structure and file permissions, tar them
         # See also https://github.com/actions/upload-artifact#maintaining-file-permissions-and-case-sensitive-files
         echo "Preparing the benchmark results:"
-        tar -cvf benchmark-results-${{ github.event.inputs.jdk }}.tar `git diff --name-only` `git ls-files --others --exclude-standard`
+        tar -cvf benchmark-results-${{ github.event.inputs.jdk }}-${{ github.event.inputs.scala }}.tar `git diff --name-only` `git ls-files --others --exclude-standard`
     - name: Upload benchmark results
       uses: actions/upload-artifact@v2
       with:
-        name: benchmark-results-${{ github.event.inputs.jdk }}-${{ matrix.split }}
-        path: benchmark-results-${{ github.event.inputs.jdk }}.tar
+        name: benchmark-results-${{ github.event.inputs.jdk }}-${{ github.event.inputs.scala }}-${{ matrix.split }}
+        path: benchmark-results-${{ github.event.inputs.jdk }}-${{ github.event.inputs.scala }}.tar
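The net effect of the diff above on artifact naming can be sketched locally. The values below stand in for the `github.event.inputs.*` and `matrix.split` expressions in the workflow; this is an illustration of the naming scheme, not part of the actual CI run:

```shell
#!/bin/sh
# Mirrors the artifact-name change in the commit: the tarball and the
# uploaded artifact now embed both the JDK and the Scala version, so
# Scala 2.12 and 2.13 benchmark runs no longer collide.
JDK=8
SCALA=2.13
SPLIT=1

TARBALL="benchmark-results-${JDK}-${SCALA}.tar"
ARTIFACT="benchmark-results-${JDK}-${SCALA}-${SPLIT}"

echo "$TARBALL"
echo "$ARTIFACT"
```

With the old scheme, a 2.12 run and a 2.13 run at the same JDK would both produce `benchmark-results-8.tar`, which is exactly the collision the extra tag component avoids.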
[spark] branch master updated (1bb272de332 -> 299cdfad881)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

 from 1bb272de332 [SPARK-39453][SQL] DS V2 supports push down misc non-aggregate functions(non ANSI)
  add 299cdfad881 [SPARK-39506][SQL] Make CacheTable, isCached, UncacheTable, setCurrentCatalog, currentCatalog, listCatalogs 3l namespace compatible

No new revisions were added by this update.

Summary of changes:
 project/MimaExcludes.scala                       |  7 +-
 .../org/apache/spark/sql/catalog/Catalog.scala   | 21 ++
 .../org/apache/spark/sql/catalog/interface.scala | 19 +
 .../apache/spark/sql/internal/CatalogImpl.scala  | 54 --
 .../apache/spark/sql/internal/CatalogSuite.scala | 86 --
 5 files changed, 141 insertions(+), 46 deletions(-)
[spark] branch master updated (4ad7386eefe -> 1bb272de332)
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

 from 4ad7386eefe [SPARK-38978][SQL] DS V2 supports push down OFFSET operator
  add 1bb272de332 [SPARK-39453][SQL] DS V2 supports push down misc non-aggregate functions(non ANSI)

No new revisions were added by this update.

Summary of changes:
 .../expressions/GeneralScalarExpression.java       | 18 ++
 .../sql/connector/util/V2ExpressionSQLBuilder.java |  3 +++
 .../sql/catalyst/util/V2ExpressionBuilder.scala    | 28 ++
 .../org/apache/spark/sql/jdbc/H2Dialect.scala      |  4 ++--
 .../org/apache/spark/sql/jdbc/JDBCV2Suite.scala    | 20
 5 files changed, 71 insertions(+), 2 deletions(-)
[spark] branch master updated: [SPARK-38978][SQL] DS V2 supports push down OFFSET operator
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 4ad7386eefe [SPARK-38978][SQL] DS V2 supports push down OFFSET operator

4ad7386eefe is described below

commit 4ad7386eefe0856e500d1a11e2bb992a045ff217
Author: Jiaan Geng
AuthorDate: Fri Jun 24 17:33:07 2022 +0800

    [SPARK-38978][SQL] DS V2 supports push down OFFSET operator

    ### What changes were proposed in this pull request?
    Currently, DS V2 push-down supports `LIMIT` but not `OFFSET`. If we can push down `OFFSET` to the JDBC data source, it will give better performance.

    ### Why are the changes needed?
    Pushing down `OFFSET` could improve performance.

    ### Does this PR introduce _any_ user-facing change?
    'No'. New feature.

    ### How was this patch tested?
    New tests.

    Closes #36295 from beliefer/SPARK-38978.

    Authored-by: Jiaan Geng
    Signed-off-by: Wenchen Fan
---
 .../spark/sql/connector/read/ScanBuilder.java      |   3 +-
 .../sql/connector/read/SupportsPushDownLimit.java  |   4 +-
 ...canBuilder.java => SupportsPushDownOffset.java} |  17 +-
 .../sql/connector/read/SupportsPushDownTopN.java   |  23 +-
 .../main/scala/org/apache/spark/sql/Dataset.scala  |   2 +-
 .../spark/sql/execution/DataSourceScanExec.scala   |   9 +-
 .../execution/datasources/DataSourceStrategy.scala |   6 +-
 .../execution/datasources/jdbc/JDBCOptions.scala   |   5 +
 .../sql/execution/datasources/jdbc/JDBCRDD.scala   |  12 +-
 .../execution/datasources/jdbc/JDBCRelation.scala  |   6 +-
 .../execution/datasources/v2/PushDownUtils.scala   |  15 +-
 .../datasources/v2/PushedDownOperators.scala       |   1 +
 .../datasources/v2/V2ScanRelationPushDown.scala    |  75 -
 .../execution/datasources/v2/jdbc/JDBCScan.scala   |   5 +-
 .../datasources/v2/jdbc/JDBCScanBuilder.scala      |  20 +-
 .../org/apache/spark/sql/jdbc/JdbcDialects.scala   |   7 +
 .../org/apache/spark/sql/jdbc/JDBCV2Suite.scala    | 352 -
 17 files changed, 514 insertions(+), 48 deletions(-)

diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/ScanBuilder.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/ScanBuilder.java
index 27ee534d804..f5ce604148b 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/ScanBuilder.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/ScanBuilder.java
@@ -23,7 +23,8 @@ import org.apache.spark.annotation.Evolving;
  * An interface for building the {@link Scan}. Implementations can mixin SupportsPushDownXYZ
  * interfaces to do operator push down, and keep the operator push down result in the returned
  * {@link Scan}. When pushing down operators, the push down order is:
- * sample - filter - aggregate - limit - column pruning.
+ * sample - filter - aggregate - limit/top-n(sort + limit) - offset -
+ * column pruning.
  *
  * @since 3.0.0
  */
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownLimit.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownLimit.java
index 035154d0845..8a725cd7ed7 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownLimit.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownLimit.java
@@ -21,8 +21,8 @@ import org.apache.spark.annotation.Evolving;
 /**
  * A mix-in interface for {@link ScanBuilder}. Data sources can implement this interface to
- * push down LIMIT. Please note that the combination of LIMIT with other operations
- * such as AGGREGATE, GROUP BY, SORT BY, CLUSTER BY, DISTRIBUTE BY, etc. is NOT pushed down.
+ * push down LIMIT. We can push down LIMIT with many other operations if they follow the
+ * operator order we defined in {@link ScanBuilder}'s class doc.
  *
  * @since 3.3.0
  */
diff --git a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/ScanBuilder.java b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownOffset.java
similarity index 68%
copy from sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/ScanBuilder.java
copy to sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownOffset.java
index 27ee534d804..ffa2cad3715 100644
--- a/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/ScanBuilder.java
+++ b/sql/catalyst/src/main/java/org/apache/spark/sql/connector/read/SupportsPushDownOffset.java
@@ -20,14 +20,17 @@ package org.apache.spark.sql.connector.read;

 import org.apache.spark.annotation.Evolving;

 /**
- * An interface for building the {@link Scan}. Implementations can mixin SupportsPushDownXYZ
- * interfaces to do operator push down, and keep the operator push down result in the returned
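To make the feature above concrete: once `OFFSET` can be pushed down alongside `LIMIT`, a JDBC source can fold pagination into the query it sends to the database instead of fetching rows and discarding them on the Spark side. The sketch below only illustrates the shape of such a query; the table and column names are made up, and the exact SQL each dialect emits is not shown in this commit excerpt:

```shell
#!/bin/sh
# Illustration of the query shape after OFFSET pushdown (SPARK-38978).
# Names are hypothetical; real SQL generation is done per JDBC dialect.
LIMIT=10
OFFSET=100

PUSHED="SELECT name, salary FROM employees LIMIT ${LIMIT} OFFSET ${OFFSET}"
echo "pushed-down query: ${PUSHED}"
```

Without the pushdown, the source would have to return at least the first `OFFSET + LIMIT` matching rows and let Spark skip the prefix, which is the extra work this change avoids.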