[spark-website] branch asf-site updated: Update Spark 3.4 release window (#407)
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/spark-website.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new ad25dd72b  Update Spark 3.4 release window (#407)
ad25dd72b is described below

commit ad25dd72b599178afd2390e131869b78d877b5c4
Author: Xinrong Meng
AuthorDate: Fri Jul 22 17:37:32 2022 -0700

    Update Spark 3.4 release window (#407)
---
 site/versioning-policy.html | 8
 versioning-policy.md        | 8
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/site/versioning-policy.html b/site/versioning-policy.html
index 0851b5be3..437149a49 100644
--- a/site/versioning-policy.html
+++ b/site/versioning-policy.html
@@ -250,7 +250,7 @@ available APIs.
 generally be released about 6 months after 2.2.0. Maintenance releases happen as needed
 in between feature releases. Major releases do not happen according to a fixed schedule.
-Spark 3.3 release window
+Spark 3.4 release window
@@ -261,15 +261,15 @@ in between feature releases. Major releases do not happen according to a fixed s
-March 15th 2022
+January 15th 2023
 Code freeze. Release branch cut.
-Late March 2022
+Late January 2023
 QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged.
-April 2022
+February 2023
 Release candidates (RC), voting, etc. until final release passes

diff --git a/versioning-policy.md b/versioning-policy.md
index 55a0bd331..c1136de67 100644
--- a/versioning-policy.md
+++ b/versioning-policy.md
@@ -103,13 +103,13 @@ In general, feature ("minor") releases occur about every 6 months. Hence, Spark
 generally be released about 6 months after 2.2.0. Maintenance releases happen as needed
 in between feature releases. Major releases do not happen according to a fixed schedule.
-Spark 3.3 release window
+Spark 3.4 release window

 | Date | Event |
 | - | - |
-| March 15th 2022 | Code freeze. Release branch cut.|
-| Late March 2022 | QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged.|
-| April 2022 | Release candidates (RC), voting, etc. until final release passes|
+| January 15th 2023 | Code freeze. Release branch cut.|
+| Late January 2023 | QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged.|
+| February 2023 | Release candidates (RC), voting, etc. until final release passes|

 Maintenance releases and EOL

---
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
[GitHub] [spark-website] dongjoon-hyun merged pull request #407: Update Spark 3.4 release window
dongjoon-hyun merged PR #407:
URL: https://github.com/apache/spark-website/pull/407

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
[GitHub] [spark-website] dongjoon-hyun commented on pull request #407: Update Spark 3.4 release window
dongjoon-hyun commented on PR #407:
URL: https://github.com/apache/spark-website/pull/407#issuecomment-1193022384

   According to the discussion on the dev mailing list, I'll merge this
[spark] branch master updated: [SPARK-39784][SQL] Put Literal values on the right side of the data source filter after translating Catalyst Expression to data source filter
This is an automated email from the ASF dual-hosted git repository.

huaxingao pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 2e2b1ae1021  [SPARK-39784][SQL] Put Literal values on the right side of the data source filter after translating Catalyst Expression to data source filter
2e2b1ae1021 is described below

commit 2e2b1ae1021bc4bc99f9749e05e4770be3aec43f
Author: huaxingao
AuthorDate: Fri Jul 22 13:49:00 2022 -0700

    [SPARK-39784][SQL] Put Literal values on the right side of the data source filter after translating Catalyst Expression to data source filter

    ### What changes were proposed in this pull request?
    A literal value may appear on either side of a comparison: both `a > 1` and `1 < a` are valid. However, after translating a Catalyst Expression to a data source filter, we want the literal on the right side so the filter is easier for the data source to handle. This normalization is already done for V1 Filters; V2 Filters should behave the same way. Before this PR, a filter with the literal on the left, e.g. `1 > a`, was kept as is. After this PR, it is normalized to `a < 1`, so the data source does not need to inspect each filter and flip the sides itself.

    ### Why are the changes needed?
    V2 Filters should follow the V1 Filter behavior: normalize filters at Catalyst-Expression-to-DS-Filter translation time so literal values end up on the right side, sparing data sources from checking every single filter to figure out whether it needs to flip the sides.

    ### Does this PR introduce _any_ user-facing change?
    No.

    ### How was this patch tested?
    New test.

    Closes #37197 from huaxingao/flip.
    Authored-by: huaxingao
    Signed-off-by: huaxingao
---
 .../sql/catalyst/util/V2ExpressionBuilder.scala    | 21 +++
 .../datasources/v2/DataSourceV2StrategySuite.scala | 67 +-
 2 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala
index 8bb65a88044..59cbcf48334 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/catalyst/util/V2ExpressionBuilder.scala
@@ -233,6 +233,10 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
     val r = generateExpression(b.right)
     if (l.isDefined && r.isDefined) {
       b match {
+        case _: Predicate if isBinaryComparisonOperator(b.sqlOperator) &&
+            l.get.isInstanceOf[LiteralValue[_]] && r.get.isInstanceOf[FieldReference] =>
+          Some(new V2Predicate(flipComparisonOperatorName(b.sqlOperator),
+            Array[V2Expression](r.get, l.get)))
         case _: Predicate =>
           Some(new V2Predicate(b.sqlOperator, Array[V2Expression](l.get, r.get)))
         case _ =>
@@ -408,6 +412,23 @@ class V2ExpressionBuilder(e: Expression, isPredicate: Boolean = false) {
     }
     case _ => None
   }
+
+  private def isBinaryComparisonOperator(operatorName: String): Boolean = {
+    operatorName match {
+      case ">" | "<" | ">=" | "<=" | "=" | "<=>" => true
+      case _ => false
+    }
+  }
+
+  private def flipComparisonOperatorName(operatorName: String): String = {
+    operatorName match {
+      case ">" => "<"
+      case "<" => ">"
+      case ">=" => "<="
+      case "<=" => ">="
+      case _ => operatorName
+    }
+  }
 }

 object ColumnOrField {

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2StrategySuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2StrategySuite.scala
index 66dc65cf681..c3f51bed269 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2StrategySuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2StrategySuite.scala
@@ -18,14 +18,77 @@ package org.apache.spark.sql.execution.datasources.v2

 import org.apache.spark.sql.catalyst.dsl.expressions._
-import org.apache.spark.sql.catalyst.expressions.Expression
+import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans.PlanTest
 import org.apache.spark.sql.connector.expressions.{FieldReference, LiteralValue}
 import org.apache.spark.sql.connector.expressions.filter.Predicate
 import org.apache.spark.sql.test.SharedSparkSession
-import org.apache.spark.sql.types.BooleanType
+import org.apache.spark.sql.types.{BooleanType, IntegerType, StringType, StructField, StructType}

 class DataSourceV2StrategySuite extends PlanTest with SharedSparkSession
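The normalization implemented in the commit above can be sketched outside Spark in a few lines. The sketch below is illustrative, not Spark's actual API: tagged tuples stand in for LiteralValue and FieldReference, and only the operator table mirrors the patch.

```python
# Flip table from the patch: ">" <-> "<", ">=" <-> "<=".
# Symmetric operators ("=", "<=>") keep their name; only the sides swap.
FLIPPED = {">": "<", "<": ">", ">=": "<=", "<=": ">="}
COMPARISON_OPS = {">", "<", ">=", "<=", "=", "<=>"}


def normalize(op, left, right):
    """Rewrite `literal op column` as `column flipped_op literal`.

    Operands are illustrative tagged tuples, e.g. ("lit", 1) or ("col", "a").
    Filters that already have the literal on the right are left untouched.
    """
    if op in COMPARISON_OPS and left[0] == "lit" and right[0] == "col":
        return FLIPPED.get(op, op), right, left
    return op, left, right


# "1 > a" is normalized to "a < 1"; "a < 1" stays as is.
print(normalize(">", ("lit", 1), ("col", "a")))  # ('<', ('col', 'a'), ('lit', 1))
print(normalize("<", ("col", "a"), ("lit", 1)))  # ('<', ('col', 'a'), ('lit', 1))
```

As in the patch, the flip happens once at translation time, so a data source consuming the filters never has to check which side the literal is on.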
[spark] branch master updated: [SPARK-38597][K8S][INFRA] Enable Spark on K8S integration tests
This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git

The following commit(s) were added to refs/heads/master by this push:
     new 5e6aab49a04  [SPARK-38597][K8S][INFRA] Enable Spark on K8S integration tests
5e6aab49a04 is described below

commit 5e6aab49a046c19e85f2177df440c38c7277dc08
Author: Yikun Jiang
AuthorDate: Fri Jul 22 09:21:08 2022 -0700

    [SPARK-38597][K8S][INFRA] Enable Spark on K8S integration tests

    ### What changes were proposed in this pull request?
    Enable the Spark on K8S integration tests in GitHub Actions, based on minikube:
    - The K8S IT is always triggered in user fork repos and for `apache/spark` commits merged to the master branch.
    - This PR does NOT contain the Volcano-related tests, due to the limited resources of GitHub Actions.
    - minikube installation is allowed by Apache Infra: [INFRA-23000](https://issues.apache.org/jira/projects/INFRA/issues/INFRA-23000)
    - Why set the driver to 0.5 CPU and each executor to 0.2 CPU?
      * GitHub-hosted runner hardware is limited ([2 CPUs / 7 GB](https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources)), so CPU is very scarce.
      * IT job available CPU = 2U - 0.85U (K8S deploy) = 1.15U.
      * The 1.15 CPU left after the K8S installation meets the requirement of the K8S tests (one driver plus up to 3 executors).
      * For memory: 6947 MB is the maximum (otherwise minikube raises `Exiting due to RSRC_OVER_ALLOC_MEM: Requested memory allocation 7168MB is more than your system limit 6947MB.`), but since that is not an integer multiple of 1024, it is set to 6144 MB for cleaner resource accounting.
    - Time cost:
      * 14 mins to compile the related code.
      * 3 mins to build the Docker images.
      * 20-30 mins to run the tests.
      * Total: about 30-40 mins.

    ### Why are the changes needed?
    This improves the efficiency of K8S development and provides a level of quality assurance for Spark on K8S and the Spark Docker image.
    ### Does this PR introduce _any_ user-facing change?
    No, dev only.

    ### How was this patch tested?
    CI passed.

    Closes #35830
    Closes #37244 from Yikun/SPARK-38597-k8s-it.

    Authored-by: Yikun Jiang
    Signed-off-by: Dongjoon Hyun
---
 .github/workflows/build_and_test.yml | 73 +++-
 1 file changed, 72 insertions(+), 1 deletion(-)

diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 02b799891fd..1902468e90c 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -99,7 +99,8 @@ jobs:
           \"docker-integration-tests\": \"$docker\",
           \"scala-213\": \"true\",
           \"java-11-17\": \"true\",
-          \"lint\" : \"true\"
+          \"lint\" : \"true\",
+          \"k8s-integration-tests\" : \"true\",
         }"
         echo $precondition # For debugging
         # GitHub Actions set-output doesn't take newlines
@@ -869,3 +870,73 @@ jobs:
       with:
         name: unit-tests-log-docker-integration--8-${{ inputs.hadoop }}-hive2.3
         path: "**/target/unit-tests.log"
+
+  k8s-integration-tests:
+    needs: precondition
+    if: fromJson(needs.precondition.outputs.required).k8s-integration-tests == 'true'
+    name: Run Spark on Kubernetes Integration test
+    runs-on: ubuntu-20.04
+    steps:
+      - name: Checkout Spark repository
+        uses: actions/checkout@v2
+        with:
+          fetch-depth: 0
+          repository: apache/spark
+          ref: ${{ inputs.branch }}
+      - name: Sync the current branch with the latest in Apache Spark
+        if: github.repository != 'apache/spark'
+        run: |
+          echo "APACHE_SPARK_REF=$(git rev-parse HEAD)" >> $GITHUB_ENV
+          git fetch https://github.com/$GITHUB_REPOSITORY.git ${GITHUB_REF#refs/heads/}
+          git -c user.name='Apache Spark Test Account' -c user.email='sparktest...@gmail.com' merge --no-commit --progress --squash FETCH_HEAD
+          git -c user.name='Apache Spark Test Account' -c user.email='sparktest...@gmail.com' commit -m "Merged commit" --allow-empty
+      - name: Cache Scala, SBT and Maven
+        uses: actions/cache@v2
+        with:
+          path: |
+            build/apache-maven-*
+            build/scala-*
+            build/*.jar
+            ~/.sbt
+          key: build-${{ hashFiles('**/pom.xml', 'project/build.properties', 'build/mvn', 'build/sbt', 'build/sbt-launch-lib.bash', 'build/spark-build-info') }}
+          restore-keys: |
+            build-
+      - name: Cache Coursier local repository
+        uses: actions/cache@v2
+        with:
+          path: ~/.cache/coursier
+          key: k8s-integration-coursier-${{ hashFiles('**/pom.xml', '**/plugins.sbt') }}
+          restore-keys: |
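The CPU budgeting worked out in the SPARK-38597 commit message above (2 runner CPUs, roughly 0.85 consumed by the K8S deployment, leaving 1.15 for one 0.5-CPU driver and up to three 0.2-CPU executors) can be sanity-checked with a short sketch; the helper name is illustrative, not part of any Spark tooling.

```python
def fits_cpu_budget(total_cpus, k8s_overhead, driver_cpu, executor_cpu, max_executors):
    """Check that the driver plus all executors fit in the CPU left over
    after the K8S control plane is deployed on the runner."""
    available = total_cpus - k8s_overhead
    needed = driver_cpu + executor_cpu * max_executors
    return needed <= available


# 2 CPUs - 0.85 (K8S deploy) = 1.15 available;
# 0.5 (driver) + 3 * 0.2 (executors) = 1.1 needed, which fits.
print(fits_cpu_budget(2.0, 0.85, 0.5, 0.2, 3))  # True
```

A fourth 0.2-CPU executor (1.3 CPUs needed) would exceed the 1.15-CPU budget, which is consistent with the commit's cap of three executors.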