[spark] branch master updated: [SPARK-37139][PYTHON][FOLLOWUP] Fix class variable type hints in taskcontext.py
This is an automated email from the ASF dual-hosted git repository. gurwls223 pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new fd6079d [SPARK-37139][PYTHON][FOLLOWUP] Fix class variable type hints in taskcontext.py fd6079d is described below commit fd6079d72e6579d85066a430d30d082259165c0e Author: Takuya UESHIN AuthorDate: Sat Nov 13 15:33:45 2021 +0900 [SPARK-37139][PYTHON][FOLLOWUP] Fix class variable type hints in taskcontext.py ### What changes were proposed in this pull request? Fixes class variable type hints in taskcontext.py. ### Why are the changes needed? In `taskcontext.py`, the variable `_taskContext` should be annotated as a `ClassVar` of `TaskContext`. Also, `_port` and `_secret` of `BarrierTaskContext` should be annotated. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? `lint-python` should pass. Closes #34574 from ueshin/issues/SPARK-37139/taskcontext. Authored-by: Takuya UESHIN Signed-off-by: Hyukjin Kwon --- python/pyspark/taskcontext.py | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/python/pyspark/taskcontext.py b/python/pyspark/taskcontext.py index ebfcfcd..7333a68 100644 --- a/python/pyspark/taskcontext.py +++ b/python/pyspark/taskcontext.py @@ -14,7 +14,7 @@ # See the License for the specific language governing permissions and # limitations under the License. # -from typing import Type, Dict, List, Optional, Union, cast +from typing import ClassVar, Type, Dict, List, Optional, Union, cast from pyspark.java_gateway import local_connect_and_auth from pyspark.resource import ResourceInformation @@ -29,7 +29,7 @@ class TaskContext(object): :meth:`TaskContext.get`.
""" -_taskContext: Optional["TaskContext"] = None +_taskContext: ClassVar[Optional["TaskContext"]] = None _attemptNumber: Optional[int] = None _partitionId: Optional[int] = None @@ -171,8 +171,8 @@ class BarrierTaskContext(TaskContext): This API is experimental """ -_port = None -_secret = None +_port: ClassVar[Optional[Union[str, int]]] = None +_secret: ClassVar[Optional[str]] = None @classmethod def _getOrCreate(cls: Type["BarrierTaskContext"]) -> "BarrierTaskContext": - To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org For additional commands, e-mail: commits-h...@spark.apache.org
[spark] branch master updated: [SPARK-37282][TESTS][FOLLOWUP] Mark `YarnShuffleServiceSuite` as ExtendedLevelDBTest
This is an automated email from the ASF dual-hosted git repository. sarutak pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 8b45a08 [SPARK-37282][TESTS][FOLLOWUP] Mark `YarnShuffleServiceSuite` as ExtendedLevelDBTest 8b45a08 is described below commit 8b45a08b763e9ee6c75b039893af3de5e5167643 Author: Dongjoon Hyun AuthorDate: Sat Nov 13 15:21:59 2021 +0900 [SPARK-37282][TESTS][FOLLOWUP] Mark `YarnShuffleServiceSuite` as ExtendedLevelDBTest ### What changes were proposed in this pull request? This PR is a follow-up of #34548. This suite was missed due to the `-Pyarn` profile. ### Why are the changes needed? This is required for the `yarn` module to pass on Apple Silicon. **BEFORE** ``` $ build/sbt "yarn/test" ... [info] YarnShuffleServiceSuite: [info] org.apache.spark.network.yarn.YarnShuffleServiceSuite *** ABORTED *** (20 milliseconds) [info] java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [no leveldbjni64-1.8 ... ``` **AFTER** ``` $ build/sbt "yarn/test" -Pyarn -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest ... [info] Run completed in 4 minutes, 57 seconds. [info] Total number of tests run: 135 [info] Suites: completed 18, aborted 0 [info] Tests: succeeded 135, failed 0, canceled 1, ignored 0, pending 0 [info] All tests passed. [success] Total time: 319 s (05:19), completed Nov 12, 2021, 4:53:14 PM ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? A manual test on Apple Silicon. ``` $ build/sbt "yarn/test" -Pyarn -Dtest.exclude.tags=org.apache.spark.tags.ExtendedLevelDBTest ``` Closes #34576 from dongjoon-hyun/SPARK-37282-2.
Authored-by: Dongjoon Hyun Signed-off-by: Kousuke Saruta --- .../scala/org/apache/spark/network/yarn/YarnShuffleServiceSuite.scala | 2 ++ 1 file changed, 2 insertions(+) diff --git a/resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceSuite.scala b/resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceSuite.scala index b2025aa..38d2247 100644 --- a/resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceSuite.scala +++ b/resource-managers/yarn/src/test/scala/org/apache/spark/network/yarn/YarnShuffleServiceSuite.scala @@ -46,8 +46,10 @@ import org.apache.spark.internal.config._ import org.apache.spark.network.shuffle.{NoOpMergedShuffleFileManager, RemoteBlockPushResolver, ShuffleTestAccessor} import org.apache.spark.network.shuffle.protocol.ExecutorShuffleInfo import org.apache.spark.network.util.TransportConf +import org.apache.spark.tags.ExtendedLevelDBTest import org.apache.spark.util.Utils +@ExtendedLevelDBTest class YarnShuffleServiceSuite extends SparkFunSuite with Matchers with BeforeAndAfterEach { private[yarn] var yarnConfig: YarnConfiguration = null private[yarn] val SORT_MANAGER = "org.apache.spark.shuffle.sort.SortShuffleManager"
[spark] branch master updated: [SPARK-37312][TESTS] Add `.java-version` to `.gitignore` and `.rat-excludes`
This is an automated email from the ASF dual-hosted git repository. sarutak pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new d0eb621 [SPARK-37312][TESTS] Add `.java-version` to `.gitignore` and `.rat-excludes` d0eb621 is described below commit d0eb62179822c82596c4feaa412f2fdf5b83c02a Author: Dongjoon Hyun AuthorDate: Sat Nov 13 14:43:52 2021 +0900 [SPARK-37312][TESTS] Add `.java-version` to `.gitignore` and `.rat-excludes` ### What changes were proposed in this pull request? To support Java 8/11/17 testing more easily, this PR aims to add `.java-version` to `.gitignore` and `.rat-excludes`. ### Why are the changes needed? When we use `jenv`, `dev/check-license` and `dev/run-tests` fail. ``` Running Apache RAT checks Could not find Apache license headers in the following files: !? /Users/dongjoon/APACHE/spark-merge/.java-version [error] running /Users/dongjoon/APACHE/spark-merge/dev/check-license ; received return code 1 ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? ``` $ jenv local 17 $ dev/check-license ``` Closes #34577 from dongjoon-hyun/SPARK-37312. Authored-by: Dongjoon Hyun Signed-off-by: Kousuke Saruta --- .gitignore | 1 + dev/.rat-excludes | 1 + 2 files changed, 2 insertions(+) diff --git a/.gitignore b/.gitignore index 1a7881a..560265e 100644 --- a/.gitignore +++ b/.gitignore @@ -7,6 +7,7 @@ *.pyo *.swp *~ +.java-version .DS_Store .bsp/ .cache diff --git a/dev/.rat-excludes b/dev/.rat-excludes index a35d4ce..7932c5d 100644 --- a/dev/.rat-excludes +++ b/dev/.rat-excludes @@ -10,6 +10,7 @@ cache .generated-mima-member-excludes .rat-excludes .*md +.java-version derby.log licenses/* licenses-binary/*
[spark] branch master updated (aaf0e5e -> 8e05c78)
This is an automated email from the ASF dual-hosted git repository. ueshin pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from aaf0e5e [SPARK-37292][SQL][FOLLOWUP] Simplify the condition when removing outer join if it only has DISTINCT on streamed side add 8e05c78 [SPARK-37298][SQL] Use unique exprIds in RewriteAsOfJoin No new revisions were added by this update. Summary of changes: .../spark/sql/catalyst/optimizer/RewriteAsOfJoin.scala | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-)
[spark] branch master updated: [SPARK-37292][SQL][FOLLOWUP] Simplify the condition when removing outer join if it only has DISTINCT on streamed side
This is an automated email from the ASF dual-hosted git repository. viirya pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new aaf0e5e [SPARK-37292][SQL][FOLLOWUP] Simplify the condition when removing outer join if it only has DISTINCT on streamed side aaf0e5e is described below commit aaf0e5e71509a2324e110e45366b753c7926c64b Author: Yuming Wang AuthorDate: Fri Nov 12 14:30:47 2021 -0800 [SPARK-37292][SQL][FOLLOWUP] Simplify the condition when removing outer join if it only has DISTINCT on streamed side ### What changes were proposed in this pull request? Simplify the condition when removing outer join if it only has DISTINCT on streamed side with alias. See: https://github.com/apache/spark/pull/34557#discussion_r748005299. ### Why are the changes needed? Simplify the code. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing unit test. Closes #34573 from wangyum/SPARK-37292-2. 
Authored-by: Yuming Wang Signed-off-by: Liang-Chi Hsieh --- .../main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala index f9f2e83..e03360d 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala @@ -179,12 +179,10 @@ object EliminateOuterJoin extends Rule[LogicalPlan] with PredicateHelper { if a.groupOnly && a.references.subsetOf(right.outputSet) => a.copy(child = right) case a @ Aggregate(_, _, p @ Project(_, Join(left, _, LeftOuter, _, _))) -if a.groupOnly && p.outputSet.subsetOf(a.references) && - AttributeSet(p.projectList.flatMap(_.references)).subsetOf(left.outputSet) => +if a.groupOnly && p.references.subsetOf(left.outputSet) => a.copy(child = p.copy(child = left)) case a @ Aggregate(_, _, p @ Project(_, Join(_, right, RightOuter, _, _))) -if a.groupOnly && p.outputSet.subsetOf(a.references) && - AttributeSet(p.projectList.flatMap(_.references)).subsetOf(right.outputSet) => +if a.groupOnly && p.references.subsetOf(right.outputSet) => a.copy(child = p.copy(child = right)) } }
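The simplification in the diff above relies on a definitional fact: in Catalyst, a plan node's `references` is the union of the references of all its expressions, so for a `Project` node `p.references` denotes the same attribute set as `AttributeSet(p.projectList.flatMap(_.references))`. A toy Python model (hypothetical names, not Spark's actual API) illustrating that equivalence:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    # Names of the attributes this expression refers to, e.g. {"a", "b"}.
    references: frozenset


@dataclass
class Project:
    project_list: list

    @property
    def references(self) -> set:
        # Mirrors Catalyst's QueryPlan.references: the union of the
        # references of all expressions in the node.
        out: set = set()
        for e in self.project_list:
            out |= e.references
        return out


# Example: SELECT a, a + b FROM ... where the left side outputs a, b, c.
p = Project([Expression(frozenset({"a"})), Expression(frozenset({"a", "b"}))])
left_output = {"a", "b", "c"}

# The old condition built the same set explicitly from the project list:
old_style = set().union(*(e.references for e in p.project_list))
assert old_style == p.references     # the two formulations agree
assert p.references <= left_output   # the subsetOf(left.outputSet) check
```

This is only a sketch of the set identity behind the rewrite; the dropped `p.outputSet.subsetOf(a.references)` conjunct is a separate change discussed in the linked PR review thread.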
[spark] branch branch-3.2 updated: [SPARK-37302][BUILD] Explicitly downloads guava and jetty-io in test-dependencies.sh
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new 0c99f13 [SPARK-37302][BUILD] Explicitly downloads guava and jetty-io in test-dependencies.sh 0c99f13 is described below commit 0c99f137dac34de98d647d2919f4ca14f683b890 Author: Kousuke Saruta AuthorDate: Fri Nov 12 11:14:29 2021 -0800 [SPARK-37302][BUILD] Explicitly downloads guava and jetty-io in test-dependencies.sh ### What changes were proposed in this pull request? This PR changes `dev/test-dependencies.sh` to download `guava` and `jetty-io` explicitly. `dev/run-tests.py` fails if Scala 2.13 is used and `guava` or `jetty-io` is not in both the Maven and Coursier local repositories. ``` $ rm -rf ~/.m2/repository/* $ # For Linux $ rm -rf ~/.cache/coursier/v1/* $ # For macOS $ rm -rf ~/Library/Caches/Coursier/v1/* $ dev/change-scala-version.sh 2.13 $ dev/test-dependencies.sh $ build/sbt -Pscala-2.13 clean compile ... [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1: error: package com.google.common.primitives does not exist [error] import com.google.common.primitives.Ints; [error]^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1: error: package com.google.common.annotations does not exist [error] import com.google.common.annotations.VisibleForTesting; [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1: error: package com.google.common.base does not exist [error] import com.google.common.base.Preconditions; ...
``` ``` [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25: Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub. [error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with (org.eclipse.jetty.server.Server, Null, org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, org.eclipse.jetty.server.HttpConnectionFactory) [error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13: Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with a stub. 
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), null) [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with
[spark] branch master updated: [SPARK-37302][BUILD] Explicitly downloads guava and jetty-io in test-dependencies.sh
This is an automated email from the ASF dual-hosted git repository. dongjoon pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 01ab0be [SPARK-37302][BUILD] Explicitly downloads guava and jetty-io in test-dependencies.sh 01ab0be is described below commit 01ab0bedb2ad423963ebff78272e0d045b0b9107 Author: Kousuke Saruta AuthorDate: Fri Nov 12 11:14:29 2021 -0800 [SPARK-37302][BUILD] Explicitly downloads guava and jetty-io in test-dependencies.sh ### What changes were proposed in this pull request? This PR changes `dev/test-dependencies.sh` to download `guava` and `jetty-io` explicitly. `dev/run-tests.py` fails if Scala 2.13 is used and `guava` or `jetty-io` is not in both the Maven and Coursier local repositories. ``` $ rm -rf ~/.m2/repository/* $ # For Linux $ rm -rf ~/.cache/coursier/v1/* $ # For macOS $ rm -rf ~/Library/Caches/Coursier/v1/* $ dev/change-scala-version.sh 2.13 $ dev/test-dependencies.sh $ build/sbt -Pscala-2.13 clean compile ... [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java:24:1: error: package com.google.common.primitives does not exist [error] import com.google.common.primitives.Ints; [error]^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:30:1: error: package com.google.common.annotations does not exist [error] import com.google.common.annotations.VisibleForTesting; [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java:31:1: error: package com.google.common.base does not exist [error] import com.google.common.base.Preconditions; ...
``` ``` [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:25: Class org.eclipse.jetty.io.ByteBufferPool not found - continuing with a stub. [error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala:87:21: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with (org.eclipse.jetty.server.Server, Null, org.eclipse.jetty.util.thread.ScheduledExecutorScheduler, Null, Int, Int, org.eclipse.jetty.server.HttpConnectionFactory) [error] val connector = new ServerConnector( [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:207:13: Class org.eclipse.jetty.io.ClientConnectionFactory not found - continuing with a stub. 
[error] new HttpClient(new HttpClientTransportOverHTTP(numSelectors), null) [error] ^ [error] /home/kou/work/oss/spark-scala-2.13/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala:287:25: multiple constructors for ServerConnector with alternatives: [error] (x$1: org.eclipse.jetty.server.Server,x$2: java.util.concurrent.Executor,x$3: org.eclipse.jetty.util.thread.Scheduler,x$4: org.eclipse.jetty.io.ByteBufferPool,x$5: Int,x$6: Int,x$7: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.util.ssl.SslContextFactory,x$3: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] (x$1: org.eclipse.jetty.server.Server,x$2: Int,x$3: Int,x$4: org.eclipse.jetty.server.ConnectionFactory*)org.eclipse.jetty.server.ServerConnector [error] cannot be invoked with
[spark] branch branch-3.2 updated: [SPARK-37702][SQL] Use AnalysisContext to track referred temp functions
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch branch-3.2 in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/branch-3.2 by this push: new b7ec10a [SPARK-37702][SQL] Use AnalysisContext to track referred temp functions b7ec10a is described below commit b7ec10afec4bd8a53e0ec06f90be1216b44e9538 Author: Linhong Liu AuthorDate: Fri Nov 12 22:20:02 2021 +0800 [SPARK-37702][SQL] Use AnalysisContext to track referred temp functions ### What changes were proposed in this pull request? This PR uses `AnalysisContext` to track the referred temp functions in order to fix a temp function resolution issue when a function is registered with a `FunctionBuilder` and referred to by a temp view. During temporary view creation, we need to collect all the temp functions and save them to the metadata so that, the next time the view SQL text is resolved, the functions can be resolved correctly. But if a temp function is registered with a `FunctionBuilder`, it's not a `UserDefinedExpression`, so it cannot be collected as a temp function. As a result, the next time the analyzer resolves the temp view, the registered function cannot be found. Example: ```scala val func = CatalogFunction(FunctionIdentifier("tempFunc", None), ...) val builder = (e: Seq[Expression]) => e.head spark.sessionState.catalog.registerFunction(func, Some(builder)) sql("create temp view tv as select tempFunc(a, b) from values (1, 2) t(a, b)") sql("select * from tv").collect() ``` ### Why are the changes needed? Bug fix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Newly added test cases. Closes #34546 from linhongliu-db/SPARK-37702-ver3.
Authored-by: Linhong Liu Signed-off-by: Wenchen Fan (cherry picked from commit 68a0ab5960e847e0fa1a59da0316d0c111574af4) Signed-off-by: Wenchen Fan --- .../spark/sql/catalyst/analysis/Analyzer.scala | 21 -- .../sql/catalyst/catalog/SessionCatalog.scala | 5 ++ .../spark/sql/catalyst/plans/logical/Command.scala | 5 +- .../sql/catalyst/plans/logical/v2Commands.scala| 8 +-- .../apache/spark/sql/execution/command/views.scala | 82 -- .../spark/sql/execution/datasources/ddl.scala | 6 +- .../spark/sql/SparkSessionExtensionSuite.scala | 10 +++ .../spark/sql/execution/SQLViewTestSuite.scala | 26 ++- .../execution/command/PlanResolutionSuite.scala| 6 +- 9 files changed, 111 insertions(+), 58 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index fa6b247..89c7b5f 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -125,7 +125,11 @@ case class AnalysisContext( maxNestedViewDepth: Int = -1, relationCache: mutable.Map[Seq[String], LogicalPlan] = mutable.Map.empty, referredTempViewNames: Seq[Seq[String]] = Seq.empty, -referredTempFunctionNames: Seq[String] = Seq.empty, +// 1. If we are resolving a view, this field will be restored from the view metadata, +//by calling `AnalysisContext.withAnalysisContext(viewDesc)`. +// 2. If we are not resolving a view, this field will be updated everytime the analyzer +//lookup a temporary function. And export to the view metadata. 
+referredTempFunctionNames: mutable.Set[String] = mutable.Set.empty, outerPlan: Option[LogicalPlan] = None) object AnalysisContext { @@ -152,11 +156,17 @@ object AnalysisContext { maxNestedViewDepth, originContext.relationCache, viewDesc.viewReferredTempViewNames, - viewDesc.viewReferredTempFunctionNames) + mutable.Set(viewDesc.viewReferredTempFunctionNames: _*)) set(context) try f finally { set(originContext) } } + def withNewAnalysisContext[A](f: => A): A = { +val originContext = value.get() +reset() +try f finally { set(originContext) } + } + def withOuterPlan[A](outerPlan: LogicalPlan)(f: => A): A = { val originContext = value.get() val context = originContext.copy(outerPlan = Some(outerPlan)) @@ -204,11 +214,8 @@ class Analyzer(override val catalogManager: CatalogManager) } override def execute(plan: LogicalPlan): LogicalPlan = { -AnalysisContext.reset() -try { +AnalysisContext.withNewAnalysisContext { executeSameContext(plan) -} finally { - AnalysisContext.reset() } } @@ -3741,7 +3748,7 @@ class Analyzer(override val catalogManager: CatalogManager) _.containsPattern(COMMAND)) { case c: AnalysisOnlyCommand if c.resolved => checkAnalysis(c) -c.markAsAnalyzed() +c.markAsAnalyzed(AnalysisContext.get) } } } diff --git
[spark] branch master updated: [SPARK-37304][SQL] Allow ANSI intervals in v2 `ALTER TABLE .. REPLACE COLUMNS`
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 71f4ee3 [SPARK-37304][SQL] Allow ANSI intervals in v2 `ALTER TABLE .. REPLACE COLUMNS` 71f4ee3 is described below commit 71f4ee38c71734128c5653b8f18a7d0bf1014b6b Author: Max Gekk AuthorDate: Fri Nov 12 17:23:36 2021 +0300 [SPARK-37304][SQL] Allow ANSI intervals in v2 `ALTER TABLE .. REPLACE COLUMNS` ### What changes were proposed in this pull request? In the PR, I propose to allow ANSI intervals (year-month and day-time intervals) in the `ALTER TABLE .. REPLACE COLUMNS` command for tables in v2 catalogs (v1 catalogs don't support the command). Also added a unified test suite to which related tests can be migrated in the future. ### Why are the changes needed? To improve user experience with Spark SQL. After the changes, users can replace columns with ANSI intervals instead of removing and re-adding such columns. ### Does this PR introduce _any_ user-facing change? In some sense, yes. After the changes, the command no longer outputs an error message. ### How was this patch tested? By running the new test suite: ``` $ build/sbt "test:testOnly *AlterTableReplaceColumnsSuite" ``` Closes #34571 from MaxGekk/add-replace-ansi-interval-col.
Authored-by: Max Gekk Signed-off-by: Max Gekk --- .../plans/logical/v2AlterTableCommands.scala | 2 +- .../AlterTableReplaceColumnsSuiteBase.scala| 54 ++ .../command/v2/AlterTableReplaceColumnsSuite.scala | 28 +++ 3 files changed, 83 insertions(+), 1 deletion(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala index 302a810..2eb828e 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2AlterTableCommands.scala @@ -134,7 +134,7 @@ case class ReplaceColumns( table: LogicalPlan, columnsToAdd: Seq[QualifiedColType]) extends AlterTableCommand { columnsToAdd.foreach { c => -TypeUtils.failWithIntervalType(c.dataType) +TypeUtils.failWithIntervalType(c.dataType, forbidAnsiIntervals = false) } override lazy val resolved: Boolean = table.resolved && columnsToAdd.forall(_.resolved) diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/AlterTableReplaceColumnsSuiteBase.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/AlterTableReplaceColumnsSuiteBase.scala new file mode 100644 index 000..fed4076 --- /dev/null +++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/AlterTableReplaceColumnsSuiteBase.scala @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.command + +import java.time.{Duration, Period} + +import org.apache.spark.sql.{QueryTest, Row} + +/** + * This base suite contains unified tests for the `ALTER TABLE .. REPLACE COLUMNS` command that + * check the V2 table catalog. The tests that cannot run for all supported catalogs are + * located in more specific test suites: + * + * - V2 table catalog tests: + * `org.apache.spark.sql.execution.command.v2.AlterTableReplaceColumnsSuite` + */ +trait AlterTableReplaceColumnsSuiteBase extends QueryTest with DDLCommandTestUtils { + override val command = "ALTER TABLE .. REPLACE COLUMNS" + + test("SPARK-37304: Replace columns by ANSI intervals") { +withNamespaceAndTable("ns", "tbl") { t => + sql(s"CREATE TABLE $t (ym INTERVAL MONTH, dt INTERVAL HOUR, data STRING) $defaultUsing") + // TODO(SPARK-37303): Uncomment the command below after REPLACE COLUMNS is fixed + // sql(s"INSERT INTO $t SELECT INTERVAL '1' MONTH, INTERVAL '2' HOUR, 'abc'") + sql( +s""" + |ALTER TABLE $t REPLACE COLUMNS ( + | new_ym INTERVAL
[spark] branch master updated: [SPARK-37702][SQL] Use AnalysisContext to track referred temp functions
This is an automated email from the ASF dual-hosted git repository. wenchen pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/spark.git The following commit(s) were added to refs/heads/master by this push: new 68a0ab5 [SPARK-37702][SQL] Use AnalysisContext to track referred temp functions 68a0ab5 is described below commit 68a0ab5960e847e0fa1a59da0316d0c111574af4 Author: Linhong Liu AuthorDate: Fri Nov 12 22:20:02 2021 +0800 [SPARK-37702][SQL] Use AnalysisContext to track referred temp functions ### What changes were proposed in this pull request? This PR uses `AnalysisContext` to track the referred temp functions in order to fix a temp function resolution issue when a function is registered with a `FunctionBuilder` and referred to by a temp view. During temporary view creation, we need to collect all the temp functions and save them to the metadata so that, the next time the view SQL text is resolved, the functions can be resolved correctly. But if a temp function is registered with a `FunctionBuilder`, it's not a `UserDefinedExpression`, so it cannot be collected as a temp function. As a result, the next time the analyzer resolves the temp view, the registered function cannot be found. Example: ```scala val func = CatalogFunction(FunctionIdentifier("tempFunc", None), ...) val builder = (e: Seq[Expression]) => e.head spark.sessionState.catalog.registerFunction(func, Some(builder)) sql("create temp view tv as select tempFunc(a, b) from values (1, 2) t(a, b)") sql("select * from tv").collect() ``` ### Why are the changes needed? Bug fix. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Newly added test cases. Closes #34546 from linhongliu-db/SPARK-37702-ver3.
Authored-by: Linhong Liu Signed-off-by: Wenchen Fan --- .../spark/sql/catalyst/analysis/Analyzer.scala | 21 -- .../sql/catalyst/catalog/SessionCatalog.scala | 5 ++ .../spark/sql/catalyst/plans/logical/Command.scala | 5 +- .../sql/catalyst/plans/logical/v2Commands.scala| 8 +-- .../apache/spark/sql/execution/command/views.scala | 82 -- .../spark/sql/execution/datasources/ddl.scala | 6 +- .../spark/sql/SparkSessionExtensionSuite.scala | 10 +++ .../spark/sql/execution/SQLViewTestSuite.scala | 26 ++- .../execution/command/PlanResolutionSuite.scala| 6 +- 9 files changed, 111 insertions(+), 58 deletions(-) diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala index 068886e..26d206f 100644 --- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala +++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala @@ -125,7 +125,11 @@ case class AnalysisContext( maxNestedViewDepth: Int = -1, relationCache: mutable.Map[Seq[String], LogicalPlan] = mutable.Map.empty, referredTempViewNames: Seq[Seq[String]] = Seq.empty, -referredTempFunctionNames: Seq[String] = Seq.empty, +// 1. If we are resolving a view, this field will be restored from the view metadata, +//by calling `AnalysisContext.withAnalysisContext(viewDesc)`. +// 2. If we are not resolving a view, this field will be updated everytime the analyzer +//lookup a temporary function. And export to the view metadata. 
+referredTempFunctionNames: mutable.Set[String] = mutable.Set.empty, outerPlan: Option[LogicalPlan] = None) object AnalysisContext { @@ -152,11 +156,17 @@ object AnalysisContext { maxNestedViewDepth, originContext.relationCache, viewDesc.viewReferredTempViewNames, - viewDesc.viewReferredTempFunctionNames) + mutable.Set(viewDesc.viewReferredTempFunctionNames: _*)) set(context) try f finally { set(originContext) } } + def withNewAnalysisContext[A](f: => A): A = { +val originContext = value.get() +reset() +try f finally { set(originContext) } + } + def withOuterPlan[A](outerPlan: LogicalPlan)(f: => A): A = { val originContext = value.get() val context = originContext.copy(outerPlan = Some(outerPlan)) @@ -204,11 +214,8 @@ class Analyzer(override val catalogManager: CatalogManager) } override def execute(plan: LogicalPlan): LogicalPlan = { -AnalysisContext.reset() -try { +AnalysisContext.withNewAnalysisContext { executeSameContext(plan) -} finally { - AnalysisContext.reset() } } @@ -3651,7 +3658,7 @@ class Analyzer(override val catalogManager: CatalogManager) _.containsPattern(COMMAND)) { case c: AnalysisOnlyCommand if c.resolved => checkAnalysis(c) -c.markAsAnalyzed() +c.markAsAnalyzed(AnalysisContext.get) }
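The `withNewAnalysisContext` helper in the diff above replaces paired `reset()` calls with a save/run/restore discipline around a thread-local context, so the caller's context is restored even when analysis throws. A minimal Python sketch of that same pattern (hypothetical names, not PySpark's API):

```python
import contextlib
import threading

# Thread-local slot holding the "current analysis context" (a plain dict here).
_state = threading.local()


def get_context() -> dict:
    # Lazily create a fresh default context per thread.
    if not hasattr(_state, "ctx"):
        _state.ctx = {}
    return _state.ctx


@contextlib.contextmanager
def new_analysis_context():
    origin = get_context()   # save the caller's context
    _state.ctx = {}          # run the body against a fresh one
    try:
        yield get_context()
    finally:
        _state.ctx = origin  # always restore, even if the body raised


# The inner context is isolated from, and cannot clobber, the outer one.
get_context()["temp_funcs"] = {"outer"}
with new_analysis_context() as ctx:
    ctx["temp_funcs"] = {"inner"}
assert get_context()["temp_funcs"] == {"outer"}
```

The design point the commit makes is visible here: because restoration happens in a `finally` (Scala's `try f finally { set(originContext) }`), no code path can leak the temporary context into later analysis runs.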
[spark] branch master updated (9191632 -> a4b8a8d)
This is an automated email from the ASF dual-hosted git repository. maxgekk pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/spark.git. from 9191632 [SPARK-36825][FOLLOWUP] Move the test code from `ParquetIOSuite` to `ParquetFileFormatSuite` add a4b8a8d [SPARK-37294][SQL][TESTS] Check inserting of ANSI intervals into a table partitioned by the interval columns No new revisions were added by this update. Summary of changes: .../spark/sql/connector/DataSourceV2SQLSuite.scala | 35 +- .../org/apache/spark/sql/sources/InsertSuite.scala | 34 + 2 files changed, 68 insertions(+), 1 deletion(-)