srielau commented on code in PR #55523:
URL: https://github.com/apache/spark/pull/55523#discussion_r3142317605
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -981,7 +995,8 @@ class Analyzer(
// This is done by keeping the catalog and namespace in `AnalysisContext`,
and analyzer will
// look at `AnalysisContext.catalogAndNamespace` when resolving relations
with single-part name.
// If `AnalysisContext.catalogAndNamespace` is non-empty, analyzer will
expand single-part names
- // with it, instead of current catalog and namespace.
+ // with it, instead of current catalog and namespace. Unqualified relation
PATH is snapshotted
+ // in `AnalysisContext.resolutionPathEntries` while resolving each view
body.
Review Comment:
This comment claims unqualified relation PATH "is snapshotted in
`AnalysisContext.resolutionPathEntries` while resolving each view body," but
nothing in this PR sets that field — both
`AnalysisContext.withAnalysisContext(viewDesc)` and
`withAnalysisContext(function)` leave `resolutionPathEntries` at the default
`None`, and the PR description explicitly says "Frozen path analysis for
views/SQL functions ... comes in a follow-up PR." Either rephrase to make it
clear the wiring is forthcoming (e.g., "will be snapshotted ... in a
follow-up") or drop the sentence — as written, a reader following the trail
won't find the setter.
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -317,20 +329,22 @@ class Analyzer(
if (plan.analyzed) {
plan
} else {
+ def runAnalysis(): LogicalPlan = HybridAnalyzer.fromLegacyAnalyzer(
+ legacyAnalyzer = this, tracker = tracker).apply(plan)
+ def runWithConf(): LogicalPlan = sessionConf match {
+ case Some(c) => SQLConf.withExistingConf(c) { runAnalysis() }
+ case None => runAnalysis()
+ }
Review Comment:
`executeAndCheck` wraps the run in `SQLConf.withExistingConf(c)` when
`sessionConf = Some(c)`, but the public `execute` method just below (line
353-357) does not — it calls `executeSameContext` → `super.execute(plan)`
directly. Callers that go through `analyzer.execute` (e.g.,
`RelationalGroupedDataset.scala:647`, `HiveSerDeSuite`,
`WatermarkColumnResolutionSuite`, `CTEInlineSuite`, `DDLParserSuite`,
`AlignAssignmentsSuiteBase`, `PlanResolutionSuite`) therefore run the
resolution machinery against whatever `SQLConf.get` returns, not the captured
`sessionConf`. In production this is masked because the active session's conf
already matches, but it undermines the analyzer-isolation guarantee
`sessionConf` is meant to provide for tests like `ProtoToParsedPlanTestSuite`'s
`analyzerIsolationConf`. Pull the `withExistingConf` wrap into a helper that
both entry points share (or push it down into `executeSameContext`).
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -45,6 +43,11 @@ trait CheckAnalysis extends LookupCatalog with
QueryErrorsBase with PlanToString
protected def isView(nameParts: Seq[String]): Boolean
+ protected def conf: org.apache.spark.sql.internal.SQLConf
+
+ /** Conf for routine resolution/errors; override in Analyzer to use session
conf. */
+ protected def confForRoutineResolution: SQLConf = conf
Review Comment:
`confForRoutineResolution` is defined here and overridden in
`Analyzer.scala:305` (`override protected def confForRoutineResolution: SQLConf
= resolutionConf`), but `rg confForRoutineResolution` finds no read sites. Pure
dead code. Either delete the definition + override, or wire it into the actual
error sites — e.g., `case u: UnresolvedFunctionName` at
`CheckAnalysis.scala:408-423` currently reads
`catalogManager.sqlResolutionPathEntries` directly without going through this
hook, which is presumably why the override exists in the first place.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]