Re: [PR] [SPARK-56605][SQL] Wire resolution engine to use SQL PATH for table, function, and variable lookup [spark]

via GitHub Sun, 26 Apr 2026 09:28:08 -0700


cloud-fan commented on code in PR #55523:
URL: https://github.com/apache/spark/pull/55523#discussion_r3143783994



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala:
##########
@@ -68,15 +68,39 @@ case class UnresolvedView(
     allowTemp: Boolean,
     suggestAlternative: Boolean = false) extends UnresolvedLeafNode
 
+/**
+ * Controls which search path is shown in `TABLE_OR_VIEW_NOT_FOUND` for
+ * [[UnresolvedTableOrView]] (see 
[[org.apache.spark.sql.catalyst.analysis.CheckAnalysis]]).
+ */
+sealed trait UnresolvedTableOrViewSearchPathMode
+
+object UnresolvedTableOrViewSearchPathMode {
+  /** DDL on catalog objects: `system.session` and current catalog namespace 
only. */
+  case object Ddl extends UnresolvedTableOrViewSearchPathMode
+  /**
+   * Like `SELECT` / DML: full `sqlResolutionPathEntries` order; fully 
qualified
+   * `system.session.*` names still use the temp-view-only path in errors.
+   */
+  case object QueryLike extends UnresolvedTableOrViewSearchPathMode
+  /** Commands that only target temp views (e.g. some `DROP TEMPORARY VIEW` 
paths). */
+  case object TempViewOnly extends UnresolvedTableOrViewSearchPathMode

Review Comment:
   `TempViewOnly` has no callers — `rg 
'UnresolvedTableOrViewSearchPathMode\.TempViewOnly'` finds only this definition 
and the pattern match in `CheckAnalysis.scala:398`. The previous 
`commandName.toUpperCase.contains("TEMPORARY VIEW")` branch was already dead 
(no `UnresolvedTableOrView` is constructed with a commandName containing those 
words anywhere in the repo), and this enum case faithfully reproduces that dead 
branch. The prior round wired up `QueryLike` from `visitDescribeRelation`; 
`TempViewOnly` is the remaining unused case — either wire it from a real DDL 
path or drop the case until a use shows up.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -139,17 +139,22 @@ object FakeV2SessionCatalog extends TableCatalog with 
FunctionCatalog with Suppo
  *                              even if a temp view `t` has been created.
  * @param outerPlan The query plan from the outer query that can be used to 
resolve star
  *                  expressions in a subquery.
+ * @param resolutionPathEntries When resolving a view body, the ordered path 
for unqualified
+ *                              relation names (see 
[[AnalysisContext.withAnalysisContext]]).
+ *                              [[None]] outside views: compute from session
+ *                              [[CatalogManager.sqlResolutionPathEntries]].
  */
 case class AnalysisContext(
     isDefault: Boolean = false,
     catalogAndNamespace: Seq[String] = Nil,
+    resolutionPathEntries: Option[Seq[Seq[String]]] = None,
     nestedViewDepth: Int = 0,
     maxNestedViewDepth: Int = -1,
     relationCache: mutable.Map[(Seq[String], Option[TimeTravelSpec]), 
LogicalPlan] =
       mutable.Map.empty,
     referredTempViewNames: Seq[Seq[String]] = Seq.empty,
     // 1. If we are resolving a view, this field will be restored from the 
view metadata,
-    //    by calling `AnalysisContext.withAnalysisContext(viewDesc)`.
+    //    by calling `AnalysisContext.withAnalysisContext(viewDesc, 
catalogManager)`.

Review Comment:
   This comment was amended in this PR to `withAnalysisContext(viewDesc, 
catalogManager)`, but the actual signature at line 203 takes only `(viewDesc: 
CatalogTable)` — there is no `catalogManager` parameter and the implementation 
was not changed to add one.
   
   ```suggestion
       //    by calling `AnalysisContext.withAnalysisContext(viewDesc)`.
   ```



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionResolution.scala:
##########
@@ -592,6 +634,58 @@ class FunctionResolution(
       errorClass = errorClass,
       messageParameters = messageParameters)
   }
+
+  /**
+   * Resolves [[UnresolvedProcedure]] for `CALL` / `DESCRIBE PROCEDURE` using 
the same multipart
+   * candidates as SQL functions and relations ([[resolutionCandidates]] /
+   * [[sqlResolutionPathEntriesForAnalysis]]). Catalogs that do not implement
+   * [[ProcedureCatalog]] are skipped for unqualified names; an explicitly 
catalog-qualified name
+   * that targets a non-[[ProcedureCatalog]] still raises
+   * [[QueryCompilationErrors.missingCatalogProceduresAbilityError]].
+   */
+  def resolveProcedure(unresolved: UnresolvedProcedure): LogicalPlan = {
+    val candidates = resolutionCandidates(unresolved.nameParts)
+    for (multipart <- candidates) {
+      val expandedOpt =
+        try {
+          Some(relationResolution.expandIdentifier(multipart))
+        } catch {
+          case NonFatal(_) => None
+        }
+      expandedOpt.foreach { expanded =>
+        CatalogAndIdentifier.unapply(expanded).foreach { case (catalog, ident) 
=>
+          catalog match {
+            case pc: ProcedureCatalog =>
+              try {
+                val procedure = pc.loadProcedure(ident)
+                return ResolvedProcedure(pc, ident, procedure)
+              } catch {
+                case e: AnalysisException => throw e
+                case e: SparkThrowable => throw e
+                case NonFatal(e) =>
+                  val cause = e match {
+                    case ex: Exception => ex
+                    case th => new RuntimeException(th)
+                  }
+                  throw QueryCompilationErrors.failedToLoadRoutineError(
+                    catalog.name +: ident.asMultipartIdentifier,
+                    cause)

Review Comment:
   PATH iteration here is inconsistent with how tables and functions handle a 
missing candidate, and the procedure case is the broken one:
   
   - Tables: `RelationResolution.tryResolvePersistent` calls 
`CatalogV2Util.loadTable`, which catches `NoSuchTableException` / 
`NoSuchDatabaseException` and returns `None` (`CatalogV2Util.scala:457-462`) — 
the next `relationResolutionStep` is tried.
   - Functions: `resolveFunctionCandidate` catches `NoSuchFunctionException` / 
`NoSuchNamespaceException` / `CatalogNotFoundException` and returns `None` 
(`FunctionResolution.scala:161-170`) — `resolveFunction`'s for-loop continues.
   - Procedures (this block): every catch arm — `AnalysisException`, 
`SparkThrowable`, `NonFatal` — either rethrows or wraps in 
`failedToLoadRoutineError`, exiting the candidate loop on the first failure.
   
   Connector implementations typically signal not-found by throwing 
(`InMemoryTableCatalog.loadProcedure` throws plain `RuntimeException("Procedure 
not found: " + ident)`), so for `SET PATH = catA, catB` with only `catB.proc` 
defined, the user sees `failedToLoadRoutineError` on `catA.proc` instead of 
resolving to `catB.proc`. To match table/function semantics, the unqualified 
case should treat a candidate failure as "skip and continue" and only fall 
through to `unresolvedRoutineError` after exhausting the path. The sub-question 
is *what* signals not-found — `ProcedureCatalog.loadProcedure` has no 
documented exception type for it — so this likely needs either a defined 
`NoSuchProcedureException` (analogous to `NoSuchFunctionException`) or 
swallowing arbitrary candidate failures for unqualified names. A SetPathSuite 
test for the [catA(missing), catB(present)] case would lock down whichever 
behavior is chosen.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/RelationResolution.scala:
##########
@@ -110,34 +114,61 @@ class RelationResolution(
 
   /**
    * Scope in the relation resolution search path. Used to interpret
-   * [[SQLConf.resolutionSearchPath]] when resolving unqualified table/view 
names.
+   * [[CatalogManager.sqlResolutionPathEntries]] when resolving unqualified 
table/view names.
    */
-  private sealed trait RelationResolutionScope
-  private case object SessionScope extends RelationResolutionScope
-  private case object PersistentScope extends RelationResolutionScope
+  private sealed trait RelationResolutionStep
+  private case object SessionScopeStep extends RelationResolutionStep
+  private case class PersistentCatalogStep(catalogAndNamespace: Seq[String])
+      extends RelationResolutionStep
+
+  /**
+   * Path entries for unqualified relation resolution.
+   *
+   * Inside a view, [[AnalysisContext.resolutionPathEntries]] will be
+   * populated from the frozen path stored in view metadata (follow-up PR).
+   * When PATH is disabled, legacy resolution rules apply.
+   */
+  private def relationResolutionEntries: Seq[Seq[String]] = {
+    val pinned = AnalysisContext.get.resolutionPathEntries
+    if (pinned.isDefined && conf.pathEnabled) {
+      pinned.get
+    } else {
+      val expandCatalog = catalogManager.currentCatalog.name
+      val expandNamespace = catalogManager.currentNamespace.toSeq
+      val (pathCatalog, pathNamespace) =
+        if (isResolvingView) {
+          val p = AnalysisContext.get.catalogAndNamespace
+          (p.head, p.tail.toSeq)
+        } else {
+          (expandCatalog, expandNamespace)
+        }
+      catalogManager.sqlResolutionPathEntries(
+        pathCatalog,
+        pathNamespace,
+        expandCatalog,
+        expandNamespace)

Review Comment:
   `expandCatalog` / `expandNamespace` are always read from 
`catalogManager.currentCatalog.name` / `currentNamespace`, even inside a view 
body. So with PATH enabled + a stored session path, `CurrentSchemaEntry` 
markers expand using the *session's* current schema instead of the view's 
defining catalog/namespace. This is the mid-state called out in the PR 
description ("frozen path … in a follow-up PR"), but it is genuinely 
user-visible: between this PR and the follow-up, view bodies under PATH will 
resolve unqualified names against the wrong schema whenever the path includes 
`CurrentSchemaEntry` and the session has switched schema relative to the view 
definition. Worth either bypassing PATH inside views until frozen-path is 
plumbed (e.g., `if (isResolvingView) defaultPathOrder(...)`) or at least making 
the asymmetry explicit in the PR description so reviewers can weigh the 
trade-off.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -139,17 +139,22 @@ object FakeV2SessionCatalog extends TableCatalog with 
FunctionCatalog with Suppo
  *                              even if a temp view `t` has been created.
  * @param outerPlan The query plan from the outer query that can be used to 
resolve star
  *                  expressions in a subquery.
+ * @param resolutionPathEntries When resolving a view body, the ordered path 
for unqualified
+ *                              relation names (see 
[[AnalysisContext.withAnalysisContext]]).
+ *                              [[None]] outside views: compute from session
+ *                              [[CatalogManager.sqlResolutionPathEntries]].

Review Comment:
   The `(see [[AnalysisContext.withAnalysisContext]])` reference is misleading: 
`withAnalysisContext` (lines 203 and 224) does not populate 
`resolutionPathEntries` anywhere in this PR — the field defaults to `None` and 
is only ever read. The PR description explicitly says "Frozen path analysis for 
views/SQL functions … comes in a follow-up PR." A reader who follows the link 
will not find the population code the see-reference implies. Either drop the 
link until the follow-up adds the wiring, or reword to say the population is 
forthcoming.
   
   ```suggestion
    * @param resolutionPathEntries When resolving a view body, the ordered path 
for unqualified
    *                              relation names. Stays [[None]] in this PR; 
population from the
    *                              frozen path stored in view metadata is wired 
in a follow-up.
    *                              Outside views: compute from session
    *                              [[CatalogManager.sqlResolutionPathEntries]].
   ```



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -2078,9 +2101,14 @@ class Analyzer(
                   throw 
QueryCompilationErrors.notAScalarFunctionError(nameParts.mkString("."), f)
 
                 case FunctionType.NotFound =>
-                  val catalogPath =
-                    catalogManager.currentCatalog.name +: 
catalogManager.currentNamespace
-                  val searchPath = 
SQLConf.get.resolutionSearchPath(catalogPath.toSeq)
+                  val catalogPath = {
+                    val ctx = AnalysisContext.get.catalogAndNamespace
+                    if (ctx.nonEmpty) ctx
+                    else (catalogManager.currentCatalog.name +:
+                      catalogManager.currentNamespace).toSeq
+                  }

Review Comment:
   This `ctx = AnalysisContext.get.catalogAndNamespace; if (ctx.nonEmpty) ctx 
else (currentCatalog.name +: currentNamespace).toSeq` block duplicates 
`CheckAnalysis.catalogPathForError` (line 97-101) and 
`FunctionResolution.currentCatalogPath` (line 76-80). The prior round's reply 
mentioned consolidating this into `catalogPathForError`, but 
`catalogPathForError` is `private` to the `CheckAnalysis` trait, so `Analyzer` 
(which mixes in `CheckAnalysis`) can't reuse it. Promoting it to `protected` 
would let this site call `fullSearchPathForError(catalogPathForError)` 
directly, and the `FunctionResolution` helper could share the same path source 
— finishing the consolidation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [PR] [SPARK-56605][SQL] Wire resolution engine to use SQL PATH for table, function, and variable lookup [spark]

Reply via email to