cloud-fan commented on code in PR #55523:
URL: https://github.com/apache/spark/pull/55523#discussion_r3143783994
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala:
##########
@@ -68,15 +68,39 @@ case class UnresolvedView(
allowTemp: Boolean,
suggestAlternative: Boolean = false) extends UnresolvedLeafNode
+/**
+ * Controls which search path is shown in `TABLE_OR_VIEW_NOT_FOUND` for
+ * [[UnresolvedTableOrView]] (see
[[org.apache.spark.sql.catalyst.analysis.CheckAnalysis]]).
+ */
+sealed trait UnresolvedTableOrViewSearchPathMode
+
+object UnresolvedTableOrViewSearchPathMode {
+ /** DDL on catalog objects: `system.session` and current catalog namespace
only. */
+ case object Ddl extends UnresolvedTableOrViewSearchPathMode
+ /**
+ * Like `SELECT` / DML: full `sqlResolutionPathEntries` order; fully
qualified
+ * `system.session.*` names still use the temp-view-only path in errors.
+ */
+ case object QueryLike extends UnresolvedTableOrViewSearchPathMode
+ /** Commands that only target temp views (e.g. some `DROP TEMPORARY VIEW`
paths). */
+ case object TempViewOnly extends UnresolvedTableOrViewSearchPathMode
Review Comment:
`TempViewOnly` has no callers — `rg
'UnresolvedTableOrViewSearchPathMode\.TempViewOnly'` finds only this definition
and the pattern match in `CheckAnalysis.scala:398`. The previous
`commandName.toUpperCase.contains("TEMPORARY VIEW")` branch was already dead
(no `UnresolvedTableOrView` is constructed with a commandName containing those
words anywhere in the repo), and this enum case faithfully reproduces that dead
branch. The prior round wired up `QueryLike` from `visitDescribeRelation`;
`TempViewOnly` is the remaining unused case — either wire it from a real DDL
path or drop the case until a use shows up.
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -139,17 +139,22 @@ object FakeV2SessionCatalog extends TableCatalog with
FunctionCatalog with Suppo
* even if a temp view `t` has been created.
* @param outerPlan The query plan from the outer query that can be used to
resolve star
* expressions in a subquery.
+ * @param resolutionPathEntries When resolving a view body, the ordered path
for unqualified
+ * relation names (see
[[AnalysisContext.withAnalysisContext]]).
+ * [[None]] outside views: compute from session
+ * [[CatalogManager.sqlResolutionPathEntries]].
*/
case class AnalysisContext(
isDefault: Boolean = false,
catalogAndNamespace: Seq[String] = Nil,
+ resolutionPathEntries: Option[Seq[Seq[String]]] = None,
nestedViewDepth: Int = 0,
maxNestedViewDepth: Int = -1,
relationCache: mutable.Map[(Seq[String], Option[TimeTravelSpec]),
LogicalPlan] =
mutable.Map.empty,
referredTempViewNames: Seq[Seq[String]] = Seq.empty,
// 1. If we are resolving a view, this field will be restored from the
view metadata,
- // by calling `AnalysisContext.withAnalysisContext(viewDesc)`.
+ // by calling `AnalysisContext.withAnalysisContext(viewDesc,
catalogManager)`.
Review Comment:
This comment was amended in this PR to `withAnalysisContext(viewDesc,
catalogManager)`, but the actual signature at line 203 takes only `(viewDesc:
CatalogTable)` — there is no `catalogManager` parameter and the implementation
was not changed to add one.
```suggestion
// by calling `AnalysisContext.withAnalysisContext(viewDesc)`.
```
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionResolution.scala:
##########
@@ -592,6 +634,58 @@ class FunctionResolution(
errorClass = errorClass,
messageParameters = messageParameters)
}
+
+ /**
+ * Resolves [[UnresolvedProcedure]] for `CALL` / `DESCRIBE PROCEDURE` using
the same multipart
+ * candidates as SQL functions and relations ([[resolutionCandidates]] /
+ * [[sqlResolutionPathEntriesForAnalysis]]). Catalogs that do not implement
+ * [[ProcedureCatalog]] are skipped for unqualified names; an explicitly
catalog-qualified name
+ * that targets a non-[[ProcedureCatalog]] still raises
+ * [[QueryCompilationErrors.missingCatalogProceduresAbilityError]].
+ */
+ def resolveProcedure(unresolved: UnresolvedProcedure): LogicalPlan = {
+ val candidates = resolutionCandidates(unresolved.nameParts)
+ for (multipart <- candidates) {
+ val expandedOpt =
+ try {
+ Some(relationResolution.expandIdentifier(multipart))
+ } catch {
+ case NonFatal(_) => None
+ }
+ expandedOpt.foreach { expanded =>
+ CatalogAndIdentifier.unapply(expanded).foreach { case (catalog, ident)
=>
+ catalog match {
+ case pc: ProcedureCatalog =>
+ try {
+ val procedure = pc.loadProcedure(ident)
+ return ResolvedProcedure(pc, ident, procedure)
+ } catch {
+ case e: AnalysisException => throw e
+ case e: SparkThrowable => throw e
+ case NonFatal(e) =>
+ val cause = e match {
+ case ex: Exception => ex
+ case th => new RuntimeException(th)
+ }
+ throw QueryCompilationErrors.failedToLoadRoutineError(
+ catalog.name +: ident.asMultipartIdentifier,
+ cause)
Review Comment:
PATH iteration here is inconsistent with how tables and functions handle a
missing candidate, and the procedure case is the broken one:
- Tables: `RelationResolution.tryResolvePersistent` calls
`CatalogV2Util.loadTable`, which catches `NoSuchTableException` /
`NoSuchDatabaseException` and returns `None` (`CatalogV2Util.scala:457-462`) —
the next `relationResolutionStep` is tried.
- Functions: `resolveFunctionCandidate` catches `NoSuchFunctionException` /
`NoSuchNamespaceException` / `CatalogNotFoundException` and returns `None`
(`FunctionResolution.scala:161-170`) — `resolveFunction`'s for-loop continues.
- Procedures (this block): every catch arm — `AnalysisException`,
`SparkThrowable`, `NonFatal` — either rethrows or wraps in
`failedToLoadRoutineError`, exiting the candidate loop on the first failure.
Connector implementations typically signal not-found by throwing
(`InMemoryTableCatalog.loadProcedure` throws plain `RuntimeException("Procedure
not found: " + ident)`), so for `SET PATH = catA, catB` with only `catB.proc`
defined, the user sees `failedToLoadRoutineError` on `catA.proc` instead of
resolving to `catB.proc`. To match table/function semantics, the unqualified
case should treat a candidate failure as "skip and continue" and only fall
through to `unresolvedRoutineError` after exhausting the path. The sub-question
is *what* signals not-found — `ProcedureCatalog.loadProcedure` has no
documented exception type for it — so this likely needs either a defined
`NoSuchProcedureException` (analogous to `NoSuchFunctionException`) or
swallowing arbitrary candidate failures for unqualified names. A SetPathSuite
test for the [catA(missing), catB(present)] case would lock down whichever
behavior is chosen.
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/RelationResolution.scala:
##########
@@ -110,34 +114,61 @@ class RelationResolution(
/**
* Scope in the relation resolution search path. Used to interpret
- * [[SQLConf.resolutionSearchPath]] when resolving unqualified table/view
names.
+ * [[CatalogManager.sqlResolutionPathEntries]] when resolving unqualified
table/view names.
*/
- private sealed trait RelationResolutionScope
- private case object SessionScope extends RelationResolutionScope
- private case object PersistentScope extends RelationResolutionScope
+ private sealed trait RelationResolutionStep
+ private case object SessionScopeStep extends RelationResolutionStep
+ private case class PersistentCatalogStep(catalogAndNamespace: Seq[String])
+ extends RelationResolutionStep
+
+ /**
+ * Path entries for unqualified relation resolution.
+ *
+ * Inside a view, [[AnalysisContext.resolutionPathEntries]] will be
+ * populated from the frozen path stored in view metadata (follow-up PR).
+ * When PATH is disabled, legacy resolution rules apply.
+ */
+ private def relationResolutionEntries: Seq[Seq[String]] = {
+ val pinned = AnalysisContext.get.resolutionPathEntries
+ if (pinned.isDefined && conf.pathEnabled) {
+ pinned.get
+ } else {
+ val expandCatalog = catalogManager.currentCatalog.name
+ val expandNamespace = catalogManager.currentNamespace.toSeq
+ val (pathCatalog, pathNamespace) =
+ if (isResolvingView) {
+ val p = AnalysisContext.get.catalogAndNamespace
+ (p.head, p.tail.toSeq)
+ } else {
+ (expandCatalog, expandNamespace)
+ }
+ catalogManager.sqlResolutionPathEntries(
+ pathCatalog,
+ pathNamespace,
+ expandCatalog,
+ expandNamespace)
Review Comment:
`expandCatalog` / `expandNamespace` are always read from
`catalogManager.currentCatalog.name` / `currentNamespace`, even inside a view
body. So with PATH enabled + a stored session path, `CurrentSchemaEntry`
markers expand using the *session's* current schema instead of the view's
defining catalog/namespace. This is the mid-state called out in the PR
description ("frozen path … in a follow-up PR"), but it is genuinely
user-visible: between this PR and the follow-up, view bodies under PATH will
resolve unqualified names against the wrong schema whenever the path includes
`CurrentSchemaEntry` and the session has switched schema relative to the view
definition. Worth either bypassing PATH inside views until frozen-path is
plumbed (e.g., `if (isResolvingView) defaultPathOrder(...)`) or at least making
the asymmetry explicit in the PR description so reviewers can weigh the
trade-off.
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -139,17 +139,22 @@ object FakeV2SessionCatalog extends TableCatalog with
FunctionCatalog with Suppo
* even if a temp view `t` has been created.
* @param outerPlan The query plan from the outer query that can be used to
resolve star
* expressions in a subquery.
+ * @param resolutionPathEntries When resolving a view body, the ordered path
for unqualified
+ * relation names (see
[[AnalysisContext.withAnalysisContext]]).
+ * [[None]] outside views: compute from session
+ * [[CatalogManager.sqlResolutionPathEntries]].
Review Comment:
The `(see [[AnalysisContext.withAnalysisContext]])` reference is misleading:
`withAnalysisContext` (lines 203 and 224) does not populate
`resolutionPathEntries` anywhere in this PR — the field defaults to `None` and
is only ever read. The PR description explicitly says "Frozen path analysis for
views/SQL functions … comes in a follow-up PR." A reader who follows the link
will not find the population code the see-reference implies. Either drop the
link until the follow-up adds the wiring, or reword to say the population is
forthcoming.
```suggestion
* @param resolutionPathEntries When resolving a view body, the ordered path
for unqualified
* relation names. Stays [[None]] in this PR;
population from the
* frozen path stored in view metadata is wired
in a follow-up.
* Outside views: compute from session
* [[CatalogManager.sqlResolutionPathEntries]].
```
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -2078,9 +2101,14 @@ class Analyzer(
throw
QueryCompilationErrors.notAScalarFunctionError(nameParts.mkString("."), f)
case FunctionType.NotFound =>
- val catalogPath =
- catalogManager.currentCatalog.name +:
catalogManager.currentNamespace
- val searchPath =
SQLConf.get.resolutionSearchPath(catalogPath.toSeq)
+ val catalogPath = {
+ val ctx = AnalysisContext.get.catalogAndNamespace
+ if (ctx.nonEmpty) ctx
+ else (catalogManager.currentCatalog.name +:
+ catalogManager.currentNamespace).toSeq
+ }
Review Comment:
This `ctx = AnalysisContext.get.catalogAndNamespace; if (ctx.nonEmpty) ctx
else (currentCatalog.name +: currentNamespace).toSeq` block duplicates
`CheckAnalysis.catalogPathForError` (line 97-101) and
`FunctionResolution.currentCatalogPath` (line 76-80). The prior round's reply
mentioned consolidating this into `catalogPathForError`, but
`catalogPathForError` is `private` to the `CheckAnalysis` trait, so `Analyzer`
(which mixes in `CheckAnalysis`) can't reuse it. Promoting it to `protected`
would let this site call `fullSearchPathForError(catalogPathForError)`
directly, and the `FunctionResolution` helper could share the same path source
— finishing the consolidation.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]