srielau commented on code in PR #55977:
URL: https://github.com/apache/spark/pull/55977#discussion_r3267793373


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala:
##########
@@ -265,6 +275,42 @@ class CatalogManager(
       currentCatalog, currentNamespace,
       currentCatalog, currentNamespace)
 
+  /**
+   * Snapshot the live PATH-derived [[SessionCatalog.SessionFunctionKind]] 
order used by
+   * unqualified function/table-function resolution.
+   *
+   * The kinds list is computed by reading the current catalog, current 
namespace, and the

Review Comment:
   Good catch. Reworked the docstring (commit ecf913ab) to spell out the 
contract: the kinds extraction in `systemFunctionKindsFromPath` only retains 
literal `system.builtin` / `system.session` entries, so even if `v1CurrentDb` 
lags by one `USE SCHEMA`, a stale `CURRENT_SCHEMA` expansion can change the 
path **content** but never the **kinds list** that this helper returns. Added 
that justification + a forward pointer to the deadlock cycle so future readers 
don't lift the v1 read inside the lock.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala:
##########
@@ -330,15 +376,32 @@ class CatalogManager(
     
catalog(_currentCatalogName.getOrElse(conf.getConf(SQLConf.DEFAULT_CATALOG)))
   }
 
-  def setCurrentCatalog(catalogName: String): Unit = synchronized {
-    // `setCurrentCatalog` is noop if it doesn't switch to a different catalog.
-    if (currentCatalog.name() != catalogName) {
-      catalog(catalogName)
-      _currentCatalogName = Some(catalogName)
-      _currentNamespace = None
+  def setCurrentCatalog(catalogName: String): Unit = {
+    // SPARK-56939: see [[setCurrentNamespace]]. Avoid nesting 
[[CatalogManager]]'s lock
+    // across [[v1SessionCatalog.setCurrentDatabase]] (which synchronizes on
+    // [[SessionCatalog]]) to prevent a lock-order inversion with concurrent 
unqualified
+    // function resolution.
+    val needsSwitch = synchronized {
+      // `setCurrentCatalog` is noop if it doesn't switch to a different 
catalog.
+      if (currentCatalog.name() != catalogName) {
+        // Force-load the named catalog while holding the manager lock to keep 
the
+        // not-found error semantics; if loading fails, throw before mutating 
state.
+        catalog(catalogName)
+        true
+      } else {
+        false
+      }
+    }
+    if (needsSwitch) {
       // Reset the current database of v1 `SessionCatalog` when switching 
current catalog, so that
       // when we switch back to session catalog, the current namespace 
definitely is ["default"].
+      // Run this BEFORE publishing the new catalog name so that if a reader 
observes the new

Review Comment:
   Fair point -- the prior comment only told the half-story that the new 
ordering forbids. Expanded it (ecf913ab) to also acknowledge the symmetric 
window: between `v1.setCurrentDatabase(default)` and the publish, a concurrent 
reader of `currentNamespace` sees `(oldCatalog, v1.currentDb = default)`, which 
when the old catalog was the session catalog briefly hides the user's previous 
namespace. Framed both halves as a concurrency trade-off against the deadlock 
alternative so a future maintainer can't read the comment as "strictly better." 
Did the same for `setCurrentNamespace`'s drifted `isSession` snapshot.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to