This is an automated email from the ASF dual-hosted git repository.

cloud-fan pushed a commit to branch branch-4.2
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-4.2 by this push:
     new 22f018bea049 [SPARK-55250][SQL][FOLLOWUP] Skip createNamespace for IF NOT EXISTS on existing namespace
22f018bea049 is described below

commit 22f018bea049cec7ba627fec34d0ed290199dc5d
Author: Wenchen Fan <[email protected]>
AuthorDate: Thu May 21 18:25:15 2026 +0800

    [SPARK-55250][SQL][FOLLOWUP] Skip createNamespace for IF NOT EXISTS on existing namespace
    
    ### What changes were proposed in this pull request?
    
    Follow-up to SPARK-55250. Add a recovery path to `CreateNamespaceExec.run()` so `IF NOT EXISTS` is a no-op when the namespace already exists, even if the catalog surfaces an error other than `NamespaceAlreadyExistsException`:
    
    ```scala
    try {
      val ownership = Map(PROP_OWNER -> Utils.getCurrentUserName())
      catalog.createNamespace(ns, (properties ++ ownership).asJava)
    } catch {
      case _: NamespaceAlreadyExistsException if ifNotExists =>
        logWarning(...)
      case NonFatal(e) if ifNotExists =>
        val exists = try catalog.namespaceExists(ns) catch { case NonFatal(_) => false }
        if (exists) logWarning(..., e) else throw e
    }
    ```
    
    The unconditional `createNamespace` call introduced by SPARK-55250 is preserved as the first step, so the performance win from that PR is kept on every happy path. The `namespaceExists` fallback runs only when `createNamespace` has already failed, a path that was previously an unrecoverable error.
    
    ### Why are the changes needed?
    
    SPARK-55250 changed `CREATE NAMESPACE IF NOT EXISTS foo` from "check existence first, skip if present" to "always call `createNamespace`, catch `NamespaceAlreadyExistsException`". This relies on the catalog raising `NamespaceAlreadyExistsException` rather than some other error when the namespace is pre-existing.
    
    For `SupportsNamespaces` implementations that validate the request (ACLs, properties, etc.) before checking existence, this assumption doesn't hold: the validation error surfaces first, the `NamespaceAlreadyExistsException` is never thrown, and the `IF NOT EXISTS` no-op semantic is lost.
    
    The fix asks the only question that actually matters under `IF NOT EXISTS` after a failure: "does the namespace exist now?" If yes, the statement's intent is already satisfied and the error is logged and ignored; otherwise the original error propagates.
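    
    The recovery path described above can be sketched in isolation. This is a minimal, self-contained illustration and not the Spark implementation: `ToyCatalog`, `ValidatingToyCatalog`, `AlreadyExists`, and `RecoverySketch` are hypothetical names standing in for `SupportsNamespaces`, a validate-first catalog, `NamespaceAlreadyExistsException`, and `CreateNamespaceExec.run()`.
    
    ```scala
    import scala.collection.mutable
    import scala.util.control.NonFatal

    // Hypothetical stand-ins for illustration only; not Spark's connector API.
    final class AlreadyExists(ns: String)
      extends RuntimeException(s"namespace $ns already exists")

    trait ToyCatalog {
      def createNamespace(ns: String): Unit
      def namespaceExists(ns: String): Boolean
    }

    // Validates before checking existence, so a pre-existing namespace
    // raises a generic error instead of AlreadyExists.
    final class ValidatingToyCatalog extends ToyCatalog {
      private val namespaces = mutable.Set.empty[String]

      def namespaceExists(ns: String): Boolean = namespaces.contains(ns)

      def createNamespace(ns: String): Unit = {
        if (namespaces.contains(ns)) {
          throw new RuntimeException(s"simulated validation failure on $ns")
        }
        namespaces += ns
      }
    }

    object RecoverySketch {
      // Try the create first; probe existence only after a failure.
      def createNamespace(catalog: ToyCatalog, ns: String, ifNotExists: Boolean): Unit = {
        try catalog.createNamespace(ns) catch {
          case _: AlreadyExists if ifNotExists =>
            () // the well-behaved case: nothing to do
          case NonFatal(e) if ifNotExists =>
            // A failing existence probe is treated as "does not exist",
            // so the original error is the one that propagates.
            val exists = try catalog.namespaceExists(ns) catch { case NonFatal(_) => false }
            if (!exists) throw e
        }
      }
    }
    ```
    
    With this sketch, a second `RecoverySketch.createNamespace(catalog, "foo", ifNotExists = true)` on the same catalog returns normally, while the same call with `ifNotExists = false` still raises the validation error.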
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes. `CREATE NAMESPACE IF NOT EXISTS foo` is again a no-op when `foo` already exists, regardless of which error the catalog raised on the create attempt. This matches the pre-SPARK-55250 contract.
    
    RPC accounting, in catalog calls per statement (using Unity Catalog (UC) as an example of a catalog that validates before the existence check):
    
    | Scenario | RPCs after SPARK-55250 | RPCs with this PR |
    |---|---|---|
    | no `IF NOT EXISTS` | 1 | 1 |
    | `IF NOT EXISTS`, foo absent | 1 | 1 |
    | `IF NOT EXISTS`, foo exists, create succeeds-or-throws-AlreadyExists | 1 | 1 |
    | `IF NOT EXISTS`, foo exists, create throws something else | 1 (surfaces error ❌) | 2 (recovers ✅) |
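    
    The last two `IF NOT EXISTS` rows can be reproduced with a toy catalog that counts each call as one RPC. This is only a sketch under that assumption: `CountingCatalog` and `RpcAccounting` are hypothetical names, and real catalogs may issue more than one network call per method.
    
    ```scala
    import scala.collection.mutable
    import scala.util.control.NonFatal

    // Hypothetical validate-first catalog; every method call counts as one RPC.
    final class CountingCatalog {
      private val namespaces = mutable.Set.empty[String]
      var rpcs: Int = 0

      def namespaceExists(ns: String): Boolean = {
        rpcs += 1
        namespaces.contains(ns)
      }

      def createNamespace(ns: String): Unit = {
        rpcs += 1
        if (namespaces.contains(ns)) {
          throw new RuntimeException(s"simulated validation failure on $ns")
        }
        namespaces += ns
      }
    }

    object RpcAccounting {
      // IF NOT EXISTS semantics: create first, probe existence only on failure.
      def createIfNotExists(catalog: CountingCatalog, ns: String): Unit =
        try catalog.createNamespace(ns) catch {
          case NonFatal(e) =>
            val exists = try catalog.namespaceExists(ns) catch { case NonFatal(_) => false }
            if (!exists) throw e
        }

      // Number of RPCs spent by a single IF NOT EXISTS statement.
      def rpcsFor(catalog: CountingCatalog, ns: String): Int = {
        val before = catalog.rpcs
        createIfNotExists(catalog, ns)
        catalog.rpcs - before
      }
    }
    ```
    
    On a fresh `CountingCatalog`, the first `RpcAccounting.rpcsFor(catalog, "foo")` costs one call (namespace absent), and the second costs two (failed create plus the existence probe).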
    
    ### How was this patch tested?
    
    - New `ValidatingInMemoryTableCatalog` (an `InMemoryTableCatalog` subclass that validates before checking existence, so a pre-existing namespace raises a non-`NamespaceAlreadyExistsException`) registered as `validating_test_catalog` in `v2.CreateNamespaceSuite`.
    - New SQL-level regression test in `v2.CreateNamespaceSuite` that creates a namespace and re-runs `CREATE NAMESPACE IF NOT EXISTS` against that catalog; it fails on master and passes with this PR.
    - Existing `org.apache.spark.sql.hive.execution.command.CreateNamespaceSuite` "hive client calls" test still asserts exactly 1 RPC for each of the three `CREATE NAMESPACE` shapes it covers, confirming SPARK-55250's performance win is preserved on the happy path.
    - Existing v1 / v2 `CreateNamespaceSuite` pass.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Generated-by: Claude Code (Opus 4.7)
    
    Closes #56027 from cloud-fan/wenchen/SPARK-55250-followup.
    
    Authored-by: Wenchen Fan <[email protected]>
    Signed-off-by: Wenchen Fan <[email protected]>
    (cherry picked from commit 4131b00ca51b34b0505c498edc9349d5fc6c13c7)
    Signed-off-by: Wenchen Fan <[email protected]>
---
 .../catalog/ValidatingInMemoryTableCatalog.scala}   | 21 ++++++++++++++++-----
 .../datasources/v2/CreateNamespaceExec.scala        | 14 ++++++++++++++
 .../execution/command/v2/CreateNamespaceSuite.scala | 21 +++++++++++++++++++++
 3 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/CreateNamespaceSuite.scala b/sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/ValidatingInMemoryTableCatalog.scala
similarity index 50%
copy from sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/CreateNamespaceSuite.scala
copy to sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/ValidatingInMemoryTableCatalog.scala
index 6b5475a1e267..820f51a2af45 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/CreateNamespaceSuite.scala
+++ b/sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/ValidatingInMemoryTableCatalog.scala
@@ -15,13 +15,24 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.execution.command.v2
+package org.apache.spark.sql.connector.catalog
 
-import org.apache.spark.sql.execution.command
+import java.util
 
 /**
- * The class contains tests for the `CREATE NAMESPACE` command to check V2 table catalogs.
+ * A test catalog whose `createNamespace` validates the request before checking existence, so a
+ * pre-existing namespace surfaces a non-`NamespaceAlreadyExistsException` error. Mirrors the
+ * authorize-then-execute ordering of catalogs like Unity Catalog and is used to exercise the
+ * `IF NOT EXISTS` recovery path in `CreateNamespaceExec`.
  */
-class CreateNamespaceSuite extends command.CreateNamespaceSuiteBase with CommandSuiteBase {
-  override def namespace: String = "ns1.ns2"
+class ValidatingInMemoryTableCatalog extends InMemoryTableCatalog {
+  override def createNamespace(
+      namespace: Array[String],
+      metadata: util.Map[String, String]): Unit = {
+    if (namespaceExists(namespace)) {
+      throw new RuntimeException(
+        s"simulated validation failure on pre-existing namespace ${namespace.mkString(".")}")
+    }
+    super.createNamespace(namespace, metadata)
+  }
 }
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateNamespaceExec.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateNamespaceExec.scala
index 02197a76aa1b..95edbba62dcb 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateNamespaceExec.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/CreateNamespaceExec.scala
@@ -18,6 +18,7 @@
 package org.apache.spark.sql.execution.datasources.v2
 
 import scala.jdk.CollectionConverters.MapHasAsJava
+import scala.util.control.NonFatal
 
 import org.apache.spark.internal.LogKeys.NAMESPACE
 import org.apache.spark.sql.catalyst.InternalRow
@@ -46,6 +47,19 @@ case class CreateNamespaceExec(
       case _: NamespaceAlreadyExistsException if ifNotExists =>
         logWarning(log"Namespace ${MDC(NAMESPACE, namespace.quoted)} was created concurrently. " +
           log"Ignoring.")
+      case NonFatal(e) if ifNotExists =>
+        // Some catalogs validate the request (e.g. ACLs, properties) before checking existence,
+        // so creating a pre-existing namespace can surface errors unrelated to the "already
+        // exists" condition the caller intends to ignore under IF NOT EXISTS. If the namespace
+        // really does exist, treat the operation as a no-op; otherwise propagate the original
+        // error.
+        val exists = try catalog.namespaceExists(ns) catch { case NonFatal(_) => false }
+        if (exists) {
+          logWarning(log"Namespace ${MDC(NAMESPACE, namespace.quoted)} already exists; " +
+            log"swallowing underlying error under IF NOT EXISTS.", e)
+        } else {
+          throw e
+        }
     }
 
     Seq.empty
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/CreateNamespaceSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/CreateNamespaceSuite.scala
index 6b5475a1e267..973676fe1f63 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/CreateNamespaceSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/CreateNamespaceSuite.scala
@@ -17,6 +17,8 @@
 
 package org.apache.spark.sql.execution.command.v2
 
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.connector.catalog.ValidatingInMemoryTableCatalog
 import org.apache.spark.sql.execution.command
 
 /**
@@ -24,4 +26,23 @@ import org.apache.spark.sql.execution.command
  */
 class CreateNamespaceSuite extends command.CreateNamespaceSuiteBase with CommandSuiteBase {
   override def namespace: String = "ns1.ns2"
+
+  // A test catalog whose createNamespace validates before checking existence; used to
+  // exercise CreateNamespaceExec's IF NOT EXISTS recovery path.
+  private val validatingCatalog: String = "validating_test_catalog"
+
+  override def sparkConf: SparkConf = super.sparkConf
+    .set(s"spark.sql.catalog.$validatingCatalog",
+      classOf[ValidatingInMemoryTableCatalog].getName)
+
+  test("SPARK-55250: IF NOT EXISTS is a no-op on pre-existing namespace even when the " +
+    "catalog raises a non-NamespaceAlreadyExistsException error") {
+    val ns = s"$validatingCatalog.$namespace"
+    withNamespace(ns) {
+      sql(s"CREATE NAMESPACE $ns")
+      // Without the IF NOT EXISTS recovery path, this would surface the catalog's
+      // pre-existence validation error.
+      sql(s"CREATE NAMESPACE IF NOT EXISTS $ns")
+    }
+  }
 }

