rdblue commented on a change in pull request #24560: [SPARK-27661][SQL] Add SupportsNamespaces API
URL: https://github.com/apache/spark/pull/24560#discussion_r308819205
 
 

 ##########
 File path: sql/catalyst/src/main/java/org/apache/spark/sql/catalog/v2/SupportsNamespaces.java
 ##########
 @@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalog.v2;
+
+import org.apache.spark.sql.catalyst.analysis.NamespaceAlreadyExistsException;
+import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException;
+
+import java.util.Map;
+
+/**
+ * Catalog methods for working with namespaces.
+ * <p>
+ * If an object such as a table, view, or function exists, its parent namespaces must also exist
+ * and must be returned by the discovery methods {@link #listNamespaces()} and
+ * {@link #listNamespaces(String[])}.
+ * <p>
+ * Catalog implementations are not required to maintain the existence of namespaces independent of
+ * objects in a namespace. For example, a function catalog that loads functions using reflection
+ * and uses Java packages as namespaces is not required to support the methods to create, alter, or
+ * drop a namespace. Implementations are allowed to discover the existence of objects or namespaces
+ * without throwing {@link NoSuchNamespaceException} when no namespace is found.
+ */
+public interface SupportsNamespaces extends CatalogPlugin {
+
+  /**
+   * Return a default namespace for the catalog.
+   * <p>
+   * When this catalog is set as the current catalog, the namespace returned by this method will be
+   * set as the current namespace.
+   * <p>
+   * The namespace returned by this method is not required to exist.
+   *
+   * @return a multi-part namespace
+   */
+  default String[] defaultNamespace() {
+    return new String[0];
+  }
+
+  /**
+   * List top-level namespaces from the catalog.
+   * <p>
+   * If an object such as a table, view, or function exists, its parent namespaces must also exist
+   * and must be returned by this discovery method. For example, if table a.b.t exists, this method
+   * must return ["a"] in the result array.
+   *
+   * @return an array of multi-part namespace names
+   */
+  String[][] listNamespaces() throws NoSuchNamespaceException;
+
+  /**
+   * List namespaces in a namespace.
+   * <p>
+   * If an object such as a table, view, or function exists, its parent namespaces must also exist
+   * and must be returned by this discovery method. For example, if table a.b.t exists, this method
+   * invoked as listNamespaces(["a"]) must return ["a", "b"] in the result array.
+   *
+   * @param namespace a multi-part namespace
+   * @return an array of multi-part namespace names
+   * @throws NoSuchNamespaceException If the namespace does not exist (optional)
+   */
+  String[][] listNamespaces(String[] namespace) throws NoSuchNamespaceException;
+
+  /**
+   * Test whether a namespace exists.
+   * <p>
+   * If an object such as a table, view, or function exists, its parent namespaces must also exist.
+   * For example, if table a.b.t exists, this method invoked as namespaceExists(["a"]) or
+   * namespaceExists(["a", "b"]) must return true.
+   *
+   * @param namespace a multi-part namespace
+   * @return true if the namespace exists, false otherwise
+   */
+  default boolean namespaceExists(String[] namespace) {
+    try {
+      loadNamespaceMetadata(namespace);
+      return true;
+    } catch (NoSuchNamespaceException e) {
+      return false;
+    }
+  }
+
+  /**
+   * Load metadata properties for a namespace.
+   *
+   * @param namespace a multi-part namespace
+   * @return a string map of properties for the given namespace
+   * @throws NoSuchNamespaceException If the namespace does not exist (optional)
+   * @throws UnsupportedOperationException If namespace properties are not supported
+   */
+  Map<String, String> loadNamespaceMetadata(String[] namespace) throws NoSuchNamespaceException;
+
+  /**
+   * Create a namespace in the catalog.
+   *
+   * @param namespace a multi-part namespace
+   * @param metadata a string map of properties for the given namespace
+   * @throws NamespaceAlreadyExistsException If the namespace already exists
+   * @throws UnsupportedOperationException If create is not a supported operation
+   */
+  void createNamespaceMetadata(
 
 Review comment:
   @cloud-fan, some sources like JDBC and Hive will throw an exception when the 
namespace doesn't exist. The problem is that we don't want to require this for 
sources where it doesn't make sense, like a catalog that stores tables in an 
object store. Object stores don't necessarily have the concept of a directory 
that exists independently of the objects in it (S3 file system wrappers rely on 
conventions like `_$folder$` objects).
   
   That's why failing to create a table if the namespace doesn't exist is 
optional, and creating namespaces implicitly is allowed. This isn't something 
we can (or should) change.
   
   So the question is how we handle this in this API. I think it makes sense 
to allow namespace creation that is independent of tables, by creating 
namespace metadata.
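   
   To make the distinction concrete, here is a minimal sketch of a catalog where namespaces exist implicitly because a table lives under them, and can also be created explicitly through metadata. It is not code from this PR and does not use Spark's interfaces: `ImplicitNamespaceCatalog`, `createTable`, and the in-memory collections are hypothetical stand-ins, and only the method names `namespaceExists`, `listNamespaces`, and `createNamespaceMetadata` are borrowed from the diff above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory catalog, for illustration only (not Spark's API).
class ImplicitNamespaceCatalog {
  // Full table identifiers, e.g. ["a", "b", "t"].
  private final Set<List<String>> tables = new HashSet<>();
  // Namespaces created explicitly, with their properties.
  private final Map<List<String>, Map<String, String>> metadata = new HashMap<>();

  // Creating a table never fails because of a missing namespace:
  // the parent namespaces come into existence implicitly.
  void createTable(String... identifier) {
    if (identifier.length == 0) {
      throw new IllegalArgumentException("Table identifier must not be empty");
    }
    tables.add(Arrays.asList(identifier));
  }

  // Explicit creation, independent of any table.
  void createNamespaceMetadata(String[] namespace, Map<String, String> properties) {
    List<String> key = Arrays.asList(namespace);
    if (metadata.containsKey(key)) {
      throw new IllegalStateException("Namespace already exists: " + key);
    }
    metadata.put(key, new HashMap<>(properties));
  }

  // A namespace exists if it is a prefix of any known namespace.
  boolean namespaceExists(String... namespace) {
    List<String> prefix = Arrays.asList(namespace);
    for (List<String> known : knownNamespaces()) {
      if (startsWith(known, prefix)) {
        return true;
      }
    }
    return false;
  }

  // Direct children of the given namespace, as full multi-part names.
  List<List<String>> listNamespaces(String... parent) {
    List<String> prefix = Arrays.asList(parent);
    Set<String> children = new TreeSet<>();
    for (List<String> known : knownNamespaces()) {
      if (known.size() > prefix.size() && startsWith(known, prefix)) {
        children.add(known.get(prefix.size()));
      }
    }
    List<List<String>> result = new ArrayList<>();
    for (String child : children) {
      List<String> full = new ArrayList<>(prefix);
      full.add(child);
      result.add(full);
    }
    return result;
  }

  // Explicit namespaces plus the parent namespace of every table.
  private List<List<String>> knownNamespaces() {
    List<List<String>> known = new ArrayList<>(metadata.keySet());
    for (List<String> table : tables) {
      known.add(table.subList(0, table.size() - 1));
    }
    return known;
  }

  private static boolean startsWith(List<String> path, List<String> prefix) {
    return path.size() >= prefix.size()
        && path.subList(0, prefix.size()).equals(prefix);
  }
}
```

   In this sketch, creating table `a.b.t` is enough for `namespaceExists("a")` and `namespaceExists("a", "b")` to hold, which is the object-store behavior. A JDBC- or Hive-like source could instead reject `createTable` when the parent namespace has no metadata entry; both behaviors fit the contract described in the Javadoc.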

