[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-21 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1028346195


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java:
##
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a persisted view.
+ */
+@DeveloperApi
+public interface View {
+  /**
+   * A name to identify this view.
+   */
+  String name();
+
+  /**
+   * The view query SQL text.
+   */
+  String query();
+
+  /**
+   * The current catalog when the view is created.
+   */
+  String currentCatalog();
+
+  /**
+   * The current namespace when the view is created.
+   */
+  String[] currentNamespace();
+
+  /**
+   * The schema for the view when the view is created after applying column aliases.
+   */
+  StructType schema();
+
+  /**
+   * The output column names of the query that creates this view.
+   */
+  String[] queryColumnNames();
+
+  /**
+   * The view column aliases.
+   */
+  String[] columnAliases();
+
+  /**
+   * The view column comments.
+   */
+  String[] columnComments();
+
+  /**
+   * The view properties.

Review Comment:
   Thanks for the feedback!
   
   The goal is to capture the entire CREATE VIEW syntax in the API and leave it to
   catalog plugin developers to decide whether to use properties to implement it.
   I'd expect an HMS-backed view catalog implementation to keep using properties,
   as the current v1 implementation does. On the other hand, view catalog plugins
   that support more free-form storage formats, such as JSON, can choose a
   different approach.
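
To illustrate the two storage choices mentioned above, here is a minimal sketch. `SimpleView` is a hypothetical, trimmed-down stand-in for the `View` interface under review, and the `view.*` property keys are invented for the example; neither is taken from the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, trimmed-down stand-in for the View interface under review.
interface SimpleView {
  String query();
  String currentCatalog();
  String[] currentNamespace();
}

// Properties style: every attribute is encoded in a flat string map.
// The "view.*" keys are invented for this sketch.
final class PropertyBackedView implements SimpleView {
  private final Map<String, String> props;

  PropertyBackedView(Map<String, String> props) {
    this.props = new HashMap<>(props);
  }

  @Override
  public String query() {
    return props.get("view.query");
  }

  @Override
  public String currentCatalog() {
    return props.get("view.current-catalog");
  }

  @Override
  public String[] currentNamespace() {
    // Naive encoding: parts are joined with '.', so parts must not contain dots.
    String joined = props.getOrDefault("view.current-namespace", "");
    return joined.isEmpty() ? new String[0] : joined.split("\\.");
  }
}

// Free-form style: attributes are typed fields, e.g. filled in from a JSON document.
final class FieldBackedView implements SimpleView {
  private final String query;
  private final String currentCatalog;
  private final String[] currentNamespace;

  FieldBackedView(String query, String currentCatalog, String[] currentNamespace) {
    this.query = query;
    this.currentCatalog = currentCatalog;
    this.currentNamespace = currentNamespace.clone();
  }

  @Override
  public String query() {
    return query;
  }

  @Override
  public String currentCatalog() {
    return currentCatalog;
  }

  @Override
  public String[] currentNamespace() {
    return currentNamespace.clone();
  }
}
```

Callers only see the accessor methods, so either implementation can sit behind the same interface.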



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-05 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014686160


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java:
##
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException;
+import org.apache.spark.sql.catalyst.analysis.NoSuchViewException;
+import org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * Catalog methods for working with views.
+ */
+@DeveloperApi
+public interface ViewCatalog extends CatalogPlugin {

Review Comment:
   Indeed, they are similar to the table properties. I had hoped to expose them
   through better APIs than string constants, but we can follow up on that
   (possibly in the Builder), so I will add them back.
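
As a rough sketch of the trade-off between string constants and a more structured API: the class below is hypothetical, and the constant names `PROP_COMMENT` and `PROP_OWNER` are chosen by analogy with table properties rather than copied from the PR.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical wrapper around view properties, for illustration only.
final class ViewProps {
  // Assumed reserved keys, named by analogy with table properties.
  static final String PROP_COMMENT = "comment";
  static final String PROP_OWNER = "owner";

  private final Map<String, String> properties;

  ViewProps(Map<String, String> properties) {
    this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
  }

  // String-constant style: the caller must know which key to pass.
  String raw(String key) {
    return properties.get(key);
  }

  // Accessor style: the key is hidden and absence is explicit.
  Optional<String> comment() {
    return Optional.ofNullable(properties.get(PROP_COMMENT));
  }

  Optional<String> owner() {
    return Optional.ofNullable(properties.get(PROP_OWNER));
  }
}
```

Both styles read the same underlying map, so typed accessors can be layered on later without changing what is stored.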






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-05 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014686160


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java:
##
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException;
+import org.apache.spark.sql.catalyst.analysis.NoSuchViewException;
+import org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * Catalog methods for working with views.
+ */
+@DeveloperApi
+public interface ViewCatalog extends CatalogPlugin {

Review Comment:
   They are similar to the table properties. I had hoped to expose them through
   better APIs than string constants, but we can follow up on that, possibly in
   the Builder. So I will add them.






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-04 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014549689


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java:
##
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException;
+import org.apache.spark.sql.catalyst.analysis.NoSuchViewException;
+import org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * Catalog methods for working with views.
+ */
+@DeveloperApi
+public interface ViewCatalog extends CatalogPlugin {

Review Comment:
   I left them out because they felt more like "implementation" than "API". Do
   they belong in ViewCatalog?






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-04 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014532124


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java:
##
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException;
+import org.apache.spark.sql.catalyst.analysis.NoSuchViewException;
+import org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * Catalog methods for working with views.
+ */
+@DeveloperApi
+public interface ViewCatalog extends CatalogPlugin {
+
+  /**
+   * List the views in a namespace from the catalog.
+   * 
+   * If the catalog supports tables, this must return identifiers for only views and not tables.
+   *
+   * @param namespace a multi-part namespace
+   * @return an array of Identifiers for views
+   * @throws NoSuchNamespaceException If the namespace does not exist (optional).
+   */
+  Identifier[] listViews(String... namespace) throws NoSuchNamespaceException;
+
+  /**
+   * Load view metadata by {@link Identifier ident} from the catalog.
+   * 
+   * If the catalog supports tables and contains a table for the identifier and not a view,
+   * this must throw {@link NoSuchViewException}.
+   *
+   * @param ident a view identifier
+   * @return the view description
+   * @throws NoSuchViewException If the view doesn't exist or is a table
+   */
+  View loadView(Identifier ident) throws NoSuchViewException;
+
+  /**
+   * Invalidate cached view metadata for an {@link Identifier identifier}.
+   * 
+   * If the view is already loaded or cached, drop cached data. If the view does not exist or is
+   * not cached, do nothing. Calling this method should not query remote services.
+   *
+   * @param ident a view identifier
+   */
+  default void invalidateView(Identifier ident) {
+  }
+
+  /**
+   * Test whether a view exists using an {@link Identifier identifier} from the catalog.
+   * 
+   * If the catalog supports views and contains a view for the identifier and not a table,
+   * this must return false.
+   *
+   * @param ident a view identifier
+   * @return true if the view exists, false otherwise
+   */
+  default boolean viewExists(Identifier ident) {
+    try {
+      return loadView(ident) != null;
+    } catch (NoSuchViewException e) {
+      return false;
+    }
+  }
+
+  /**
+   * Create a view in the catalog.
+   *
+   * @param ident a view identifier
+   * @param sql the SQL text that defines the view
+   * @param currentCatalog the current catalog
+   * @param currentNamespace the current namespace
+   * @param schema the view query output schema
+   * @param columnAliases the column aliases
+   * @param columnComments the column comments
+   * @param properties the view properties
+   * @return the view created
+   * @throws ViewAlreadyExistsException If a view or table already exists for the identifier
+   * @throws NoSuchNamespaceException If the identifier namespace does not exist (optional)
+   */
+  View createView(
+      Identifier ident,
+      String sql,
+      String currentCatalog,
+      String[] currentNamespace,
+      StructType schema,
+      String[] columnAliases,

Review Comment:
   How about a builder?
   ```
   public interface ViewBuilder {
     ViewBuilder withQuery(String query);
     ViewBuilder withCurrentCatalog(String defaultCatalog);
     ViewBuilder withCurrentNamespace(String[] defaultNamespaces);
     ViewBuilder withSchema(StructType schema);
     ViewBuilder withQueryColumnNames(String[] queryColumnNames);
     ViewBuilder withColumnAliases(String[] columnAliases);
     ViewBuilder withColumnComments(String[] columnComments);
     ViewBuilder withProperties(Map<String, String> properties);
     ViewBuilder withProperty(String key, String value);
     View create();
     View replace();
     View createOrReplace();
   }

   public interface ViewCatalog extends CatalogPlugin {
     // Entry point: returns a builder for the view identified by `ident`.
     ViewBuilder buildView(Identifier ident);
   }
   ```
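
A minimal in-memory sketch of how such a builder could behave is shown below. `InMemoryViews` and its nested `Builder` are hypothetical; identifiers are plain strings, only the query text is stored, and the exception types are placeholders for the catalog's own exceptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical in-memory catalog: maps a view name to its query text.
final class InMemoryViews {
  private final Map<String, String> views = new HashMap<>();

  Builder buildView(String ident) {
    return new Builder(ident);
  }

  String queryOf(String ident) {
    return views.get(ident);
  }

  final class Builder {
    private final String ident;
    private String query;

    Builder(String ident) {
      this.ident = Objects.requireNonNull(ident, "ident");
    }

    Builder withQuery(String query) {
      this.query = Objects.requireNonNull(query, "query");
      return this;
    }

    // Fails if the view already exists.
    void create() {
      Objects.requireNonNull(query, "query must be set");
      if (views.containsKey(ident)) {
        throw new IllegalStateException("View already exists: " + ident);
      }
      views.put(ident, query);
    }

    // Fails if the view does not exist yet.
    void replace() {
      Objects.requireNonNull(query, "query must be set");
      if (!views.containsKey(ident)) {
        throw new IllegalStateException("No such view: " + ident);
      }
      views.put(ident, query);
    }

    // Succeeds in both cases.
    void createOrReplace() {
      Objects.requireNonNull(query, "query must be set");
      views.put(ident, query);
    }
  }
}
```

A call site would then read `catalog.buildView("v").withQuery("SELECT 1").create();`, which keeps optional attributes out of a long positional parameter list.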



##
sql/catalyst/src/main/java/org/apache/spark

[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-04 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014530432


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java:
##
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException;
+import org.apache.spark.sql.catalyst.analysis.NoSuchViewException;
+import org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * Catalog methods for working with views.
+ */
+@DeveloperApi
+public interface ViewCatalog extends CatalogPlugin {
+
+  /**
+   * List the views in a namespace from the catalog.
+   * 
+   * If the catalog supports tables, this must return identifiers for only views and not tables.
+   *
+   * @param namespace a multi-part namespace
+   * @return an array of Identifiers for views
+   * @throws NoSuchNamespaceException If the namespace does not exist (optional).
+   */
+  Identifier[] listViews(String... namespace) throws NoSuchNamespaceException;
+
+  /**
+   * Load view metadata by {@link Identifier ident} from the catalog.
+   * 
+   * If the catalog supports tables and contains a table for the identifier and not a view,
+   * this must throw {@link NoSuchViewException}.
+   *
+   * @param ident a view identifier
+   * @return the view description
+   * @throws NoSuchViewException If the view doesn't exist or is a table
+   */
+  View loadView(Identifier ident) throws NoSuchViewException;
+
+  /**
+   * Invalidate cached view metadata for an {@link Identifier identifier}.
+   * 
+   * If the view is already loaded or cached, drop cached data. If the view does not exist or is
+   * not cached, do nothing. Calling this method should not query remote services.
+   *
+   * @param ident a view identifier
+   */
+  default void invalidateView(Identifier ident) {
+  }
+
+  /**
+   * Test whether a view exists using an {@link Identifier identifier} from the catalog.
+   * 
+   * If the catalog supports views and contains a view for the identifier and not a table,
+   * this must return false.
+   *
+   * @param ident a view identifier
+   * @return true if the view exists, false otherwise
+   */
+  default boolean viewExists(Identifier ident) {
+    try {
+      return loadView(ident) != null;
+    } catch (NoSuchViewException e) {
+      return false;
+    }
+  }
+
+  /**
+   * Create a view in the catalog.
+   *
+   * @param ident a view identifier
+   * @param sql the SQL text that defines the view
+   * @param currentCatalog the current catalog
+   * @param currentNamespace the current namespace
+   * @param schema the view query output schema
+   * @param columnAliases the column aliases
+   * @param columnComments the column comments
+   * @param properties the view properties
+   * @return the view created
+   * @throws ViewAlreadyExistsException If a view or table already exists for the identifier
+   * @throws NoSuchNamespaceException If the identifier namespace does not exist (optional)
+   */
+  View createView(
+      Identifier ident,
+      String sql,
+      String currentCatalog,
+      String[] currentNamespace,
+      StructType schema,
+      String[] columnAliases,

Review Comment:
   I'd hesitate to do so because the two have different meanings: one is more
   like a "CreateViewRequest", the other like "ViewMetadata". Using the same
   interface for both may cause confusion and make it harder to evolve either of
   them in the future.
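
The distinction can be sketched with two small types. Both classes are hypothetical, and the `createdAtMillis` field is an invented example of an attribute that only the catalog would assign.

```java
import java.util.Objects;

// What the caller asks for (hypothetical shape).
final class CreateViewRequest {
  final String name;
  final String query;

  CreateViewRequest(String name, String query) {
    this.name = Objects.requireNonNull(name, "name");
    this.query = Objects.requireNonNull(query, "query");
  }
}

// What the catalog hands back after persisting the view (hypothetical shape).
final class ViewMetadata {
  final String name;
  final String query;
  // Invented example of a catalog-assigned attribute that a request never carries.
  final long createdAtMillis;

  private ViewMetadata(String name, String query, long createdAtMillis) {
    this.name = name;
    this.query = query;
    this.createdAtMillis = createdAtMillis;
  }

  static ViewMetadata from(CreateViewRequest request, long createdAtMillis) {
    return new ViewMetadata(request.name, request.query, createdAtMillis);
  }
}
```

With separate types, a field can be added to the metadata without forcing every caller that builds a request to supply it.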






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-04 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014523963


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ViewCatalog.java:
##
@@ -0,0 +1,155 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.catalyst.analysis.NoSuchNamespaceException;
+import org.apache.spark.sql.catalyst.analysis.NoSuchViewException;
+import org.apache.spark.sql.catalyst.analysis.ViewAlreadyExistsException;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * Catalog methods for working with views.
+ */
+@DeveloperApi
+public interface ViewCatalog extends CatalogPlugin {
+
+  /**
+   * List the views in a namespace from the catalog.
+   * 
+   * If the catalog supports tables, this must return identifiers for only views and not tables.
+   *
+   * @param namespace a multi-part namespace
+   * @return an array of Identifiers for views
+   * @throws NoSuchNamespaceException If the namespace does not exist (optional).
+   */
+  Identifier[] listViews(String... namespace) throws NoSuchNamespaceException;
+
+  /**
+   * Load view metadata by {@link Identifier ident} from the catalog.
+   * 
+   * If the catalog supports tables and contains a table for the identifier and not a view,
+   * this must throw {@link NoSuchViewException}.
+   *
+   * @param ident a view identifier
+   * @return the view description
+   * @throws NoSuchViewException If the view doesn't exist or is a table
+   */
+  View loadView(Identifier ident) throws NoSuchViewException;
+
+  /**
+   * Invalidate cached view metadata for an {@link Identifier identifier}.
+   * 
+   * If the view is already loaded or cached, drop cached data. If the view does not exist or is
+   * not cached, do nothing. Calling this method should not query remote services.
+   *
+   * @param ident a view identifier
+   */
+  default void invalidateView(Identifier ident) {
+  }
+
+  /**
+   * Test whether a view exists using an {@link Identifier identifier} from the catalog.
+   * 
+   * If the catalog supports views and contains a view for the identifier and not a table,
+   * this must return false.
+   *
+   * @param ident a view identifier
+   * @return true if the view exists, false otherwise
+   */
+  default boolean viewExists(Identifier ident) {
+    try {
+      return loadView(ident) != null;
+    } catch (NoSuchViewException e) {
+      return false;
+    }
+  }
+
+  /**
+   * Create a view in the catalog.
+   *
+   * @param ident a view identifier
+   * @param sql the SQL text that defines the view
+   * @param currentCatalog the current catalog
+   * @param currentNamespace the current namespace
+   * @param schema the view query output schema
+   * @param columnAliases the column aliases
+   * @param columnComments the column comments
+   * @param properties the view properties
+   * @return the view created
+   * @throws ViewAlreadyExistsException If a view or table already exists for the identifier
+   * @throws NoSuchNamespaceException If the identifier namespace does not exist (optional)
+   */
+  View createView(
+      Identifier ident,
+      String sql,
+      String currentCatalog,
+      String[] currentNamespace,
+      StructType schema,
+      String[] columnAliases,

Review Comment:
   Should we create a Builder?






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-04 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014522636


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a persisted view.
+ */
+@DeveloperApi
+public interface View {
+  /**
+   * A name to identify this view.
+   */
+  String name();
+
+  /**
+   * The view query SQL text.
+   */
+  String sql();
+
+  /**
+   * The current catalog when the view is created.
+   */
+  String currentCatalog();
+
+  /**
+   * The current namespace when the view is created.
+   */
+  String[] currentNamespace();
+
+  /**
+   * The schema for the SQL text when the view is created.
+   */
+  StructType schema();
+
+  /**
+   * The view column aliases.
+   */
+  String[] columnAliases();
+

Review Comment:
   Pushed a few changes:
   - Updated the Javadoc for `View.schema` to indicate that aliases are applied
   - Added `View.queryColumnNames` to store the output column names of the query
     that creates this view
   - Renamed `View.sql` to `View.query`, because "query" more accurately
     describes the SELECT query that creates the view



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-04 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014522636


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a persisted view.
+ */
+@DeveloperApi
+public interface View {
+  /**
+   * A name to identify this view.
+   */
+  String name();
+
+  /**
+   * The view query SQL text.
+   */
+  String sql();
+
+  /**
+   * The current catalog when the view is created.
+   */
+  String currentCatalog();
+
+  /**
+   * The current namespace when the view is created.
+   */
+  String[] currentNamespace();
+
+  /**
+   * The schema for the SQL text when the view is created.
+   */
+  StructType schema();
+
+  /**
+   * The view column aliases.
+   */
+  String[] columnAliases();
+

Review Comment:
   Pushed a few changes:
   - Updated the Javadoc for `View.schema` to indicate that aliases are applied
   - Added `View.queryColumnNames` to store the output column names
   - Renamed `View.sql` to `View.query` because "query" more accurately 
describes the SELECT query for the view.






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-04 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014522636


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a persisted view.
+ */
+@DeveloperApi
+public interface View {
+  /**
+   * A name to identify this view.
+   */
+  String name();
+
+  /**
+   * The view query SQL text.
+   */
+  String sql();
+
+  /**
+   * The current catalog when the view is created.
+   */
+  String currentCatalog();
+
+  /**
+   * The current namespace when the view is created.
+   */
+  String[] currentNamespace();
+
+  /**
+   * The schema for the SQL text when the view is created.
+   */
+  StructType schema();
+
+  /**
+   * The view column aliases.
+   */
+  String[] columnAliases();
+

Review Comment:
   Pushed a few changes:
   - Updated the Javadoc for `View.schema` to indicate that aliases are applied
   - Added `View.queryColumnNames` to store the output column names, which is 
useful for `SELECT *` queries
   - Renamed `View.sql` to `View.query` because "query" more accurately 
describes the SELECT query for the view.






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-04 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014515719


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a persisted view.
+ */
+@DeveloperApi
+public interface View {
+  /**
+   * A name to identify this view.
+   */
+  String name();
+
+  /**
+   * The view query SQL text.
+   */
+  String sql();
+
+  /**
+   * The current catalog when the view is created.
+   */
+  String currentCatalog();
+
+  /**
+   * The current namespace when the view is created.
+   */
+  String[] currentNamespace();
+
+  /**
+   * The schema for the SQL text when the view is created.
+   */
+  StructType schema();
+
+  /**
+   * The view column aliases.
+   */
+  String[] columnAliases();
+

Review Comment:
   Yeah, for `SELECT *` queries, we need something similar to 
`viewQueryColumnNames`.
   
   If the schema is pre-alias, then `schema.fieldNames` can serve that purpose.
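
To illustrate why stored query column names help with `SELECT *`, here is a minimal sketch, not code from this PR. It assumes the view keeps the output column names captured at creation time, and that re-analyzing the query later may expose extra columns (for example, after a column is added to the underlying table). The class `StoredColumnProjection` and the method `projectionIndices` are hypothetical names used only for illustration.

```java
import java.util.List;

// Hypothetical helper, not part of the Spark API.
final class StoredColumnProjection {
  private StoredColumnProjection() {}

  /**
   * Maps each stored query column name to its position in the current
   * output of the re-analyzed view query. The lookup is case-sensitive,
   * which is a simplifying assumption of this sketch.
   */
  static int[] projectionIndices(String[] storedQueryColumnNames, List<String> currentOutput) {
    int[] indices = new int[storedQueryColumnNames.length];
    for (int i = 0; i < storedQueryColumnNames.length; i++) {
      int pos = currentOutput.indexOf(storedQueryColumnNames[i]);
      if (pos < 0) {
        // A stored column disappeared from the underlying relation.
        throw new IllegalStateException(
            "Column not found in current query output: " + storedQueryColumnNames[i]);
      }
      indices[i] = pos;
    }
    return indices;
  }
}
```

With stored names `id, name` and a current output of `id, extra, name`, the sketch selects positions 0 and 2, so the new `extra` column does not leak into the view.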






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-04 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1014506227


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a persisted view.
+ */
+@DeveloperApi
+public interface View {
+  /**
+   * A name to identify this view.
+   */
+  String name();
+
+  /**
+   * The view query SQL text.
+   */
+  String sql();
+
+  /**
+   * The current catalog when the view is created.
+   */
+  String currentCatalog();
+
+  /**
+   * The current namespace when the view is created.
+   */
+  String[] currentNamespace();
+
+  /**
+   * The schema for the SQL text when the view is created.
+   */
+  StructType schema();
+
+  /**
+   * The view column aliases.
+   */
+  String[] columnAliases();
+

Review Comment:
   No, `schema()` contains the final schema, the same as V1, to reduce 
confusion. I will update the Javadoc to clarify.
   
   Thank you and @wmoustafa for calling it out!
   
   In V1, the schema stores the column aliases. For V2, we decided to capture 
the entire CREATE VIEW statement in these metadata fields instead of relying 
on `schema`, which feels more like a "derived" value:
   - sql
   - columnAliases
   - columnComments
   
   Let me know if this makes sense.
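
As a rough sketch of how the final column names relate to the metadata fields above (an illustration only, not code from this PR): the names in the final schema would be the column aliases when the CREATE VIEW statement supplies them, and the query's own output names otherwise. The class `ViewColumnNames`, the method `finalColumnNames`, and the convention that an empty array means "no aliases given" are all assumptions made for this example.

```java
// Hypothetical helper, not part of the Spark API.
final class ViewColumnNames {
  private ViewColumnNames() {}

  /**
   * Returns the column names of the final view schema.
   * Assumption: an empty columnAliases array means no aliases were specified.
   */
  static String[] finalColumnNames(String[] queryOutputNames, String[] columnAliases) {
    if (columnAliases.length == 0) {
      // No aliases: the view exposes the query's output names unchanged.
      return queryOutputNames.clone();
    }
    if (columnAliases.length != queryOutputNames.length) {
      throw new IllegalArgumentException(
          "Expected " + queryOutputNames.length + " aliases but got " + columnAliases.length);
    }
    // Aliases replace the query's output names position by position.
    return columnAliases.clone();
  }
}
```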






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-11-03 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r1013618805


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a persisted view.
+ */
+@DeveloperApi
+public interface View {
+  /**
+   * A name to identify this view.
+   */
+  String name();
+
+  /**
+   * The view query SQL text.
+   */
+  String sql();
+
+  /**
+   * The current catalog when the view is created.
+   */
+  String currentCatalog();
+
+  /**
+   * The current namespace when the view is created.
+   */
+  String[] currentNamespace();
+
+  /**
+   * The schema for the SQL text when the view is created.
+   */
+  StructType schema();
+
+  /**
+   * The view column aliases.
+   */
+  String[] columnAliases();
+

Review Comment:
   This captures the following part of Spark's CREATE VIEW syntax:
   ```
   ( { column_alias [ COMMENT column_comment ] } [, ...] )
   ```
   The `columnAliases` and `columnComments` arrays must have the same size.
   
   #28147 does not have them, but we need to add them to preserve existing 
Spark features.
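
The same-size requirement could be enforced with a check along these lines. This is a minimal sketch, not code from this PR: the class `ViewColumnCheck` and the method `checkAliasesAndComments` are hypothetical, and how a missing comment is represented inside `columnComments` is left open here.

```java
import java.util.Objects;

// Hypothetical helper, not part of the Spark API.
final class ViewColumnCheck {
  private ViewColumnCheck() {}

  /**
   * Fails fast when the two parallel arrays differ in length, since
   * columnComments[i] is meant to describe columnAliases[i].
   */
  static void checkAliasesAndComments(String[] columnAliases, String[] columnComments) {
    Objects.requireNonNull(columnAliases, "columnAliases");
    Objects.requireNonNull(columnComments, "columnComments");
    if (columnAliases.length != columnComments.length) {
      throw new IllegalArgumentException(
          "columnAliases has " + columnAliases.length
              + " entries but columnComments has " + columnComments.length);
    }
  }
}
```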






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-08-18 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r949539969


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.Experimental;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a persisted view.
+ */
+@Experimental

Review Comment:
   OK with the `DeveloperApi` annotation.






[GitHub] [spark] jzhuge commented on a diff in pull request #37556: [SPARK-39799][SQL] DataSourceV2: View catalog interface

2022-08-18 Thread GitBox


jzhuge commented on code in PR #37556:
URL: https://github.com/apache/spark/pull/37556#discussion_r949538104


##
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/View.java:
##
@@ -0,0 +1,69 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Map;
+
+import org.apache.spark.annotation.Experimental;
+import org.apache.spark.sql.types.StructType;
+
+/**
+ * An interface representing a persisted view.
+ */
+@Experimental

Review Comment:
   The parent PR is #35636


