Hi John,

Thanks for working on this! View support is very important to the catalog
plugin API.

After reading your doc, I have one high-level question: should views have
a separate API, or is a view just a special type of table?

AFAIK in most databases, tables and views share the same namespace. You
can't create a view if a table with the same name exists. In Hive, a view
is just a special type of table, so the two naturally share a namespace. If
we have both a table catalog and a view catalog, we need a mechanism to
make sure there are no name conflicts.
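
To make the concern concrete, here is a rough sketch of one possible
mechanism: a thin wrapper that checks both catalogs before creating
anything. All names here (NameConflictSketch, SimpleCatalog) are
hypothetical stand-ins, not Spark's TableCatalog or the proposed
ViewCatalog.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Spark's API: two separate catalogs kept
// consistent by checking both before any create.
class NameConflictSketch {
    // Minimal stand-in for a table catalog or a view catalog.
    static final class SimpleCatalog {
        private final Set<String> names = new HashSet<>();

        boolean exists(String name) { return names.contains(name); }

        void create(String name) { names.add(name); }
    }

    private final SimpleCatalog tables = new SimpleCatalog();
    private final SimpleCatalog views = new SimpleCatalog();

    // synchronized makes check-then-create atomic within this one object
    // only; it does not protect a shared external metastore.
    synchronized void createTable(String name) {
        requireUnused(name);
        tables.create(name);
    }

    synchronized void createView(String name) {
        requireUnused(name);
        views.create(name);
    }

    // Rejects the name if either catalog already uses it.
    private void requireUnused(String name) {
        if (tables.exists(name) || views.exists(name)) {
            throw new IllegalStateException("Relation already exists: " + name);
        }
    }
}
```

The check only works if every create goes through the wrapper, which is
hard to guarantee when the two catalogs are independent plugins backed by
different stores.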

On the other hand, view metadata is simple enough to be stored in table
properties. I'd like to see more thought given to evaluating these two
approaches:
1. *Add a new View API*. How do we avoid name conflicts between tables and
views? When resolving a relation, should we look up the table catalog first
or the view catalog?
2. *Reuse the Table API*. How do we indicate that it's a view? What if we
do want to store tables and views separately?
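
For approach 2, a minimal sketch of how a view could be marked through
table properties. The property keys "is_view" and "view.sql" are
assumptions made up for illustration; neither Spark nor the doc defines
them.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of approach 2: a view is a table whose properties
// carry a marker and the view's SQL text.
class ViewAsTableSketch {
    // Assumed property keys, chosen only for this example.
    static final String IS_VIEW = "is_view";
    static final String VIEW_SQL = "view.sql";

    // A relation counts as a view only if the marker is exactly "true"
    // (case-insensitive); a missing marker means a plain table.
    static boolean isView(Map<String, String> properties) {
        return Boolean.parseBoolean(properties.getOrDefault(IS_VIEW, "false"));
    }

    // Returns the view's SQL text, or empty for a plain table.
    static Optional<String> viewSql(Map<String, String> properties) {
        if (!isView(properties)) {
            return Optional.empty();
        }
        return Optional.ofNullable(properties.get(VIEW_SQL));
    }
}
```

With this shape there is a single namespace for free, but every caller has
to inspect the properties to know which kind of relation it loaded.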

I think a new View API is more flexible. I'd vote for it if we can come up
with a good mechanism to avoid name conflicts.

On Wed, Aug 12, 2020 at 6:20 AM John Zhuge <jzh...@apache.org> wrote:

> Hi Spark devs,
>
> I'd like to bring more attention to this SPIP. As Dongjoon indicated in
> the email "Apache Spark 3.1 Feature Expectation (Dec. 2020)", this feature
> can be considered for 3.2 or even 3.1.
>
> View catalog builds on top of the catalog plugin system introduced in
> DataSourceV2. It adds the “ViewCatalog” API to load, create, alter, and
> drop views. A catalog plugin can naturally implement both ViewCatalog and
> TableCatalog.
>
> Our internal implementation has been in production for over 8 months.
> Recently we extended it to support materialized views, for the read path
> initially.
>
> The PR has conflicts that I will resolve shortly.
>
> Thanks,
>
> On Wed, Apr 22, 2020 at 12:24 AM John Zhuge <jzh...@apache.org> wrote:
>
>> Hi everyone,
>>
>> In order to disassociate view metadata from Hive Metastore and support
>> different storage backends, I am proposing a new view catalog API to load,
>> create, alter, and drop views.
>>
>> Document:
>> https://docs.google.com/document/d/1XOxFtloiMuW24iqJ-zJnDzHl2KMxipTjJoxleJFz66A/edit?usp=sharing
>> JIRA: https://issues.apache.org/jira/browse/SPARK-31357
>> WIP PR: https://github.com/apache/spark/pull/28147
>>
>> As part of a project to support common views across query engines like
>> Spark and Presto, my team used the view catalog API in Spark
>> implementation. The project has been in production for over three
>> months.
>>
>> Thanks,
>> John Zhuge
>>
>
>
> --
> John Zhuge
>
