Thanks for the feedback, Ryan! I can share the WIP copy of the SPIP if that
makes sense.

I can't find out a lot about view resolution and validation in SQL Spec
Part1. Anybody with full SQL knowledge may chime in.

Here are my understanding based on online manuals, docs, and other
resources:

   - A view has a name in the database schema so that other queries can use
   it like a table.
   - A view's schema is frozen at the time the view is created; subsequent
   changes to underlying tables (e.g. adding a column) will not be reflected
   in the view's schema. If an underlying table is dropped or changed in an
   incompatible fashion, subsequent attempts to query the invalid view will
   fail.

In Preso, view columns are used for validation only (see
StatementAnalyzer.Visitor#isViewStale):

   - view column names must match the visible fields of analyzed view sql
   - the visible fields can be coerced to view column types

In Spark 2.2+, view columns are also used for validation (see
CheckAnalysis#checkAnalysis case View):

   - view column names must match the output fields of the view sql
   - view column types must be able to UpCast to output field types

Rule EliminateView adds a Project to viewQueryColumnNames if it exists.

As for `softwareVersion`, the purpose is to track which software version is
used to create the view, in preparation for different versions of the same
software or even different softwares, such as Presto vs Spark.


On Tue, Aug 13, 2019 at 9:47 AM Ryan Blue <rb...@netflix.com> wrote:

> Thanks for working on this, John!
>
> I'd like to see a more complete write-up of what you're proposing. Without
> that, I don't think we can have a productive discussion about this.
>
> For example, I think you're proposing to keep the view columns to ensure
> that the same columns are produced by the view every time, based on
> requirements from the SQL spec. Let's start by stating what those behavior
> requirements are, so that everyone has the context to understand why your
> proposal includes the view columns. Similarly, I'd like to know why you're
> proposing `softwareVersion` in the view definition.
>
> On Tue, Aug 13, 2019 at 8:56 AM John Zhuge <jzh...@apache.org> wrote:
>
>> Catalog support has been added to DSv2 along with a table catalog
>> interface. Here I'd like to propose a view catalog interface, for the
>> following benefit:
>>
>>    - Abstraction for view management thus allowing different view
>>    backends
>>    - Disassociation of view definition storage from Hive Metastore
>>
>> A catalog plugin can be both TableCatalog and ViewCatalog. Resolve an
>> identifier as view first then table.
>>
>> More details in SPIP and PR if we decide to proceed. Here is a quick
>> glance at the API:
>>
>> ViewCatalog interface:
>>
>>    - loadView
>>    - listViews
>>    - createView
>>    - deleteView
>>
>> View interface:
>>
>>    - name
>>    - originalSql
>>    - defaultCatalog
>>    - defaultNamespace
>>    - viewColumns
>>    - owner
>>    - createTime
>>    - softwareVersion
>>    - options (map)
>>
>> ViewColumn interface:
>>
>>    - name
>>    - type
>>
>>
>> Thanks,
>> John Zhuge
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>


-- 
John Zhuge

Reply via email to