Thanks for the feedback, Ryan! I can share the WIP copy of the SPIP if that makes sense.

I can't find much about view resolution and validation in SQL Spec Part 1. Anybody with full SQL knowledge may chime in.

Here is my understanding based on online manuals, docs, and other resources:

- A view has a name in the database schema so that other queries can use it like a table.
- A view's schema is frozen at the time the view is created; subsequent changes to underlying tables (e.g. adding a column) will not be reflected in the view's schema. If an underlying table is dropped or changed in an incompatible fashion, subsequent attempts to query the invalid view will fail.

In Presto, view columns are used for validation only (see StatementAnalyzer.Visitor#isViewStale):

- view column names must match the visible fields of the analyzed view SQL
- the visible fields can be coerced to view column types

In Spark 2.2+, view columns are also used for validation (see CheckAnalysis#checkAnalysis case View):

- view column names must match the output fields of the view SQL
- view column types must be able to UpCast to output field types

Rule EliminateView adds a Project to viewQueryColumnNames if it exists.

As for `softwareVersion`, the purpose is to track which software version was used to create the view, in preparation for handling different versions of the same software or even different software, such as Presto vs Spark.

On Tue, Aug 13, 2019 at 9:47 AM Ryan Blue <rb...@netflix.com> wrote:

> Thanks for working on this, John!
>
> I'd like to see a more complete write-up of what you're proposing. Without
> that, I don't think we can have a productive discussion about this.
>
> For example, I think you're proposing to keep the view columns to ensure
> that the same columns are produced by the view every time, based on
> requirements from the SQL spec. Let's start by stating what those behavior
> requirements are, so that everyone has the context to understand why your
> proposal includes the view columns. Similarly, I'd like to know why you're
> proposing `softwareVersion` in the view definition.
>
> On Tue, Aug 13, 2019 at 8:56 AM John Zhuge <jzh...@apache.org> wrote:
>
>> Catalog support has been added to DSv2 along with a table catalog
>> interface. Here I'd like to propose a view catalog interface, for the
>> following benefit:
>>
>>    - Abstraction for view management thus allowing different view
>>    backends
>>    - Disassociation of view definition storage from Hive Metastore
>>
>> A catalog plugin can be both TableCatalog and ViewCatalog. Resolve an
>> identifier as view first then table.
>>
>> More details in SPIP and PR if we decide to proceed. Here is a quick
>> glance at the API:
>>
>> ViewCatalog interface:
>>
>>    - loadView
>>    - listViews
>>    - createView
>>    - deleteView
>>
>> View interface:
>>
>>    - name
>>    - originalSql
>>    - defaultCatalog
>>    - defaultNamespace
>>    - viewColumns
>>    - owner
>>    - createTime
>>    - softwareVersion
>>    - options (map)
>>
>> ViewColumn interface:
>>
>>    - name
>>    - type
>>
>>
>> Thanks,
>> John Zhuge
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>

--
John Zhuge
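
P.S. To make the validation rules in this thread concrete, here is a minimal, language-neutral sketch of a staleness check in the spirit of the Presto rule (names must match, and each field produced by the view SQL must be coercible to the stored view column type). It is not taken from Presto or Spark: `Column`, `SAFE_WIDENING`, `can_coerce`, and `is_view_stale` are hypothetical names, the type strings and widening table are illustrative assumptions, and exact (case-sensitive) name matching is assumed.

```python
from dataclasses import dataclass

# Hypothetical widening table: source type -> types it may be safely coerced to.
# Illustrative only; real engines define their own coercion rules.
SAFE_WIDENING = {
    "int": {"int", "bigint", "double"},
    "bigint": {"bigint"},
    "double": {"double"},
    "string": {"string"},
}


@dataclass(frozen=True)
class Column:
    """A (name, type) pair, mirroring the proposed ViewColumn fields."""
    name: str
    type: str


def can_coerce(source: str, target: str) -> bool:
    """Return True if `source` can be widened to `target` (assumed rules)."""
    # Unknown types are only coercible to themselves.
    return target in SAFE_WIDENING.get(source, {source})


def is_view_stale(view_columns, query_output) -> bool:
    """Compare stored view columns with the fields the view SQL produces now."""
    if len(view_columns) != len(query_output):
        return True
    for stored, actual in zip(view_columns, query_output):
        # Rule 1: column names must match (exact match assumed here).
        if stored.name != actual.name:
            return True
        # Rule 2: the produced field must be coercible to the stored type.
        if not can_coerce(actual.type, stored.type):
            return True
    return False
```

With stored columns `id: bigint, name: string`, a query that now produces `id: int, name: string` would still be accepted under these assumed rules, while a renamed, dropped, or incompatibly retyped column would mark the view as stale.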