I see. Makes sense to me. Thanks Timo for the detailed explanation! On Tue, Feb 9, 2021 at 9:48 PM Timo Walther <twal...@apache.org> wrote:
> Hi Rui, > > 1. It depends whether you would like to declare (unresolved) or use > (resolved) a schema. In catalogs and APIs, people would actually like to > declare a schema. Because the schema might reference objects from other > catalogs etc. However, whenever the schema comes out of the framework it > is fully resolved and people can use to configure their UI, connector, etc. > 2. No, `getTable` doesn't have to return a resolved schema. Actually, > this was my initial design (see Rejected Alternatives 1) where we pass > the SchemaResolver into the Catalog. However, a catalog must not deal > with resolution. When storing a table we need a resolved schema to > perist the fully expanded properties, however, when reading those > properties in again the schema can be resolved in a later stage. > > Regards, > Timo > > On 09.02.21 14:07, Rui Li wrote: > > Hi Timo, > > > > Thanks for the FLIP. It looks good to me overall. I have two questions. > > 1. When should we use a resolved schema and when to use an unresolved > one? > > 2. The FLIP mentions only resolved tables/views can be stored into a > > catalog. Does that mean the getTable method should also return a resolved > > object? > > > > On Tue, Feb 9, 2021 at 6:29 PM Timo Walther <twal...@apache.org> wrote: > > > >> Hi Jark, > >> > >> thanks for your feedback. Let me answer some of your comments: > >> > >> 1) Since we decided to build an entire new stack, we can also introduce > >> better names for columns, constraints, and watermark spec. My goal was > >> to shorten the names during this refactoring. Therefore, `TableSchema` > >> becomes `Schema` and `TableColumn` becomes `Column`. This also fits > >> better to a `CatalogView` that has a schema but is actually not a table > >> but a view. So `Column` is very generic. What do you think? > >> > >> 2) `ComputedColumn` and `WatermarkSpec` of the new generation will store > >> `ResolvedExpression`. > >> > >> 3) I adopted most of the methods from `TableSchema` in `ResolvedSchema`. > >> However, I skipped `getColumnDataTypes()` because the behavior is not > >> clear to me. Should it include computed columns or virtual metadata > >> columns? I think we should force users to think about what they require. > >> Otherwise we implicitly introduce bugs. > >> > >> Regards, > >> Timo > >> > >> On 09.02.21 10:56, Jark Wu wrote: > >>> Hi Timo, > >>> > >>> The messy TableSchema confuses many developers. > >>> It's great to see we can finally come up with a clean interface > hierarchy > >>> and still backward compatible. > >>> > >>> Thanks for preparing the nice FLIP. It looks good to me. I have some > >> minor > >>> comments: > >>> > >>> 1) Should `ResolvedSchema#getColumn(int)` returns `TableColumn` instead > >> of > >>> `Column`? > >>> > >>> 2) You mentioned ResolvedSchema should store ResolvedExpression, should > >> we > >>> extend > >>> `ComputedColumn` and `WatermarkSpec` to allow `ResolvedExpression`? > >>> > >>> 3) `ResolvedSchema` aims to replace `TableSchema`, it would be better > to > >>> add un-deprecated > >>> methods of `TableSchema` into `ResolvedSchema` > >>> (e.g. `getColumnDataTypes()`). > >>> Then users can have a smooth migration. > >>> > >>> Best, > >>> Jark > >>> > >>> On Mon, 8 Feb 2021 at 20:21, Dawid Wysakowicz <dwysakow...@apache.org> > >>> wrote: > >>> > >>>> Hi Timo, > >>>> > >>>> From my perspective the proposed changes look good. I agree it is an > >>>> important step towards FLIP-129 and FLIP-136. Personally I feel > >>>> comfortable voting on the document. > >>>> > >>>> Best, > >>>> > >>>> Dawid > >>>> > >>>> On 05/02/2021 16:09, Timo Walther wrote: > >>>>> Hi everyone, > >>>>> > >>>>> you might have seen that we discussed a better schema API in past as > >>>>> part of FLIP-129 and FLIP-136. We also discussed this topic during > >>>>> different releases: > >>>>> > >>>>> https://issues.apache.org/jira/browse/FLINK-17793 > >>>>> > >>>>> Jark and I had an offline discussion how we can finally fix this > >>>>> shortcoming and maintain backwards compatibile for a couple of > >>>>> releases to give people time to update their code. > >>>>> > >>>>> I would like to propose the following FLIP: > >>>>> > >>>>> > >>>> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-164%3A+Improve+Schema+Handling+in+Catalogs > >>>>> > >>>>> > >>>>> The FLIP updates the class hierarchy to achieve the following goals: > >>>>> > >>>>> - make it visible whether a schema is resolved or unresolved and when > >>>>> the resolution happens > >>>>> - offer a unified API for FLIP-129, FLIP-136, and catalogs > >>>>> - allow arbitrary data types and expressions in the schema for > >>>>> watermark spec or columns > >>>>> - have access to other catalogs for declaring a data type or > >>>>> expression via CatalogManager > >>>>> - a cleaned up TableSchema > >>>>> - remain backwards compatible in the persisted properties and API > >>>>> > >>>>> Looking forward to your feedback. > >>>>> > >>>>> Thanks, > >>>>> Timo > >>>> > >>>> > >>> > >> > >> > > > > -- Best regards! Rui Li