Hi Timo, 1) I'm fine with `Column`, but are we going to introduce new interfaces for `UniqueConstraint` and `WatermarkSpec`? If we want to introduce a new stack, it would be better to have a different name, otherwise, it's easy to use a wrong class for users.
Best, Jark On Wed, 10 Feb 2021 at 09:49, Rui Li <lirui.fu...@gmail.com> wrote: > I see. Makes sense to me. Thanks Timo for the detailed explanation! > > On Tue, Feb 9, 2021 at 9:48 PM Timo Walther <twal...@apache.org> wrote: > > > Hi Rui, > > > > 1. It depends whether you would like to declare (unresolved) or use > > (resolved) a schema. In catalogs and APIs, people would actually like to > > declare a schema. Because the schema might reference objects from other > > catalogs etc. However, whenever the schema comes out of the framework it > > is fully resolved and people can use to configure their UI, connector, > etc. > > 2. No, `getTable` doesn't have to return a resolved schema. Actually, > > this was my initial design (see Rejected Alternatives 1) where we pass > > the SchemaResolver into the Catalog. However, a catalog must not deal > > with resolution. When storing a table we need a resolved schema to > > perist the fully expanded properties, however, when reading those > > properties in again the schema can be resolved in a later stage. > > > > Regards, > > Timo > > > > On 09.02.21 14:07, Rui Li wrote: > > > Hi Timo, > > > > > > Thanks for the FLIP. It looks good to me overall. I have two questions. > > > 1. When should we use a resolved schema and when to use an unresolved > > one? > > > 2. The FLIP mentions only resolved tables/views can be stored into a > > > catalog. Does that mean the getTable method should also return a > resolved > > > object? > > > > > > On Tue, Feb 9, 2021 at 6:29 PM Timo Walther <twal...@apache.org> > wrote: > > > > > >> Hi Jark, > > >> > > >> thanks for your feedback. Let me answer some of your comments: > > >> > > >> 1) Since we decided to build an entire new stack, we can also > introduce > > >> better names for columns, constraints, and watermark spec. My goal was > > >> to shorten the names during this refactoring. Therefore, `TableSchema` > > >> becomes `Schema` and `TableColumn` becomes `Column`. This also fits > > >> better to a `CatalogView` that has a schema but is actually not a > table > > >> but a view. So `Column` is very generic. What do you think? > > >> > > >> 2) `ComputedColumn` and `WatermarkSpec` of the new generation will > store > > >> `ResolvedExpression`. > > >> > > >> 3) I adopted most of the methods from `TableSchema` in > `ResolvedSchema`. > > >> However, I skipped `getColumnDataTypes()` because the behavior is not > > >> clear to me. Should it include computed columns or virtual metadata > > >> columns? I think we should force users to think about what they > require. > > >> Otherwise we implicitly introduce bugs. > > >> > > >> Regards, > > >> Timo > > >> > > >> On 09.02.21 10:56, Jark Wu wrote: > > >>> Hi Timo, > > >>> > > >>> The messy TableSchema confuses many developers. > > >>> It's great to see we can finally come up with a clean interface > > hierarchy > > >>> and still backward compatible. > > >>> > > >>> Thanks for preparing the nice FLIP. It looks good to me. I have some > > >> minor > > >>> comments: > > >>> > > >>> 1) Should `ResolvedSchema#getColumn(int)` returns `TableColumn` > instead > > >> of > > >>> `Column`? > > >>> > > >>> 2) You mentioned ResolvedSchema should store ResolvedExpression, > should > > >> we > > >>> extend > > >>> `ComputedColumn` and `WatermarkSpec` to allow > `ResolvedExpression`? > > >>> > > >>> 3) `ResolvedSchema` aims to replace `TableSchema`, it would be better > > to > > >>> add un-deprecated > > >>> methods of `TableSchema` into `ResolvedSchema` > > >>> (e.g. `getColumnDataTypes()`). > > >>> Then users can have a smooth migration. > > >>> > > >>> Best, > > >>> Jark > > >>> > > >>> On Mon, 8 Feb 2021 at 20:21, Dawid Wysakowicz < > dwysakow...@apache.org> > > >>> wrote: > > >>> > > >>>> Hi Timo, > > >>>> > > >>>> From my perspective the proposed changes look good. I agree it is > an > > >>>> important step towards FLIP-129 and FLIP-136. Personally I feel > > >>>> comfortable voting on the document. > > >>>> > > >>>> Best, > > >>>> > > >>>> Dawid > > >>>> > > >>>> On 05/02/2021 16:09, Timo Walther wrote: > > >>>>> Hi everyone, > > >>>>> > > >>>>> you might have seen that we discussed a better schema API in past > as > > >>>>> part of FLIP-129 and FLIP-136. We also discussed this topic during > > >>>>> different releases: > > >>>>> > > >>>>> https://issues.apache.org/jira/browse/FLINK-17793 > > >>>>> > > >>>>> Jark and I had an offline discussion how we can finally fix this > > >>>>> shortcoming and maintain backwards compatibile for a couple of > > >>>>> releases to give people time to update their code. > > >>>>> > > >>>>> I would like to propose the following FLIP: > > >>>>> > > >>>>> > > >>>> > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-164%3A+Improve+Schema+Handling+in+Catalogs > > >>>>> > > >>>>> > > >>>>> The FLIP updates the class hierarchy to achieve the following > goals: > > >>>>> > > >>>>> - make it visible whether a schema is resolved or unresolved and > when > > >>>>> the resolution happens > > >>>>> - offer a unified API for FLIP-129, FLIP-136, and catalogs > > >>>>> - allow arbitrary data types and expressions in the schema for > > >>>>> watermark spec or columns > > >>>>> - have access to other catalogs for declaring a data type or > > >>>>> expression via CatalogManager > > >>>>> - a cleaned up TableSchema > > >>>>> - remain backwards compatible in the persisted properties and API > > >>>>> > > >>>>> Looking forward to your feedback. > > >>>>> > > >>>>> Thanks, > > >>>>> Timo > > >>>> > > >>>> > > >>> > > >> > > >> > > > > > > > > > -- > Best regards! > Rui Li >