I see. Makes sense to me. Thanks Timo for the detailed explanation!

On Tue, Feb 9, 2021 at 9:48 PM Timo Walther <twal...@apache.org> wrote:

> Hi Rui,
>
> 1. It depends whether you would like to declare (unresolved) or use
> (resolved) a schema. In catalogs and APIs, people would actually like to
> declare a schema. Because the schema might reference objects from other
> catalogs etc. However, whenever the schema comes out of the framework it
> is fully resolved and people can use to configure their UI, connector, etc.
> 2. No, `getTable` doesn't have to return a resolved schema. Actually,
> this was my initial design (see Rejected Alternatives 1) where we pass
> the SchemaResolver into the Catalog. However, a catalog must not deal
> with resolution. When storing a table we need a resolved schema to
> perist the fully expanded properties, however, when reading those
> properties in again the schema can be resolved in a later stage.
>
> Regards,
> Timo
>
> On 09.02.21 14:07, Rui Li wrote:
> > Hi Timo,
> >
> > Thanks for the FLIP. It looks good to me overall. I have two questions.
> > 1. When should we use a resolved schema and when to use an unresolved
> one?
> > 2. The FLIP mentions only resolved tables/views can be stored into a
> > catalog. Does that mean the getTable method should also return a resolved
> > object?
> >
> > On Tue, Feb 9, 2021 at 6:29 PM Timo Walther <twal...@apache.org> wrote:
> >
> >> Hi Jark,
> >>
> >> thanks for your feedback. Let me answer some of your comments:
> >>
> >> 1) Since we decided to build an entire new stack, we can also introduce
> >> better names for columns, constraints, and watermark spec. My goal was
> >> to shorten the names during this refactoring. Therefore, `TableSchema`
> >> becomes `Schema` and `TableColumn` becomes `Column`. This also fits
> >> better to a `CatalogView` that has a schema but is actually not a table
> >> but a view. So `Column` is very generic. What do you think?
> >>
> >> 2) `ComputedColumn` and `WatermarkSpec` of the new generation will store
> >> `ResolvedExpression`.
> >>
> >> 3) I adopted most of the methods from `TableSchema` in `ResolvedSchema`.
> >> However, I skipped `getColumnDataTypes()` because the behavior is not
> >> clear to me. Should it include computed columns or virtual metadata
> >> columns? I think we should force users to think about what they require.
> >> Otherwise we implicitly introduce bugs.
> >>
> >> Regards,
> >> Timo
> >>
> >> On 09.02.21 10:56, Jark Wu wrote:
> >>> Hi Timo,
> >>>
> >>> The messy TableSchema confuses many developers.
> >>> It's great to see we can finally come up with a clean interface
> hierarchy
> >>> and still backward compatible.
> >>>
> >>> Thanks for preparing the nice FLIP. It looks good to me. I have some
> >> minor
> >>> comments:
> >>>
> >>> 1) Should `ResolvedSchema#getColumn(int)` returns `TableColumn` instead
> >> of
> >>> `Column`?
> >>>
> >>> 2) You mentioned ResolvedSchema should store ResolvedExpression, should
> >> we
> >>> extend
> >>>     `ComputedColumn` and `WatermarkSpec` to allow `ResolvedExpression`?
> >>>
> >>> 3) `ResolvedSchema` aims to replace `TableSchema`, it would be better
> to
> >>> add un-deprecated
> >>> methods of `TableSchema` into `ResolvedSchema`
> >>> (e.g. `getColumnDataTypes()`).
> >>> Then users can have a smooth migration.
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>> On Mon, 8 Feb 2021 at 20:21, Dawid Wysakowicz <dwysakow...@apache.org>
> >>> wrote:
> >>>
> >>>> Hi Timo,
> >>>>
> >>>>   From my perspective the proposed changes look good. I agree it is an
> >>>> important step towards FLIP-129 and FLIP-136. Personally I feel
> >>>> comfortable voting on the document.
> >>>>
> >>>> Best,
> >>>>
> >>>> Dawid
> >>>>
> >>>> On 05/02/2021 16:09, Timo Walther wrote:
> >>>>> Hi everyone,
> >>>>>
> >>>>> you might have seen that we discussed a better schema API in past as
> >>>>> part of FLIP-129 and FLIP-136. We also discussed this topic during
> >>>>> different releases:
> >>>>>
> >>>>> https://issues.apache.org/jira/browse/FLINK-17793
> >>>>>
> >>>>> Jark and I had an offline discussion how we can finally fix this
> >>>>> shortcoming and maintain backwards compatibile for a couple of
> >>>>> releases to give people time to update their code.
> >>>>>
> >>>>> I would like to propose the following FLIP:
> >>>>>
> >>>>>
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-164%3A+Improve+Schema+Handling+in+Catalogs
> >>>>>
> >>>>>
> >>>>> The FLIP updates the class hierarchy to achieve the following goals:
> >>>>>
> >>>>> - make it visible whether a schema is resolved or unresolved and when
> >>>>> the resolution happens
> >>>>> - offer a unified API for FLIP-129, FLIP-136, and catalogs
> >>>>> - allow arbitrary data types and expressions in the schema for
> >>>>> watermark spec or columns
> >>>>> - have access to other catalogs for declaring a data type or
> >>>>> expression via CatalogManager
> >>>>> - a cleaned up TableSchema
> >>>>> - remain backwards compatible in the persisted properties and API
> >>>>>
> >>>>> Looking forward to your feedback.
> >>>>>
> >>>>> Thanks,
> >>>>> Timo
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>

-- 
Best regards!
Rui Li

Reply via email to