Hi Timo,

1) I'm fine with `Column`, but are we going to introduce new interfaces
for `UniqueConstraint` and `WatermarkSpec`? If we want to introduce
a new stack, it would be better to have a different name, otherwise,
it's easy to use a wrong class for users.

Best,
Jark

On Wed, 10 Feb 2021 at 09:49, Rui Li <lirui.fu...@gmail.com> wrote:

> I see. Makes sense to me. Thanks Timo for the detailed explanation!
>
> On Tue, Feb 9, 2021 at 9:48 PM Timo Walther <twal...@apache.org> wrote:
>
> > Hi Rui,
> >
> > 1. It depends whether you would like to declare (unresolved) or use
> > (resolved) a schema. In catalogs and APIs, people would actually like to
> > declare a schema. Because the schema might reference objects from other
> > catalogs etc. However, whenever the schema comes out of the framework it
> > is fully resolved and people can use to configure their UI, connector,
> etc.
> > 2. No, `getTable` doesn't have to return a resolved schema. Actually,
> > this was my initial design (see Rejected Alternatives 1) where we pass
> > the SchemaResolver into the Catalog. However, a catalog must not deal
> > with resolution. When storing a table we need a resolved schema to
> > perist the fully expanded properties, however, when reading those
> > properties in again the schema can be resolved in a later stage.
> >
> > Regards,
> > Timo
> >
> > On 09.02.21 14:07, Rui Li wrote:
> > > Hi Timo,
> > >
> > > Thanks for the FLIP. It looks good to me overall. I have two questions.
> > > 1. When should we use a resolved schema and when to use an unresolved
> > one?
> > > 2. The FLIP mentions only resolved tables/views can be stored into a
> > > catalog. Does that mean the getTable method should also return a
> resolved
> > > object?
> > >
> > > On Tue, Feb 9, 2021 at 6:29 PM Timo Walther <twal...@apache.org>
> wrote:
> > >
> > >> Hi Jark,
> > >>
> > >> thanks for your feedback. Let me answer some of your comments:
> > >>
> > >> 1) Since we decided to build an entire new stack, we can also
> introduce
> > >> better names for columns, constraints, and watermark spec. My goal was
> > >> to shorten the names during this refactoring. Therefore, `TableSchema`
> > >> becomes `Schema` and `TableColumn` becomes `Column`. This also fits
> > >> better to a `CatalogView` that has a schema but is actually not a
> table
> > >> but a view. So `Column` is very generic. What do you think?
> > >>
> > >> 2) `ComputedColumn` and `WatermarkSpec` of the new generation will
> store
> > >> `ResolvedExpression`.
> > >>
> > >> 3) I adopted most of the methods from `TableSchema` in
> `ResolvedSchema`.
> > >> However, I skipped `getColumnDataTypes()` because the behavior is not
> > >> clear to me. Should it include computed columns or virtual metadata
> > >> columns? I think we should force users to think about what they
> require.
> > >> Otherwise we implicitly introduce bugs.
> > >>
> > >> Regards,
> > >> Timo
> > >>
> > >> On 09.02.21 10:56, Jark Wu wrote:
> > >>> Hi Timo,
> > >>>
> > >>> The messy TableSchema confuses many developers.
> > >>> It's great to see we can finally come up with a clean interface
> > hierarchy
> > >>> and still backward compatible.
> > >>>
> > >>> Thanks for preparing the nice FLIP. It looks good to me. I have some
> > >> minor
> > >>> comments:
> > >>>
> > >>> 1) Should `ResolvedSchema#getColumn(int)` returns `TableColumn`
> instead
> > >> of
> > >>> `Column`?
> > >>>
> > >>> 2) You mentioned ResolvedSchema should store ResolvedExpression,
> should
> > >> we
> > >>> extend
> > >>>     `ComputedColumn` and `WatermarkSpec` to allow
> `ResolvedExpression`?
> > >>>
> > >>> 3) `ResolvedSchema` aims to replace `TableSchema`, it would be better
> > to
> > >>> add un-deprecated
> > >>> methods of `TableSchema` into `ResolvedSchema`
> > >>> (e.g. `getColumnDataTypes()`).
> > >>> Then users can have a smooth migration.
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>> On Mon, 8 Feb 2021 at 20:21, Dawid Wysakowicz <
> dwysakow...@apache.org>
> > >>> wrote:
> > >>>
> > >>>> Hi Timo,
> > >>>>
> > >>>>   From my perspective the proposed changes look good. I agree it is
> an
> > >>>> important step towards FLIP-129 and FLIP-136. Personally I feel
> > >>>> comfortable voting on the document.
> > >>>>
> > >>>> Best,
> > >>>>
> > >>>> Dawid
> > >>>>
> > >>>> On 05/02/2021 16:09, Timo Walther wrote:
> > >>>>> Hi everyone,
> > >>>>>
> > >>>>> you might have seen that we discussed a better schema API in past
> as
> > >>>>> part of FLIP-129 and FLIP-136. We also discussed this topic during
> > >>>>> different releases:
> > >>>>>
> > >>>>> https://issues.apache.org/jira/browse/FLINK-17793
> > >>>>>
> > >>>>> Jark and I had an offline discussion how we can finally fix this
> > >>>>> shortcoming and maintain backwards compatibile for a couple of
> > >>>>> releases to give people time to update their code.
> > >>>>>
> > >>>>> I would like to propose the following FLIP:
> > >>>>>
> > >>>>>
> > >>>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-164%3A+Improve+Schema+Handling+in+Catalogs
> > >>>>>
> > >>>>>
> > >>>>> The FLIP updates the class hierarchy to achieve the following
> goals:
> > >>>>>
> > >>>>> - make it visible whether a schema is resolved or unresolved and
> when
> > >>>>> the resolution happens
> > >>>>> - offer a unified API for FLIP-129, FLIP-136, and catalogs
> > >>>>> - allow arbitrary data types and expressions in the schema for
> > >>>>> watermark spec or columns
> > >>>>> - have access to other catalogs for declaring a data type or
> > >>>>> expression via CatalogManager
> > >>>>> - a cleaned up TableSchema
> > >>>>> - remain backwards compatible in the persisted properties and API
> > >>>>>
> > >>>>> Looking forward to your feedback.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Timo
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >
> >
> >
>
> --
> Best regards!
> Rui Li
>

Reply via email to