Hi Dawid,

Thanks for the great feedback! Here are my responses:
1) computedColumn(..) vs column(..)

I'm fine to use `column(..)` in both cases.

2) Expression DSL vs pure SQL string for computed columns

This is a good point. Actually, I also prefer the Expression DSL because it
is more Table-API style. However, that requires modifying TableSchema again
to accept & expose Expressions as computed columns. I'm not convinced about
this, because AFAIK we want a CatalogTableSchema to hold this information
and don't want to extend TableSchema. Maybe Timo can share some thoughts
here. Besides, it would mean the descriptor API cannot be persisted in a
Catalog until FLIP-80 is done. (Both styles are sketched in the P.S. at the
end of this mail.)

3) Schema#proctime and Schema#watermarkFor#boundedOutOfOrderTimestamps

The original intention behind these APIs is to provide shortcut APIs for
Table API users. But I'm also fine with providing only the DDL-like methods
if you have concerns. We can discuss shortcuts in the future if users
request them. (A comparison of the two styles is sketched in the P.S.
below.)

4) LikeOption

LikeOption.INCLUDING.ALL is a constant (an enum value). I have added more
description about this to the FLIP. (See the sketch in the P.S. below.)

5) implementation?

I didn't want to go into too many implementation details in the FLIP at the
beginning, because the API section is already very long. But I have added
an "Implementation" section to explain them.

6) static method vs new keyword

Personally, I prefer the new keyword because it makes the API cleaner. If
we want to remove the new keyword and use static methods, we have to
either:
- add a `Schema.builder()`/`create()` method as the starting method, or
- duplicate all the methods as static methods; e.g. we have 12 methods in
  `Kafka`, and any of them can be a starting method, so we would end up
  with 24 methods in `Kafka`.

Neither option is good, and it's hard to keep all the descriptors using the
same starting method name, whereas all the descriptors can start from the
same new keyword. (This is illustrated in the P.S. below.)

Best,
Jark

On Thu, 9 Jul 2020 at 15:48, Dawid Wysakowicz <[email protected]> wrote:

> Correction to my point 4. The example is correct. I did not read it
> carefully enough. Sorry for the confusion. Nevertheless, I'd still like
> to see a bit more explanation on the LikeOptions.
>
> On 07/07/2020 04:32, Jark Wu wrote:
> > Hi everyone,
> >
> > Leonard and I prepared a FLIP about refactoring the current Descriptor
> > API, i.e. TableEnvironment#connect(). We would like to propose a new
> > descriptor API to register connectors in the Table API.
> >
> > Since Flink 1.9, the community has focused more on the new SQL DDL
> > feature. After a series of releases, the SQL DDL is powerful and has
> > many rich features now. However, the Descriptor API (the
> > `TableEnvironment#connect()`) has been stagnant for a long time and is
> > missing lots of core features, such as computed columns and primary
> > keys. That's frustrating for Table API users who want to register
> > tables programmatically. Besides, currently, a connector must
> > implement a corresponding Descriptor (e.g. `new Kafka()`) before the
> > "connect" API can be used. Therefore, we hope to reduce this effort
> > for connector developers, so that custom sources/sinks can be
> > registered via the descriptor API without implementing a Descriptor.
> >
> > These are the problems we want to resolve in this FLIP. I'm looking
> > forward to your comments.
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-129%3A+Refactor+Descriptor+API+to+register+connector+for+Table+API
> >
> > Best,
> > Jark
> >
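P.S. To make points 1 and 2 more concrete, here is a rough sketch of how
`column(..)` could cover both physical and computed columns. The Expression
DSL variant is hypothetical, since TableSchema cannot hold Expressions
today:

    new Schema()
        // physical columns
        .column("price", DataTypes.DOUBLE())
        .column("quantity", DataTypes.INT())
        // computed column defined by a SQL expression string (current proposal)
        .column("cost", "price * quantity");

    // hypothetical Expression DSL alternative for the computed column
    // (with import static org.apache.flink.table.api.Expressions.$;):
    //     .column("cost", $("price").times($("quantity")))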
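For point 3, the shortcut style and the DDL-like style compare roughly as
follows (sketch only; method names follow the current FLIP draft and are
not final):

    // shortcut style
    new Schema()
        .column("ts", DataTypes.TIMESTAMP(3))
        .watermarkFor("ts").boundedOutOfOrderTimestamps(Duration.ofSeconds(5))
        .proctime("proc_time");

    // DDL-like style, equivalent to
    //   WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    new Schema()
        .column("ts", DataTypes.TIMESTAMP(3))
        .watermark("ts", "ts - INTERVAL '5' SECOND");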
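For point 4, this is roughly how the LikeOptions can be modeled as plain
enum constants (the exact nesting below is illustrative; the constant names
are described in the FLIP):

    public interface LikeOption {
        enum INCLUDING implements LikeOption {
            ALL, CONSTRAINTS, GENERATED, OPTIONS, PARTITIONS, WATERMARKS
        }
        enum EXCLUDING implements LikeOption {
            ALL, CONSTRAINTS, GENERATED, OPTIONS, PARTITIONS, WATERMARKS
        }
        enum OVERWRITING implements LikeOption {
            GENERATED, OPTIONS
        }
    }

    // LikeOption.INCLUDING.ALL is then just a constant that can be passed
    // around, e.g. to a hypothetical like(..) method:
    //     descriptor.like("base_table", LikeOption.INCLUDING.ALL)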
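And for point 6, the trade-off between the two starting styles (the `Kafka`
methods are only for illustration):

    // with the new keyword, every descriptor starts the same way:
    new Kafka().version("universal").topic("orders");
    new Schema().column("id", DataTypes.BIGINT());

    // with static methods, we would need either a dedicated entry point ...
    Kafka.create().version("universal").topic("orders");

    // ... or a static duplicate of every starting method, which doubles
    // the 12 methods of Kafka to 24:
    Kafka.version("universal").topic("orders");
    Kafka.topic("orders").version("universal");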
