IIUC, it's good to see that both serializable tables (described through DDL)
and non-serializable tables (backed by a DataStream) are treated uniformly
as CatalogTable.

Can I also assume that functions, whether they come from a function class
(via DDL) or from a function object (instantiated by the user), will also be
treated uniformly as CatalogFunction?

This will greatly simplify and unify the current API-level concepts and design.

It seems only one thing is left: how do we deal with ConnectTableDescriptor?
It's actually very similar to a serializable CatalogTable: both carry text
properties, which are even the same ones. Is there any chance we can further
unify this into CatalogTable?
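
To make the question concrete, here is a minimal sketch of the idea. Every name in it (TableMetaSketch, DescriptorSketch, fromDdlProperties, option, toTableMeta) is a hypothetical stand-in, not an existing Flink class or method, and the property keys are only illustrative. It shows a table declared through DDL-style properties and one declared through a descriptor-style builder ending up with the same property map, which is why a single metadata type might be able to hold both.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a serializable CatalogTable: nothing but string properties. */
final class TableMetaSketch {
    private final Map<String, String> properties;

    private TableMetaSketch(Map<String, String> properties) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.properties = Collections.unmodifiableMap(new LinkedHashMap<>(properties));
    }

    /** Route 1: properties as they might arrive from a DDL statement. */
    static TableMetaSketch fromDdlProperties(Map<String, String> ddlProperties) {
        return new TableMetaSketch(ddlProperties);
    }

    /** Route 2: a descriptor-style builder that collects the same kind of properties. */
    static DescriptorSketch descriptor(String connectorType) {
        return new DescriptorSketch(connectorType);
    }

    Map<String, String> getProperties() {
        return properties;
    }

    /** Hypothetical stand-in for a connect() descriptor. */
    static final class DescriptorSketch {
        private final Map<String, String> collected = new LinkedHashMap<>();

        private DescriptorSketch(String connectorType) {
            collected.put("connector.type", connectorType);
        }

        DescriptorSketch option(String key, String value) {
            collected.put(key, value);
            return this;
        }

        TableMetaSketch toTableMeta() {
            return new TableMetaSketch(collected);
        }
    }

    public static void main(String[] args) {
        Map<String, String> ddl = new LinkedHashMap<>();
        ddl.put("connector.type", "kafka");
        ddl.put("connector.topic", "orders");

        TableMetaSketch fromDdl = fromDdlProperties(ddl);
        TableMetaSketch fromDescriptor =
                descriptor("kafka").option("connector.topic", "orders").toTableMeta();

        // Both routes yield equal metadata, so one type could hold either.
        System.out.println(fromDdl.getProperties().equals(fromDescriptor.getProperties()));
    }
}
```

If the two routes really do reduce to the same properties, the descriptor would only be a convenient builder in front of the catalog type rather than a separate concept.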

Best,
Kurt


On Thu, Sep 19, 2019 at 3:13 PM Jark Wu <imj...@gmail.com> wrote:

> Thanks Dawid for the design doc.
>
> In general, I’m +1 to the FLIP.
>
>
> +1 to expressing the object path as a single string that gets parsed.
>
> +1 to deprecating registerTableSink & registerTableSource.
> But I would suggest providing an easy way to register a custom
> source/sink before we drop them (this is another story).
> Currently, it’s not easy to implement a custom connector descriptor.
>
> Best,
> Jark
>
>
> > On Sep 19, 2019, at 11:37, Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote:
> >
> > Hi JingsongLee,
> > From my understanding they can. Underneath they will be CatalogTables.
> > The difference is the lifetime of the tables. Plus, some of the
> > user-facing interfaces cannot be persisted, e.g. DataStream. Therefore we
> > must have separate methods for that. In the end the temporary tables are
> > held in memory as CatalogTables.
> > Best,
> > Dawid
> >
> > On Thu, 19 Sep 2019, 10:08 JingsongLee, <lzljs3620...@aliyun.com.invalid>
> > wrote:
> >
> >> Hi Dawid,
> >> Can temporary tables offer the same capabilities as catalog tables?
> >> For example, statistics: CatalogTableStatistics, CatalogColumnStatistics,
> >> PartitionStatistics.
> >> Or partition support: we have added some catalog-equivalent interfaces
> >> to TableSource/TableSink: getPartitions, getPartitionFieldNames.
> >> Maybe it's not a good idea to add these interfaces to
> >> TableSource/TableSink. What do you think?
> >>
> >> Best,
> >> Jingsong Lee
> >>
> >>
> >> ------------------------------------------------------------------
> >> From: Kurt Young <ykt...@gmail.com>
> >> Send Time: Wednesday, September 18, 2019, 17:54
> >> To: dev <dev@flink.apache.org>
> >> Subject: Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
> >> module
> >>
> >> Hi all,
> >>
> >> Sorry to join this party late. Big +1 to this FLIP, especially for
> >> dropping registerTableSink & registerTableSource. These are indeed
> >> legacy, and we should try to unify them through CatalogTable after we
> >> introduced the concept of Catalog.
> >>
> >> From my understanding, everything we can register should be metadata.
> >> TableSource/TableSink should only be responsible for doing the real
> >> work, i.e. reading and writing data according to the schema and other
> >> information such as computed columns, partitions, etc.
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Wed, Sep 18, 2019 at 5:14 PM JingsongLee
> >> <lzljs3620...@aliyun.com.invalid> wrote:
> >>
> >>> After some development and thinking, I have a general understanding.
> >>> +1 to the point that registering a source/sink does not fit into the
> >>> SQL world.
> >>> I am OK with having a deprecated registerTemporarySource/Sink for
> >>> compatibility with the old ways.
> >>>
> >>> Best,
> >>> Jingsong Lee
> >>>
> >>>
> >>> ------------------------------------------------------------------
> >>> From: Timo Walther <twal...@apache.org>
> >>> Send Time: Tuesday, September 17, 2019, 08:00
> >>> To: dev <dev@flink.apache.org>
> >>> Subject: Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
> >>> module
> >>>
> >>> Hi Dawid,
> >>>
> >>> thanks for the design document. It closes big conceptual gaps that
> >>> exist for historical reasons, with proper serializability and catalog
> >>> support in mind.
> >>>
> >>> I would not mind a registerTemporarySource/Sink, but the problem that I
> >>> see is that many people think that this is the recommended way of
> >>> registering a table source/sink, which is not true. We should guide
> >>> users to either use connect() or the DDL API, which can be validated
> >>> and stored in a catalog.
> >>>
> >>> Also from a concept perspective, registering a source/sink does not fit
> >>> into the SQL world. SQL does not know about source/sinks but only about
> >>> tables. If the responsibility of a TableSource/TableSink is just a pure
> >>> physical data consumer/producer that is not connected to the actual
> >>> logical table schema, we would need a possibility of defining time
> >>> attributes and interpreting/converting a changelog. This should be done
> >>> by the framework with information from the DDL/connect() and not be
> >>> defined in every table source.
> >>>
> >>> Regards,
> >>> Timo
> >>>
> >>>
> >>> On 09.09.19 14:16, JingsongLee wrote:
> >>>> Hi Dawid,
> >>>>
> >>>> It is difficult to describe specific examples.
> >>>> Sometimes users generate Java converters through Java code, or
> >>>> generate Java classes through third-party libraries. Of course, these
> >>>> are best expressed through properties, but that requires additional
> >>>> work from users. My suggestion is to keep the user-friendly way of
> >>>> passing a Java instance.
> >>>>
> >>>> Best,
> >>>> Jingsong Lee
> >>>>
> >>>>
> >>>> ------------------------------------------------------------------
> >>>> From: Dawid Wysakowicz <dwysakow...@apache.org>
> >>>> Send Time: Friday, September 6, 2019, 16:21
> >>>> To: dev <dev@flink.apache.org>
> >>>> Subject: Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
> >>>> module
> >>>>
> >>>> Hi all,
> >>>> @Jingsong Could you elaborate a bit more on what you mean by
> >>>> "some Connectors are difficult to convert all states to properties"?
> >>>> All the Flink-provided connectors will definitely be expressible with
> >>>> properties (in the end you should be able to use them from DDL). I
> >>>> think if a TableSource is complex enough that it handles filter push
> >>>> down, partition support etc., it should rather be made available both
> >>>> from DDL & Java/Scala code. I'm happy to reconsider adding
> >>>> registerTemporaryTable(String path, TableSource source) if you have
> >>>> some concrete examples in mind.
> >>>>
> >>>>
> >>>> @Xuefu: We also considered the ObjectIdentifier (or actually
> >>>> introducing a new identifier representation to differentiate between
> >>>> resolved and unresolved identifiers) with the same concerns. We decided
> >>>> to suggest the string & parsing logic because of usability.
> >>>>     tEnv.from("cat.db.table")
> >>>> is shorter and easier to write than
> >>>>     tEnv.from(Identifier.for("cat", "db", "name"))
> >>>> It also implicitly solves the problem of what happens if a user (e.g.
> >>>> one used to other systems) uses that API in the following manner:
> >>>>     tEnv.from(Identifier.for("db.name"))
> >>>> I'm happy to revisit it if the general consensus is that it's better
> >>>> to use the OO approach.
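
The string-and-parse idea discussed here could be sketched roughly as follows. This is only an illustration under stated assumptions, not the parsing logic proposed in the FLIP: the class ObjectPathSketch and its methods split and resolve are hypothetical, the backtick quoting rule is an assumption, and the current catalog and database are passed in explicitly instead of being read from a TableEnvironment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Flink's API. */
final class ObjectPathSketch {

    /** Splits a dotted path into parts; backticks (an assumed rule) quote a part containing dots. */
    static List<String> split(String path) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (char c : path.toCharArray()) {
            if (c == '`') {
                quoted = !quoted;                 // toggle quoting, drop the backtick itself
            } else if (c == '.' && !quoted) {
                parts.add(current.toString());    // an unquoted dot ends the current part
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    /** Expands a 1-, 2-, or 3-part path to [catalog, database, object]. */
    static List<String> resolve(String path, String currentCatalog, String currentDatabase) {
        List<String> parts = split(path);
        switch (parts.size()) {
            case 1:
                return Arrays.asList(currentCatalog, currentDatabase, parts.get(0));
            case 2:
                return Arrays.asList(currentCatalog, parts.get(0), parts.get(1));
            case 3:
                return parts;
            default:
                throw new IllegalArgumentException("Too many parts in path: " + path);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("cat.db.table", "default_catalog", "default_database"));
        System.out.println(resolve("db.name", "default_catalog", "default_database"));
    }
}
```

With a single string, a two-part input such as "db.name" is resolved against the current catalog instead of being mistaken for one object name, which is the ambiguity the identifier-based call above runs into.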
> >>>> Best,
> >>>> Dawid
> >>>>
> >>>> On 06/09/2019 10:00, Xuefu Z wrote:
> >>>>
> >>>> Thanks to Dawid for starting the discussion and writeup. It looks
> >>>> pretty good to me, except that I'm a little concerned about the object
> >>>> reference and string parsing in the code, which seems to be an
> >>>> anti-pattern in OOP. Have we considered using ObjectIdentifier with
> >>>> optional catalog and db parts, esp. if we are worried about arguments
> >>>> of variable length or method overloading? It's quite likely that the
> >>>> result of string parsing is an ObjectIdentifier instance anyway.
> >>>>
> >>>> Having string parsing logic in the code is a little dangerous as it
> >>>> duplicates part of the DDL/DML parsing, and they can easily get out of
> >>>> sync.
> >>>>
> >>>> Thanks,
> >>>> Xuefu
> >>>>
> >>>> On Fri, Sep 6, 2019 at 1:57 PM JingsongLee
> >>>> <lzljs3620...@aliyun.com.invalid> wrote:
> >>>>
> >>>>
> >>>> Thanks Dawid, +1 for this approach.
> >>>>
> >>>> One concern is the removal of registerTableSink & registerTableSource
> >>>> in TableEnvironment. There are two alternatives:
> >>>> 1. the properties approach (DDL, descriptor).
> >>>> 2. from/toDataStream.
> >>>>
> >>>> #1 can only be properties, not Java state, and some Connectors
> >>>> are difficult to convert all states to properties.
> >>>> #2 can contain Java state, but can't use TableSource-related features
> >>>> like project & filter push down, partition support, etc.
> >>>>
> >>>> Any idea about this?
> >>>>
> >>>> Best,
> >>>> Jingsong Lee
> >>>>
> >>>>
> >>>> ------------------------------------------------------------------
> >>>> From: Dawid Wysakowicz <dwysakow...@apache.org>
> >>>> Send Time: Wednesday, September 4, 2019, 22:20
> >>>> To: dev <dev@flink.apache.org>
> >>>> Subject: [DISCUSS] FLIP-64: Support for Temporary Objects in Table
> >>>> module
> >>>>
> >>>> Hi all,
> >>>> As part of FLIP-30 a Catalog API was introduced that enables storing
> >>>> table meta objects permanently. At the same time the majority of
> >>>> current APIs create temporary objects that cannot be serialized. We
> >>>> should clarify the creation of meta objects (tables, views, functions)
> >>>> in a unified way.
> >>>> Another current problem in the API is that all the temporary objects
> >>>> are stored in a special built-in catalog, which is not very intuitive
> >>>> for many users, as they must be aware of that catalog to reference
> >>>> temporary objects.
> >>>> Lastly, different APIs have different ways of providing object paths:
> >>>>
> >>>> String path…,
> >>>> String path, String pathContinued…
> >>>> String name
> >>>> We should choose one approach and unify it across all APIs.
> >>>> I suggest a FLIP to address the above issues.
> >>>> Looking forward to your opinions.
> >>>> FLIP link:
> >>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module
> >>>>
> >>
> >>
>
>
