Hi Dawid,
thanks for the design document. It closes significant conceptual gaps
that exist for historical reasons, and it does so with serializability
and catalog support in mind.
I would not mind a registerTemporarySource/Sink, but the problem that I
see is that many people think that this is the recommended way of
registering a table source/sink, which is not true. We should guide users
to use either connect() or the DDL API, both of which can be validated
and stored in a catalog.
Also, from a conceptual perspective, registering a source/sink does not
fit into the SQL world. SQL does not know about sources/sinks, only about
tables. If a TableSource/TableSink is responsible only for physically
consuming/producing data and is not connected to the actual logical
table schema, we would need a way to define time attributes and to
interpret/convert a changelog. This should be done by the framework with
information from the DDL/connect() and not be defined in every table
source.
Regards,
Timo
On 09.09.19 14:16, JingsongLee wrote:
Hi Dawid,
It is difficult to describe specific examples.
Sometimes users generate Java converters through Java code, or
generate Java classes through third-party libraries. Of course,
these are ideally expressed through properties, but this requires
additional work from users. My suggestion is to keep the
user-friendly option of passing a Java instance.
Best,
Jingsong Lee
------------------------------------------------------------------
From:Dawid Wysakowicz <dwysakow...@apache.org>
Send Time:2019-09-06 (Friday) 16:21
To:dev <dev@flink.apache.org>
Subject:Re: [DISCUSS] FLIP-64: Support for Temporary Objects in Table module
Hi all,
@Jingsong Could you elaborate a bit more on what you mean by
"some Connectors are difficult to convert all states to properties"?
All the Flink-provided connectors will definitely be expressible with properties
(in the end you should be able to use them from DDL). I think that if a
TableSource is complex enough that it handles filter push-down, partition
support etc., it should rather be made available from both DDL & Java/Scala
code. I'm happy to reconsider adding
registerTemporaryTable(String path, TableSource source) if you have some
concrete examples in mind.
@Xuefu: We also considered the ObjectIdentifier (or actually introducing a new
identifier representation to differentiate between resolved and unresolved
identifiers) with the same concerns. We decided to suggest the string & parsing
logic because of usability.
tEnv.from("cat.db.table")
is shorter and easier to write than
tEnv.from(Identifier.for("cat", "db", "name"))
It also implicitly solves the problem of what happens if a user (e.g. one used
to other systems) uses that API in the following manner:
tEnv.from(Identifier.for("db.name"))
I'm happy to revisit it if the general consensus is that it's better to use the
OO approach.
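To make the string variant concrete, here is a minimal sketch of the kind of
parsing logic it implies. This is not Flink's implementation: PathSketch,
ResolvedId and resolve are hypothetical names, and the sketch assumes that
names contain no dots or escape characters.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not Flink's actual parser or identifier class. */
final class PathSketch {

    /** Fully resolved three-part identifier (illustrative stand-in only). */
    static final class ResolvedId {
        final String catalog;
        final String database;
        final String object;

        ResolvedId(String catalog, String database, String object) {
            this.catalog = catalog;
            this.database = database;
            this.object = object;
        }

        @Override
        public String toString() {
            return catalog + "." + database + "." + object;
        }
    }

    /**
     * Splits a dotted path and fills missing leading parts from the
     * current catalog/database. Assumes names contain no dots or escapes.
     */
    static ResolvedId resolve(String path, String currentCatalog, String currentDatabase) {
        // limit -1 keeps trailing empty parts so that "a.b." is rejected below
        List<String> parts = Arrays.asList(path.split("\\.", -1));
        if (parts.size() > 3 || parts.contains("")) {
            throw new IllegalArgumentException("Invalid object path: " + path);
        }
        switch (parts.size()) {
            case 1:
                return new ResolvedId(currentCatalog, currentDatabase, parts.get(0));
            case 2:
                return new ResolvedId(currentCatalog, parts.get(0), parts.get(1));
            default:
                return new ResolvedId(parts.get(0), parts.get(1), parts.get(2));
        }
    }

    public static void main(String[] args) {
        // The catalog and database names here are placeholders.
        System.out.println(resolve("cat.db.table", "default_catalog", "default_database"));
        System.out.println(resolve("db.name", "default_catalog", "default_database"));
    }
}
```

With a single string, "db.name" is always split the same way, so the
ambiguity from the Identifier.for("db.name") call cannot arise.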
Best,
Dawid
On 06/09/2019 10:00, Xuefu Z wrote:
Thanks to Dawid for starting the discussion and the writeup. It looks pretty
good to me, except that I'm a little concerned about the object reference
and string parsing in the code, which seems to be an anti-pattern in OOP.
Have we considered using ObjectIdentifier with optional catalog and db parts,
esp. if we are worried about arguments of variable length or method
overloading? It's quite likely that the result of string parsing is an
ObjectIdentifier instance anyway.
Having string parsing logic in the code is a little dangerous as it
duplicates part of the DDL/DML parsing, and they can easily get out of sync.
Thanks,
Xuefu
On Fri, Sep 6, 2019 at 1:57 PM JingsongLee <lzljs3620...@aliyun.com.invalid>
wrote:
Thanks Dawid, +1 for this approach.
One concern is the removal of registerTableSink & registerTableSource
in TableEnvironment. There are two alternatives:
1. The properties approach (DDL, descriptor).
2. from/toDataStream.
#1 can only carry properties, not Java state, and for some connectors
it is difficult to convert all state to properties.
#2 can contain Java state, but cannot use TableSource-related features
such as projection & filter push-down, partition support, etc.
Any idea about this?
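As a rough illustration of the trade-off in #1, here is a minimal sketch that
is independent of any Flink API. StateVsPropertiesSketch, OffsetConverter and
the "converter.offset" key are made up for this example. It shows that an
inline converter needs no extra code, while the properties route needs an
explicit string round trip written by the user.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch; none of these names exist in Flink. */
final class StateVsPropertiesSketch {

    /** A converter whose whole state can be written to string properties. */
    static final class OffsetConverter implements Function<String, Integer> {
        private final int offset;

        OffsetConverter(int offset) {
            this.offset = offset;
        }

        @Override
        public Integer apply(String value) {
            return Integer.parseInt(value) + offset;
        }

        /** Extra work #1: serialize the state to string properties. */
        Map<String, String> toProperties() {
            Map<String, String> props = new HashMap<>();
            props.put("converter.offset", String.valueOf(offset));
            return props;
        }

        /** Extra work #2: rebuild the instance from string properties. */
        static OffsetConverter fromProperties(Map<String, String> props) {
            return new OffsetConverter(Integer.parseInt(props.get("converter.offset")));
        }
    }

    public static void main(String[] args) {
        int offset = 10;

        // Instance approach: the Java state lives in the captured variable.
        Function<String, Integer> inline = s -> Integer.parseInt(s) + offset;

        // Properties approach: the same state goes through a string round trip.
        Function<String, Integer> restored =
                OffsetConverter.fromProperties(new OffsetConverter(offset).toProperties());

        System.out.println(inline.apply("5"));
        System.out.println(restored.apply("5"));
    }
}
```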
Best,
Jingsong Lee
------------------------------------------------------------------
From:Dawid Wysakowicz <dwysakow...@apache.org>
Send Time:2019-09-04 (Wednesday) 22:20
To:dev <dev@flink.apache.org>
Subject:[DISCUSS] FLIP-64: Support for Temporary Objects in Table module
Hi all,
As part of FLIP-30 a Catalog API was introduced that enables storing table
meta objects permanently. At the same time the majority of current APIs
create temporary objects that cannot be serialized. We should clarify the
creation of meta objects (tables, views, functions) in a unified way.
Another current problem in the API is that all the temporary objects are
stored in a special built-in catalog, which is not very intuitive for many
users, as they must be aware of that catalog to reference temporary objects.
Lastly, different APIs have different ways of providing object paths:
String path…,
String path, String pathContinued…
String name
We should choose one approach and unify it across all APIs.
I suggest a FLIP to address the above issues.
Looking forward to your opinions.
FLIP link:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-64%3A+Support+for+Temporary+Objects+in+Table+module