Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-28 Thread Leonard Xu
Hi all, I have opened a new discussion for FLIP-132 [1], which is based on the consensus reached in the current thread. Let's continue the conversation in the new thread; please let me know if you have any concerns. Best, Leonard [1]

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-06 Thread Seth Wiesman
* I mistyped the rejected_query; it should be: CREATE VIEW AS post_agg_stream SELECT currencyId, AVG(rate) AS rate FROM currency_rates CREATE VIEW AS rejected_query SELECT ... FROM transactions AS t JOIN post_agg_stream FOR SYSTEM_TIME AS OF t.transactionTime AS r ON r.currency =

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-06 Thread Seth Wiesman
Hey Leonard, Agreed, this is a fun discussion! > (1) For support changelog source backed CDC tools, a problem is that can we > use the temporal table as a general source table which may be followed by some > aggregation operations, more accurate is whether the aggregation operator > can use the

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-06 Thread Benchao Li
Hi everyone, Thanks a lot for the great discussions so far. After reading through the long discussion, I still have one question. Currently the temporal table function supports both event-time and processing-time joins. If we use "FOR SYSTEM_TIME AS OF" syntax without "TEMPORAL" keyword in DDL, does

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-06 Thread Leonard Xu
Hi Seth, Thanks for your explanation of the use cases, and you’re right: the lookup join/table is one kind of temporal table join/table, which tracks the latest snapshot of external DB-like tables; that's why we proposed using the same temporal join syntax. In fact, I have investigated and checked Debezium

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-06 Thread Seth Wiesman
As an aside, I conceptually view temporal table joins to be semantically equivalent to look up table joins. They are just two different ways of consuming the same data. Seth On Mon, Jul 6, 2020 at 8:56 AM Seth Wiesman wrote: > Hi Leonard, > > Regarding DELETE operations I tend to have the

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-06 Thread Seth Wiesman
Hi Leonard, Regarding DELETE operations I tend to have the opposite reaction. I spend a lot of time working with production Flink users across a large number of organizations and to say we don't support temporal tables that include DELETEs will be a blocker for adoption. Even organizations that

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-03 Thread Leonard Xu
Hi Konstantin, > Would we support a temporal join with a changelog stream with > event time semantics by ignoring DELETE messages or would it be completely > unsupported. I don’t know what share of temporal scenarios this feature accounts for. Compared to supporting the approximate event time join by

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-03 Thread Konstantin Knauf
Hi Leonard, Thank you for the summary. I don't fully understand the implications of (3). Would we support a temporal join with a changelog stream with event time semantics by ignoring DELETE messages, or would it be completely unsupported? I mean something like the following sequence of statements:

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-03 Thread Leonard Xu
Thanks Jingsong, Jark, Knauf, Seth for sharing your thoughts. Although we discussed many details about the concept, I think it’s worth clarifying the semantics in terms of long-term goals. The temporal table concept was first introduced in SQL:2011; I investigated the Temporal Table working mechanism

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-02 Thread Seth Wiesman
It is clear there are a lot of edge cases with temporal tables that need to be carefully thought out. If we go at this problem from the perspective of what a majority of users need to accomplish in production, I believe there is a simpler version of this problem we can solve that can be expanded

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-07-02 Thread Konstantin Knauf
Hi everyone, well, this got complicated :) Let me add my thoughts: * Temporal Table Joins are already quite hard to understand for many users. If need be, we should trade off for simplicity. * The important case is the *event time* temporal join. In my understanding processing time temporal

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-28 Thread Jingsong Li
Thanks for your discussion. Looks like the problem is supporting the versioned temporal table for the changelog source. I want to share more of my thoughts: When I think about changelog sources, I treat it as a view like: "CREATE VIEW changelog_table AS SELECT ... FROM origin_table GROUP BY

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-25 Thread Jark Wu
Hi all, Thanks Leonard for summarizing our discussion. I want to share more of my thoughts: * rowtime is a column in its schema, so the rowtime of a DELETE event is the value of the previous image. * operation time is the time when the DML statements happen in databases, so the operation time

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-24 Thread Leonard Xu
Hi Kurt, Fabian, After an offline discussion with Jark, we think that the 'PERIOD FOR SYSTEM_TIME(operation_time)' statement might be needed now. A changelog table is a superset of an insert-only table; using PRIMARY KEY and rowtime may work well for insert-only or upsert sources but has some problems in

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-23 Thread Leonard Xu
Hi everyone, Thanks Fabian and Kurt for making the multiple-version (event time) case clear. I also like the 'PERIOD FOR SYSTEM' syntax, which is supported in the SQL standard. I think we can add some explanation of multiple-version support to the future section of the FLIP. For the PRIMARY KEY semantic, I

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-23 Thread Fabian Hueske
Thanks Kurt, Yes, you are right. The `PERIOD FOR SYSTEM_TIME` that you linked before corresponds to the VERSION clause that I used and would explicitly define the versioning of a table. I didn't know that the `PERIOD FOR SYSTEM_TIME` clause is already defined by the SQL standard. I think we would

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-23 Thread Kurt Young
Hi Fabian, I agree with you that implicitly letting event time be the version of the table will work in most cases, but not in all. That's the reason I mentioned the `PERIOD FOR` [1] syntax in my first email, which is already in the SQL standard to represent the validity of each row in the table. If

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-23 Thread Fabian Hueske
Hi everyone, Every table with a primary key and an event-time attribute provides what is needed for an event-time temporal table join. I agree that, from a technical point of view, the TEMPORAL keyword is not required. I'm more sceptical about implicitly deriving the versioning information of a

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-22 Thread Jark Wu
I'm also +1 for not adding the TEMPORAL keyword. +1 to make the PRIMARY KEY semantic clear for sources. From my point of view: 1) PRIMARY KEY on changelog source: It means that when the changelogs (INSERT/UPDATE/DELETE) are materialized, the materialized table should be unique on the primary

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-22 Thread Konstantin Knauf
Hi everyone, I also agree with Leonard/Kurt's proposal for CREATE TEMPORAL TABLE. Best, Konstantin On Mon, Jun 22, 2020 at 10:53 AM Kurt Young wrote: > I agree with Timo, semantic about primary key needs more thought and > discussion, especially after FLIP-95 and FLIP-105. > > Best, > Kurt >

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-22 Thread Kurt Young
I agree with Timo, semantic about primary key needs more thought and discussion, especially after FLIP-95 and FLIP-105. Best, Kurt On Mon, Jun 22, 2020 at 4:45 PM Timo Walther wrote: > Hi Leonard, > > thanks for the summary. > > After reading all of the previous arguments and working on

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-22 Thread Timo Walther
Hi Leonard, thanks for the summary. After reading all of the previous arguments and working on FLIP-95, I would also lean towards the conclusion of not adding the TEMPORAL keyword. After FLIP-95, what we considered as a CREATE TEMPORAL TABLE can be represented as a CREATE TABLE with PRIMARY

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-06-20 Thread Leonard Xu
Hi everyone, Thanks for the nice discussion. I’d like to move the work forward, so let me briefly summarize the main opinions and current divergences. 1. Agreements reached so far: 1.1 The motivation for discussing temporal table DDL is simply to create temporal tables in pure SQL

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-05-13 Thread Kurt Young
Thanks for sharing your opinion. I can see there are some very small divergences between us in your description. I think it would be a good idea to discuss these first. Let's put aside table versions for now and only discuss whether a DDL table should be treated as a DBMS style

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-05-13 Thread Fabian Hueske
I think Flink should behave similarly to other DBMSs. Other DBMSs do not allow querying the history of a table, even though the DBMS has seen all changes of the table (as transactions or directly as a changelog if the table was replicated) and recorded them in its log. You need to declare a table as

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-05-08 Thread Kurt Young
All tables described by Flink's DDL are dynamic tables. But a dynamic table is a logical concept rather than a physical thing. Physically, a dynamic table has two different forms: one is a materialized table which changes over time (e.g. Database table, HBase table), another form is stream

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-05-08 Thread Fabian Hueske
I think we need the TEMPORAL TABLE syntax because temporal tables are conceptually more than just regular tables. In addition to being a table that always holds the latest values (and can thereby serve as input to a continuous query), the system also needs to track the history of such a table to be able to

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-05-08 Thread Kurt Young
I might have missed something, but why do we need a new "TEMPORAL TABLE" syntax? According to Fabian's first mail: > Hence, the requirements for a temporal table are: > * The temporal table has a primary key / unique attribute > * The temporal table has a time-attribute that defines the start of the >

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-05-07 Thread Jark Wu
Hi, I agree with what Fabian said above. Besides, IMO, (3) has a lower priority and will involve many more things. It makes sense to me to do it in two phases. Regarding (3), the key point in converting an append-only table into a changelog table is that the framework should know the operation type,

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-05-07 Thread Fabian Hueske
Thanks for the summary Konstantin. I think you got all points right. IMO, the way forward would be to work on a FLIP to define * the concept of temporal tables, * how to feed them from retraction tables * how to feed them from append-only tables * their specification with CREATE TEMPORAL TABLE, *

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-05-07 Thread Konstantin Knauf
Hi everyone, Thanks everyone for joining the discussion on this. Please let me summarize what I have understood so far. 1) For joining an append-only table and a temporal table, the syntax "FOR SYSTEM_TIME AS OF " seems to be preferred (Fabian, Timo, Seth). 2) To define a temporal table

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-04-18 Thread Jark Wu
Hi Fabian, Just to clarify a little bit, we decided to move the "converting append-only table into changelog table" into future work. So FLIP-105 only introduced some CDC formats (debezium) and new TableSource interfaces proposed in FLIP-95. I should have started a new FLIP for the new CDC

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-04-17 Thread Fabian Hueske
Thanks Jark! I certainly need to read up on FLIP-105 (and I'll try to adjust my terminology to changelog table from now on ;-) ) If FLIP-105 addresses the issue of converting an append-only table into a changelog table that upserts on primary key (basically what the VIEW definition in my first

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-04-17 Thread Jark Wu
Hi Fabian, I think converting an append-only table into a temporal table involves two things: (1) converting the append-only table into a changelog table (or retraction table as you said) (2) defining the converted changelog table (maybe a view for now) as temporal (or history tracked). The first thing is

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-04-17 Thread Fabian Hueske
Hi, I agree with most of what Timo said. The TEMPORAL keyword (which unfortunately might be easily confused with TEMPORARY...) looks very intuitive and I think using the only time attribute for versioning would be a good choice. However, TEMPORAL TABLE on retraction tables does not solve the full

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-04-17 Thread Seth Wiesman
I really like the TEMPORAL keyword, I find it very intuitive. > The down side of this approach would be that an additional preprocessing > step would not be possible anymore because there is no preceding view. > Yes and no. My understanding is we are not talking about making any changes to how

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-04-17 Thread Fabian Hueske
Hi all, First of all, I apologize for the text wall that's following... ;-) A temporal table join joins an append-only table and a temporal table. The question about how to represent a temporal table join boils down to two questions: 1) How to represent a temporal table 2) How to specify the

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-04-16 Thread Jark Wu
Hi Konstantin, Thanks for bringing up this discussion. I think temporal join is a very important feature and should be exposed to pure SQL users. And I have already received many requests like this. However, my concern is how to properly support this feature in SQL. Introducing a DDL syntax for

Re: [DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-04-16 Thread Benchao Li
Hi Konstantin, Thanks for bringing up this discussion. +1 for the idea. We have run into this in our company too, and I have been planning to support it in our internal branch. Regarding your questions, 1) I think it might be more a table/view than function, just like Temporal Table (which is also

[DISCUSS] [FLINK-16824] Creating Temporal Table Function via DDL

2020-04-16 Thread Konstantin Knauf
Hi everyone, it would be very useful if temporal tables could be created via DDL. Currently, users either need to do this in the Table API or in the environment file of the Flink CLI, which both require the user to switch the context of the SQL CLI/Editor. I recently created a ticket for this