Hi Konstantin,

Thanks for bringing this discussion. I think temporal join is a very
important feature and should be exposed to pure SQL users.
And I already received many requirements like this.
However, my concern is that how to properly support this feature in SQL.
Introducing a DDL syntax for Temporal Table Function is one way, but maybe
not the best one.

The most important reason is that the underlying of temporal table function
is exactly a changelog stream.
The temporal join is actually temporal joining a fact stream with the
changelog stream on processing time or event time.
We will soon support to create a changelog source using DDL once FLIP-95
and FLIP-105 is finished.
At that time, we can have a simple DDL to create changelog source like this;

CREATE TABLE rate_changelog (
  currency STRING,
  rate DECIMAL
) WITH (
  'connector' = 'kafka',
  'topic' = 'rate_binlog',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'debezium-json'
);

In the meanwhile, we already have a SQL standard temporal join syntax [1],
i.e. the "A JOIN B FOR SYSTEM_TIME AS OF ..".
It is currently used as dimension table lookup join, but the semantic is
the same to the "temporal table function join"[2].
I'm in favor of "FOR SYSTEM_TIME AS OF" because it is more nature
becuase the definition of B is a *table* not a *table function*,
and the syntax is included in SQL standard.

So once we have the ability to define "rate_changelog" table, then we can
use the following query to temporal join the changelog on processing time.

SELECT *
FROM orders JOIN rate_changelog FOR SYSTEM_TIME AS OF orders.proctime
ON orders.currency = rate_changelog.currency;

In a nutshell, once FLIP-95 and FLIP-105 is ready, we can easily to support
"temporal join on changelogs" without introducing new syntax.
IMO, introducing a DDL syntax for Temporal Table Function looks like not an
easy way and may have repetitive work.

Best,
Jark

[1]:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/joins.html#join-with-a-temporal-table
[2]:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/joins.html#join-with-a-temporal-table-function





On Thu, 16 Apr 2020 at 23:04, Benchao Li <libenc...@gmail.com> wrote:

> Hi Konstantin,
>
> Thanks for bringing up this discussion. +1 for the idea.
> We have met this in our company too, and I planned to support it recently
> in our internal branch.
>
> regarding to your questions,
> 1) I think it might be more a table/view than function, just like Temporal
> Table (which is also known as
> dimension table). Maybe we need a DDL like CREATE VIEW and plus some
> additional settings.
> 2) If we design the DDL for it like view, then maybe temporary is ok
> enough.
>
> Konstantin Knauf <kna...@apache.org> 于2020年4月16日周四 下午8:16写道:
>
> > Hi everyone,
> >
> > it would be very useful if temporal tables could be created  via DDL.
> > Currently, users either need to do this in the Table API or in the
> > environment file of the Flink CLI, which both require the user to switch
> > the context of the SQL CLI/Editor. I recently created a ticket for this
> > request [1].
> >
> > I see two main questions:
> >
> > 1) What would be the DDL syntax? A Temporal Table is on the one hand a
> view
> > and on the other a function depending on how you look at it.
> >
> > 2) Would this temporal table view/function be stored in the catalog or
> only
> > be temporary?
> >
> > I personally do not have much experience in this area of Flink, so I am
> > looking forward to hearing your thoughts on this.
> >
> > Best,
> >
> > Konstantin
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-16824
> >
> > --
> >
> > Konstantin Knauf
> >
>
>
> --
>
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking University
> Tel:+86-15650713730
> Email: libenc...@gmail.com; libenc...@pku.edu.cn
>

Reply via email to