Hi Haohui,

Thanks for your input!

Can you describe the semantics of the join you'd like to see in Flink 1.4?
I can think of three types of joins that match your description:
1) `table` is an external table stored in an external database (Redis,
Cassandra, MySQL, etc.) and we join with the value that is in that table
at the time a record is processed. This could be implemented with an async
TableFunction (based on Flink's AsyncFunction).
2) `table` is static: In this case we need support for side-inputs to read
the whole table before starting to process the other (streaming) side.
There is a FLIP [1] for side inputs. I don't know what the status of this
feature is, though.
3) `table` is changing and each record of the stream should be joined with
the most recent update (but no future updates). In this case, the query is
more complex to express and requires some time-bound logic, which is quite
cumbersome to express in SQL. I think this is a very important type of
join, but IMO it is more challenging to implement than the other joins. We
also had a discussion about this type of join on the dev ML a few months
back [2].
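
To make the difference between the three cases concrete, here is a minimal
sketch in plain Python. It is not Flink code and does not reflect any
existing or planned API. The changelog, the records, and all function names
are made up for illustration, and case 1 is simplified by assuming that all
records are processed at the same wall-clock time.

```python
# Hypothetical changelog of `table`: (update_time, key, value), sorted by time.
UPDATES = [(1, "EUR", 1.10), (5, "EUR", 1.20), (9, "EUR", 1.30)]

# Hypothetical stream records: (event_time, key).
RECORDS = [(3, "EUR"), (6, "EUR")]


def table_as_of(time):
    """Replay all updates with update_time <= time and return the table state."""
    state = {}
    for update_time, key, value in UPDATES:
        if update_time <= time:
            state[key] = value
    return state


def join_external(records, processed_at):
    """Case 1: each record sees whatever the table holds when it is processed.

    Simplifying assumption: every record is processed at `processed_at`.
    """
    return [(t, k, table_as_of(processed_at).get(k)) for t, k in records]


def join_static(records, loaded_at):
    """Case 2: the table is read once, in full, before any record is processed."""
    snapshot = table_as_of(loaded_at)
    return [(t, k, snapshot.get(k)) for t, k in records]


def join_most_recent(records):
    """Case 3: each record sees the latest update at or before its own timestamp."""
    return [(t, k, table_as_of(t).get(k)) for t, k in records]


external = join_external(RECORDS, processed_at=10)
static = join_static(RECORDS, loaded_at=1)
most_recent = join_most_recent(RECORDS)
```

With this toy data, the same two records are joined with 1.30 and 1.30 in
case 1, with 1.10 and 1.10 in case 2, and with 1.10 and 1.20 in case 3. Only
case 3 gives a result that depends on the record's own timestamp, which is
where the time-bound logic comes from.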

Which type of join are you looking for (external table, static table,
dynamic table)?

Regarding the committer bottleneck, the situation should improve a bit in
the near future, as we have two more committers working on the relational
APIs (Jark is spending more time here and Shaoxuan recently became a
committer). However, we will of course continue to encourage and help
contributors to earn the merit to become committers, and so grow the
number of committers.

Thank you very much,
Fabian

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-17+Side+Inputs+for+DataStream+API
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/STREAM-SQL-inner-queries-tp15585.html
