Worst-case optimal join processing on Streams

Laurens VIJNCK Sat, 14 Dec 2019 06:52:45 -0800

Dear folks,

DISCLAIMER: With this mail, my sole intention is to establish contact with the 
community and trade ideas on how to realize the goal described below.


I'm a starting PhD researcher in distributed systems and databases who is 
particularly interested in worst-case optimal (multiway) join processing on 
streams. I have performed preliminary tests with a new join algorithm that 
shows rather promising results. However, the limitation is that the algorithm 
operates in a centralized fashion. My goal is to extend the capabilities of the 
algorithm to operate in a distributed environment. To showcase my results, I 
want to implement a proof-of-concept in Apache Flink. I know this is a rather 
ambitious project, which is why I am reaching out to the community.

I have read through most of the application development documentation on the 
website (e.g., [1, 2, 3, 4]), but I am now eager to learn more about Flink's 
internals. Specifically, I want to gain more insight into the lifecycle of a 
query in Flink. Is there any additional documentation available on this 
subject?

Thanks in advance.

[1] https://flink.apache.org/news/2015/04/13/release-0.9.0-milestone1.html
[2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/streaming/dynamic_tables.html
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/table/streaming/joins.html
[4] https://cwiki.apache.org/confluence/display/FLINK/Optimizer+Internals

Kind regards,

Laurens Vijnck
