Hello Flink and Tez,

I would like to point you to a first version of Flink running on Tez. This is a Flink subproject (to be initially contributed to flink-addons) that allows you to run unmodified Flink programs on top of Apache Tez.

You can get the code here:
https://github.com/ktzoumas/incubator-flink/tree/tez_support

If you want to give it a spin, some basic instructions are here:
https://github.com/ktzoumas/incubator-flink/tree/tez_support/flink-addons/flink-tez

Be warned that this is still a work in progress, so you may encounter bugs, and it has not yet been optimized for performance.

A few words on how it works and the motivation:

Programs pass as usual through the Flink compiler and use the Flink runtime operators (map, reduce, join, etc., including the Flink facilities for sorting, hashing, etc.). Instead of generating a Flink distributed program (called "JobGraph" in Flink), we can now also generate a Tez program (called "DAG" in Tez).

I have been asked why we would want to do that, as Flink has its own execution engine. There are two reasons, in my opinion.

First, Tez follows design choices that are geared towards resource elasticity, whereas the design choices behind Flink's engine are geared more towards low-latency querying and iterative processing. Therefore, the two engines can really complement each other. Users can run their Flink programs on the engine that better fits their use case and setup.

Second, in Flink we have put a lot of effort into separating program assembly from program execution and architecting the system in layers (APIs, common API, compiler, data processing runtime, distributed execution engine). The possibility to swap execution engines is a good showcase of the benefits of such a layered architecture.

Of course, trying it out, reporting bugs, or contributing is very welcome!

Best,
Kostas