[ https://issues.apache.org/jira/browse/MRQL-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110882#comment-14110882 ]
Leonidas Fegaras commented on MRQL-45:
--------------------------------------
This patch adds support for Flink. You can run MRQL on Flink using the script
bin/mrql.flink. The Flink evaluation mode has been tested in Flink local and
pseudo-distributed modes and on a 16-node YARN cluster. Known problems:
1) Compiling MRQL queries to Java bytecode doesn't work in local mode, so MRQL
   falls back to its interpreter there.
2) Due to a Flink bug, total aggregations return their results as a string,
   which is then parsed into numerical values.
I have run all tests and done some performance measurements using matrix
multiplication and PageRank. Flink seems to be a bit slower than Spark, but
then again I don't use the Flink optimizer much, because MRQL data are not
relational data.
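
As a rough illustration of the workaround in problem 2 (this is not the code in
the patch, and the name parse_aggregate is hypothetical), turning an aggregation
result that arrives as a string back into a number could look like this in
Python:

```python
def parse_aggregate(text):
    """Convert an aggregation result delivered as text into a number.

    Hypothetical helper for illustration only; the actual patch does this
    inside MRQL's Flink evaluator.
    """
    cleaned = text.strip()
    try:
        return int(cleaned)      # counts and integer sums
    except ValueError:
        return float(cleaned)    # averages and floating-point sums


if __name__ == "__main__":
    print(parse_aggregate("42"))
    print(parse_aggregate(" 3.5 "))
```

Trying the integer parse first keeps counts as integers instead of widening
every result to a float.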
> Add support for Apache Flink
> -----------------------------
>
> Key: MRQL-45
> URL: https://issues.apache.org/jira/browse/MRQL-45
> Project: MRQL
> Issue Type: Improvement
> Affects Versions: 0.9.4
> Reporter: Leonidas Fegaras
> Assignee: Leonidas Fegaras
> Priority: Critical
> Attachments: MRQL-45.patch
>
>
> I am in the process of adding support for Apache Flink (previously called
> Stratosphere). I will wait for their first incubator release. The Flink
> interface is similar to Spark but it also supports optimization. That is,
> Flink operations are not executed immediately, but they are collected and
> optimized using relational optimization techniques. Do we need to support
> Flink? It is nice but not crucial. Users can submit the same MRQL query to
> multiple platforms and pick the one that best suits their application. It may
> also expand our user community.
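
The deferred execution described in the issue text can be sketched in a few
lines of Python. This is a toy model under stated assumptions, not the Flink
API: the class LazyDataSet and its methods are hypothetical, and the "plan" is
simply replayed in order where a real system would optimize it first.

```python
class LazyDataSet:
    """Records operations instead of running them (hypothetical, not Flink)."""

    def __init__(self, source, plan=()):
        self.source = source
        self.plan = tuple(plan)

    def map(self, fn):
        # Nothing is computed here; the operation is only added to the plan.
        return LazyDataSet(self.source, self.plan + (("map", fn),))

    def filter(self, pred):
        return LazyDataSet(self.source, self.plan + (("filter", pred),))

    def collect(self):
        # Work happens only now. A real optimizer would rewrite self.plan
        # before executing it; this sketch runs the steps as recorded.
        items = iter(self.source)
        for kind, fn in self.plan:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return list(items)


if __name__ == "__main__":
    calls = []

    def double(x):
        calls.append(x)
        return 2 * x

    ds = LazyDataSet([1, 2, 3]).map(double).filter(lambda x: x > 2)
    print(calls)         # still empty: nothing has run yet
    print(ds.collect())  # the recorded plan is executed here
```

Because the whole plan is known before anything runs, a system built this way
has the opportunity to reorder or fuse operations, which is the optimization
step the issue text refers to.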