[ 
https://issues.apache.org/jira/browse/MRQL-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110882#comment-14110882
 ] 

Leonidas Fegaras commented on MRQL-45:
--------------------------------------

This patch adds support for Flink. You can run MRQL on Flink using the script 
bin/mrql.flink. The Flink evaluation mode has been tested in Flink local and 
pseudo-distributed modes and on a 16-node YARN cluster. Known problems: 1) In 
local mode, compiling MRQL queries to Java bytecode doesn't work, so MRQL uses 
its interpreter instead. 2) Due to a Flink bug, total aggregations return 
their results as strings, which are then parsed into numerical values. I have 
run all tests and taken some performance measurements using matrix 
multiplication and PageRank. Flink seems to be a bit slower than Spark, but 
then again I don't use the Flink optimizer much, because MRQL data are not 
relational.


> Add support for Apache Flink 
> -----------------------------
>
>                 Key: MRQL-45
>                 URL: https://issues.apache.org/jira/browse/MRQL-45
>             Project: MRQL
>          Issue Type: Improvement
>    Affects Versions: 0.9.4
>            Reporter: Leonidas Fegaras
>            Assignee: Leonidas Fegaras
>            Priority: Critical
>         Attachments: MRQL-45.patch
>
>
> I am in the process of adding support for Apache Flink (previously called 
> Stratosphere), but I will wait for their first incubator release. The Flink 
> interface is similar to Spark's, but it also supports optimization: Flink 
> operations are not executed immediately; instead, they are collected and 
> optimized using relational optimization techniques. Do we need to support 
> Flink? It is nice but not crucial. Users can submit the same MRQL query to 
> multiple platforms and pick the one that best suits their application. It may 
> also expand our user community.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
