[ https://issues.apache.org/jira/browse/MRQL-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leonidas Fegaras updated MRQL-12:
---------------------------------
Attachment: Evaluator.gen
MRQL-12.2.patch
> Support query evaluation in Spark mode
> --------------------------------------
>
> Key: MRQL-12
> URL: https://issues.apache.org/jira/browse/MRQL-12
> Project: MRQL
> Issue Type: Improvement
> Components: Run-Time Data
> Affects Versions: 0.9.0
> Environment: Apache Spark http://spark-project.org/
> Reporter: Leonidas Fegaras
> Assignee: Leonidas Fegaras
> Attachments: Evaluator.gen, Evaluator.gen, MRQL-12.2.patch,
> MRQL-12.patch
>
> Original Estimate: 240h
> Remaining Estimate: 240h
>
> Spark provides primitives for in-memory cluster computing
> (http://spark-project.org/). It was developed at UC Berkeley and was
> recently accepted as an ASF incubating project. It has already attracted many
> developers and I think it will play a major role in the Hadoop ecosystem. So,
> I thought it would be nice to be able to evaluate MRQL queries on a Spark
> cluster. Spark already supports Hive (the port is called Shark). Like Hama,
> Spark can evaluate queries in memory but, unlike Hama, it supports full
> fault tolerance.
> I have already written all the code, but I have only tested it in local mode
> (on a single multi-core node). This task turned out to be easier than I
> expected because MRQL plans are similar to Spark operations. The only
> annoyance was that I had to make all data structures Serializable. I also had
> to include the Gen source code (the Java preprocessor), under the ASF license,
> which will make the transition to Maven easier.
> I am attaching the patch below. The actual code for the Spark evaluator is in
> the file Evaluator.gen, which is attached separately.