Leonidas Fegaras created MRQL-33:
------------------------------------

             Summary: Fix various bugs in iteration queries
                 Key: MRQL-33
                 URL: https://issues.apache.org/jira/browse/MRQL-33
             Project: MRQL
          Issue Type: Bug
          Components: Query Optimization
            Reporter: Leonidas Fegaras
            Assignee: Leonidas Fegaras


This is a major patch that fixes many errors related to MRQL iteration queries 
(repeat-queries) and optimizes matrix operations. Matrix factorization 
(queries/factorization.mrql) is now highly optimized. Here are the changes: 
1) New groupByJoin interface: instead of a combiner and a mapper, it now uses 
an accumulator with a left zero value. A groupByJoin is a join followed by a 
groupBy; it generalizes matrix multiplication. It is implemented as a single 
map-reduce job, based on Valiant's BSP algorithm.
2) New algebraic optimization rules that generate groupByJoins.
3) The compiler was extended to compile functions on persistent collections 
efficiently. A persistent collection is a sequence file in map-reduce mode or 
an RDD in Spark mode. These persistent collections no longer have to be 
materialized in memory before a function call.
4) Global variable bindings are now passed as configuration parameters instead 
of being substituted into the code as values.
I am attaching the patch next.
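
To illustrate item 1, here is a minimal in-memory sketch in Python of a join 
followed by a groupBy that folds each group with an accumulator starting from 
a zero value. It is not MRQL's implementation or API: the function 
group_by_join and all of its parameter names are hypothetical, and it runs on 
one machine rather than as a map-reduce job. It only shows why this shape 
generalizes matrix multiplication.

```python
from collections import defaultdict


def group_by_join(xs, ys, join_x, join_y, group_x, group_y,
                  combine, accumulate, zero):
    """Join xs and ys on a key, then group the joined pairs and fold them.

    Hypothetical helper for illustration only; not part of MRQL.
    """
    # Index the right input by its join key (a simple hash join).
    index = defaultdict(list)
    for y in ys:
        index[join_y(y)].append(y)

    acc = {}
    for x in xs:
        for y in index.get(join_x(x), ()):
            key = (group_x(x), group_y(y))
            # Left fold: a group starts from `zero` the first time it is seen.
            acc[key] = accumulate(acc.get(key, zero), combine(x, y))
    return acc


# Sparse matrices as (row, column, value) triples.
A = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0)]
B = [(0, 0, 5.0), (0, 1, 6.0), (1, 0, 7.0), (1, 1, 8.0)]

# Matrix multiplication: join A's column with B's row, group by
# (A's row, B's column), and sum the products.
C = group_by_join(
    A, B,
    join_x=lambda a: a[1],
    join_y=lambda b: b[0],
    group_x=lambda a: a[0],
    group_y=lambda b: b[1],
    combine=lambda a, b: a[2] * b[2],
    accumulate=lambda total, v: total + v,
    zero=0.0,
)
print(C)
```

Swapping combine, accumulate and zero (for example min and +infinity in place 
of sum and 0.0) gives other join-then-aggregate queries of the same shape.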




--
This message was sent by Atlassian JIRA
(v6.2#6252)
