GitHub user codingjaguar opened a pull request:

    https://github.com/apache/spark/pull/10163

    [SPARK-12161][SQL] Ignore order of predicates in cache matching

    This PR improves `LogicalPlan.sameResult` so that semantically equivalent 
queries whose predicates appear in a different order are still matched. 
    
    Consider an example:
    Query 1: CACHE TABLE first AS SELECT * FROM table A WHERE A.id > 100 AND 
A.id < 200;
    Query 2: SELECT * FROM table A WHERE A.id < 200 AND A.id > 100;
    Currently in Spark SQL, Query 2 cannot use the cached result of Query 1, 
even though the two queries are identical apart from the order of their 
predicates.
    We modified the comparison function `LogicalPlan.sameResult`. The idea is 
to split the filter condition into a sequence of expressions and wrap them in 
a set. Comparing the sets, rather than comparing the conditions literally, 
ignores the order of the predicates.
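
    The set-based comparison can be illustrated with a minimal, self-contained 
sketch. It does not use Spark's Catalyst classes: the `Expr` hierarchy and the 
names `splitConjuncts` and `sameCondition` are hypothetical stand-ins chosen 
for illustration, and the actual patch may be structured differently.

```scala
// Hypothetical, simplified expression tree; NOT Spark's Catalyst classes.
sealed trait Expr
case class Attr(name: String) extends Expr
case class Lit(value: Int) extends Expr
case class GreaterThan(left: Expr, right: Expr) extends Expr
case class LessThan(left: Expr, right: Expr) extends Expr
case class And(left: Expr, right: Expr) extends Expr

object PredicateSets {
  // Flatten nested ANDs into the individual conjuncts, left to right.
  def splitConjuncts(e: Expr): Seq[Expr] = e match {
    case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
    case other     => Seq(other)
  }

  // Two filter conditions match when their sets of conjuncts are equal,
  // so the order in which the conjuncts were written does not matter.
  def sameCondition(a: Expr, b: Expr): Boolean =
    splitConjuncts(a).toSet == splitConjuncts(b).toSet
}
```

    With `gt = GreaterThan(Attr("id"), Lit(100))` and 
`lt = LessThan(Attr("id"), Lit(200))`, the call 
`PredicateSets.sameCondition(And(gt, lt), And(lt, gt))` returns true, which 
mirrors Query 1 and Query 2 above. The sketch compares conjuncts structurally, 
so it would not treat `A.id > 100` and `100 < A.id` as equal.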


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/codingjaguar/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10163.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10163
    
----
commit 579b5a24a726a0739284714ca35ffff4b6441537
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-05T17:38:32Z

    Add equivalentConditions()

commit 0a096983f86945415ef6fb91af0be400d4a05aaf
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T18:04:57Z

    set comparison for projection

commit 85f1ebca7ec568c84a2f08a131b3512c5a204526
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T19:48:02Z

    Fix set conversion bug

commit 1a2b534e01d0010d64fbe3b2382bd59eb0a28a4b
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T20:13:42Z

    Remove set comparison of projection

commit 5fcb85ca97e8f48aff7848e51e5dfb187597dbee
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T21:02:32Z

    Add test case for filter condition order

commit 8f93c6aa7a628b71e789385664352878d3e2fd3d
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T21:32:53Z

    Fix style error

commit 6eb6fddf82220c3181cd7151b573fe135d2e9c0a
Author: windscope <windsco...@gmail.com>
Date:   2015-12-06T00:58:53Z

    add testcase for SameResultSuite

commit 02fc878081da8ccd313fd60ed0ff81f9735794c0
Author: windscope <windsco...@gmail.com>
Date:   2015-12-06T01:44:00Z

    add testcase for OR split filter condition

commit bcb6df01a1706d92d728c4dce02a600be88f3fd9
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:04:50Z

    Supported expressions with disjunctive predicates;
    refactor cleanArgs so that we can reuse cleanExpression().

commit 360bb2b9169f1ae030c040bec4e035f2ce8dc0c7
Author: windscope <windsco...@gmail.com>
Date:   2015-12-06T02:07:00Z

    Merge branch 'jiang.filter-set' of github.com:codingjaguar/spark into jiang.filter-set

commit 94837d697c94a8f83bd4384f5681321b5cfe5d97
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:46:27Z

    Merge branch 'jiang.filter-set'

commit 0de3d7e10789b5e46e67f50942e607b8f229f64d
Author: windscope <windsco...@gmail.com>
Date:   2015-12-06T02:07:00Z

    Merge branch 'jiang.filter-set' of github.com:codingjaguar/spark into jiang.filter-set

commit 13ce03f2f5132eb8264e2f8d785410a8a94efec0
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:50:08Z

    Removed dead code

commit 9f6df41f67540765abd646e917c13237f4af2147
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:50:26Z

    Merge branch 'jiang.filter-set' of github.com:codingjaguar/spark into jiang.filter-set

commit e63df887670817937c2cd2da57aa8d5f06553ce7
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:52:11Z

    Merge branch 'jiang.filter-set'

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
