GitHub user codingjaguar opened a pull request: https://github.com/apache/spark/pull/10163

    [SPARK-12161][SQL] Ignore order of predicates in cache matching

    This PR improves `LogicalPlan.sameResult` so that semantically
    equivalent queries whose predicates appear in a different order are
    still matched. Consider an example:

    Query 1: CACHE TABLE first AS SELECT * FROM table A where A.id > 100 AND A.id < 200;
    Query 2: SELECT * FROM table A where A.id < 200 AND A.id > 100;

    Currently in Spark SQL, Query 2 cannot use the cached result of
    Query 1, although the two queries are identical apart from the order
    of their predicates.

    We modified the comparison function `LogicalPlan.sameResult`. The
    idea is to split the filter condition into a sequence of expressions
    and wrap that sequence in a set. We can then compare the sets rather
    than comparing the conditions literally, so the order of the
    predicates is ignored.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/codingjaguar/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10163.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10163

----
commit 579b5a24a726a0739284714ca35ffff4b6441537
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-05T17:38:32Z

    Add equivalentConditions()

commit 0a096983f86945415ef6fb91af0be400d4a05aaf
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T18:04:57Z

    set comparison for projection

commit 85f1ebca7ec568c84a2f08a131b3512c5a204526
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T19:48:02Z

    Fix set conversion bug

commit 1a2b534e01d0010d64fbe3b2382bd59eb0a28a4b
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T20:13:42Z

    Remove set comparison of projection

commit 5fcb85ca97e8f48aff7848e51e5dfb187597dbee
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T21:02:32Z

    Add test case for filter condition order

commit 8f93c6aa7a628b71e789385664352878d3e2fd3d
Author: windscope <windsco...@gmail.com>
Date:   2015-12-05T21:32:53Z

    Fix style error

commit 6eb6fddf82220c3181cd7151b573fe135d2e9c0a
Author: windscope <windsco...@gmail.com>
Date:   2015-12-06T00:58:53Z

    add testcase for SameResultSuite

commit 02fc878081da8ccd313fd60ed0ff81f9735794c0
Author: windscope <windsco...@gmail.com>
Date:   2015-12-06T01:44:00Z

    add testcase for OR split filter condition

commit bcb6df01a1706d92d728c4dce02a600be88f3fd9
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:04:50Z

    Supported expressions with disjunctive predicates; refactor cleanArgs so that we can reuse cleanExpression().

commit 360bb2b9169f1ae030c040bec4e035f2ce8dc0c7
Author: windscope <windsco...@gmail.com>
Date:   2015-12-06T02:07:00Z

    Merge branch 'jiang.filter-set' of github.com:codingjaguar/spark into jiang.filter-set

commit 94837d697c94a8f83bd4384f5681321b5cfe5d97
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:46:27Z

    Merge branch 'jiang.filter-set'

commit 0de3d7e10789b5e46e67f50942e607b8f229f64d
Author: windscope <windsco...@gmail.com>
Date:   2015-12-06T02:07:00Z

    Merge branch 'jiang.filter-set' of github.com:codingjaguar/spark into jiang.filter-set

commit 13ce03f2f5132eb8264e2f8d785410a8a94efec0
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:50:08Z

    Removed dead code

commit 9f6df41f67540765abd646e917c13237f4af2147
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:50:26Z

    Merge branch 'jiang.filter-set' of github.com:codingjaguar/spark into jiang.filter-set

commit e63df887670817937c2cd2da57aa8d5f06553ce7
Author: Jiang Chen <codingjag...@gmail.com>
Date:   2015-12-06T02:52:11Z

    Merge branch 'jiang.filter-set'

----
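
The set-based comparison described in the PR summary can be illustrated with a small toy model. This is only a sketch in Python, not the patch itself and not Catalyst's Scala classes: `Pred`, `And`, `Or`, `split_conjuncts`, `split_disjuncts`, `normalize` and `equivalent_conditions` are hypothetical names chosen for illustration (the last one merely echoes the commit message "Add equivalentConditions()"), and leaf predicates are treated as opaque strings.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Pred:
    """Leaf predicate, e.g. "A.id > 100" (hypothetical, kept as an opaque string)."""
    text: str


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Pred, And, Or]


def split_conjuncts(e):
    """Flatten a tree of nested ANDs into a set of normalized conjuncts."""
    if isinstance(e, And):
        return split_conjuncts(e.left) | split_conjuncts(e.right)
    return frozenset({normalize(e)})


def split_disjuncts(e):
    """Flatten a tree of nested ORs into a set of normalized disjuncts."""
    if isinstance(e, Or):
        return split_disjuncts(e.left) | split_disjuncts(e.right)
    return frozenset({normalize(e)})


def normalize(e):
    """Replace ordered AND/OR trees with order-insensitive tagged sets."""
    if isinstance(e, And):
        return ("and", split_conjuncts(e))
    if isinstance(e, Or):
        return ("or", split_disjuncts(e))
    return e


def equivalent_conditions(a, b):
    """Compare two filter conditions while ignoring predicate order."""
    return normalize(a) == normalize(b)


# The two filter conditions from Query 1 and Query 2.
q1 = And(Pred("A.id > 100"), Pred("A.id < 200"))
q2 = And(Pred("A.id < 200"), Pred("A.id > 100"))

print(q1 == q2)                       # False: literal comparison is order-sensitive
print(equivalent_conditions(q1, q2))  # True: set comparison ignores the order
```

The sketch tags each set with its operator so that `a AND b` and `a OR b` do not compare equal. Whether the actual patch handles disjunctions this way is an assumption; the commit list only says that disjunctive predicates are supported.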