GitHub user antonoal opened a pull request:

    https://github.com/apache/spark/pull/11777

    Added transitive closure transformation to Catalyst

    ## What changes were proposed in this pull request?
    Catalyst currently lacks a relatively simple transformation: the 
generation of transitive predicates. For instance, given the following 
query:
    `select * 
    from   table1 t1
    join   table2 t2
    on     t1.a = t2.b
    where  t1.a = 42`
    then every row that survives the join must also satisfy t2.b = 42, so 
that additional predicate can be generated. It can in turn be pushed down 
below the join, improving the performance of the whole query by filtering 
rows out before they are joined.
    Such a transformation exists in Oracle DB.
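
The inference described above can be sketched outside of Catalyst in a few lines. This is only an illustrative model, not the code in this PR: the `Eq` class and the `infer_transitive` function are hypothetical names, columns are modelled as strings and literals as integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eq:
    """Hypothetical equality predicate `left = right`.

    Operands are column names (str) or literal values (int).
    """
    left: object
    right: object


def is_column(operand):
    # In this toy model every string is a column reference.
    return isinstance(operand, str)


def infer_transitive(join_conds, filters):
    """Derive `col_b = literal` from `col_a = col_b` and `col_a = literal`."""
    existing = set(filters)
    derived = []
    for j in join_conds:
        for f in filters:
            # Only filters of the shape `column = literal` are considered.
            if not is_column(f.left) or is_column(f.right):
                continue
            # The join equality can be read in either direction.
            for src, dst in ((j.left, j.right), (j.right, j.left)):
                if f.left == src:
                    candidate = Eq(dst, f.right)
                    if candidate not in existing and candidate not in derived:
                        derived.append(candidate)
    return derived


# The query from the description: t1.a = t2.b joined, t1.a = 42 filtered.
new_predicates = infer_transitive([Eq("t1.a", "t2.b")], [Eq("t1.a", 42)])
```

Here `new_predicates` holds the single derived predicate `t2.b = 42`, which an optimizer could then push down to the `table2` side of the join.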
    Please note that in this PR a transitive predicate is generated for the 
following operations: 
    * a BinaryComparison (=, >=, etc.) to a foldable
    * an In, e.g. in (1, 2, 3), where all the values in the list are foldable
    * Not of any of the above
    * Or of any of the above
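
As a rough illustration of how the four predicate shapes listed above might be carried across a join equality, here is a minimal sketch. It does not use Catalyst's classes: `Col`, `Lit`, `Cmp`, `In`, `Not`, `Or` and `substitute` are hypothetical stand-ins, and `Lit` stands for any expression already known to be foldable.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: object  # stands in for a foldable (constant) expression


@dataclass(frozen=True)
class Cmp:
    op: str  # "=", "<", "<=", ">", ">="
    col: Col
    value: Lit


@dataclass(frozen=True)
class In:
    col: Col
    values: Tuple[Lit, ...]


@dataclass(frozen=True)
class Not:
    child: "Pred"


@dataclass(frozen=True)
class Or:
    left: "Pred"
    right: "Pred"


Pred = Union[Cmp, In, Not, Or]


def substitute(pred: Pred, src: Col, dst: Col) -> Optional[Pred]:
    """Rewrite `pred` from column `src` to column `dst`.

    Returns None when any part of `pred` refers to a column other than
    `src`, because then no transitive predicate can be derived from it.
    """
    if isinstance(pred, Cmp):
        return Cmp(pred.op, dst, pred.value) if pred.col == src else None
    if isinstance(pred, In):
        return In(dst, pred.values) if pred.col == src else None
    if isinstance(pred, Not):
        child = substitute(pred.child, src, dst)
        return Not(child) if child is not None else None
    if isinstance(pred, Or):
        left = substitute(pred.left, src, dst)
        right = substitute(pred.right, src, dst)
        if left is None or right is None:
            return None
        return Or(left, right)
    return None


a, b = Col("t1.a"), Col("t2.b")
original = Or(Cmp(">=", a, Lit(10)), Not(In(a, (Lit(1), Lit(2), Lit(3)))))
derived = substitute(original, a, b)
```

With the join condition `t1.a = t2.b`, `derived` is the same predicate expressed over `t2.b`. An `Or` is rewritten only when both of its branches can be rewritten; otherwise the sketch gives up and returns `None`.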
    
    ## How was this patch tested?
    I've added a new TransitiveClosureSuite with a series of unit tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/antonoal/spark transitive-closure

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11777.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11777
    
----
commit 7df4117749f7afc2e5e95190cf93a961b9c6ed3a
Author: Alex Antonov <3091...@gmail.com>
Date:   2016-03-16T21:53:38Z

    Added transitive closure transformation to Catalyst

----


