GitHub user ioana-delaney opened a pull request:

    https://github.com/apache/spark/pull/20868

    [SPARK-23750][SQL] Inner Join Elimination based on Informational RI 
constraints

    ## What changes were proposed in this pull request?
    
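    This transformation detects referential integrity (RI) joins, i.e. joins 
between a foreign key (FK) and the primary key (PK) it references, and 
eliminates the parent/PK table if none of its columns, other than the PK 
columns, are referenced in the query. 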
    
    **Example:**
    
    ```SQL
    select fact.c1
    from fact, dim1, dim2
    where fact.c1 = dim1.pk /* FK = PK */ and
          fact.c2 = dim2.pk /* FK = PK */ and
          dim1.pk = 10 and
          dim2.pk like 'abc%'
    ```
    
    **Internal optimized query after join elimination:**
    
    ```SQL
    select fact.c1
    from fact 
    where fact.c1 = 10 and fact.c2 like 'abc%'
    ```
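
The equivalence of the two queries above can be checked on a small data set. The following is a minimal sketch, not part of this patch: it uses Python's built-in `sqlite3` module rather than Spark, and the table contents and the `name` columns are made up for illustration. Note that SQLite enforces the foreign keys here, whereas the constraints this PR relies on are informational.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when enabled
conn.executescript("""
    CREATE TABLE dim1 (pk INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim2 (pk TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE fact (
        c1 INTEGER NOT NULL REFERENCES dim1(pk),  -- FK to dim1
        c2 TEXT    NOT NULL REFERENCES dim2(pk)   -- FK to dim2
    );
    -- Illustrative rows only; every fact row has exactly one parent per FK.
    INSERT INTO dim1 VALUES (10, 'a'), (20, 'b');
    INSERT INTO dim2 VALUES ('abc1', 'x'), ('abd', 'y'), ('abc2', 'z');
    INSERT INTO fact VALUES
        (10, 'abc1'), (10, 'abd'), (20, 'abc2'), (10, 'abc2'), (10, 'abc1');
""")

# Query as written by the user: joins fact to both parent tables.
original = """
    SELECT fact.c1
    FROM fact, dim1, dim2
    WHERE fact.c1 = dim1.pk AND fact.c2 = dim2.pk
      AND dim1.pk = 10 AND dim2.pk LIKE 'abc%'
"""

# Query after join elimination: PK predicates are moved onto the FK columns.
rewritten = """
    SELECT fact.c1
    FROM fact
    WHERE fact.c1 = 10 AND fact.c2 LIKE 'abc%'
"""

original_rows = sorted(conn.execute(original).fetchall())
rewritten_rows = sorted(conn.execute(rewritten).fetchall())
print(original_rows == rewritten_rows)
```

Because each FK value matches exactly one PK row, the joins neither add nor drop fact rows, so both queries return the same multiset of rows.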
    
    The transformation will apply under the following restrictions:
    
    - No columns from the parent table are retrieved.
    - No columns from the parent table other than the PK columns are referenced 
in the predicates.
    - Conservatively, only allow local predicates on PK columns or equi-joins 
between PK columns and other tables.
    - The join is directly above a base table access, i.e. there are no 
aliases or other expressions above the base table access.
    - Other restrictions on string data types.
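
The first three restrictions can be sketched as a simple eligibility check. This is only an illustration under simplifying assumptions and not the code in this patch: the `Predicate` class, the `can_eliminate` function, and the column representation are all hypothetical, literals are assumed to appear on the right-hand side only, and the base-table and string data type restrictions are not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

Column = Tuple[str, str]  # (table name, column name)


@dataclass(frozen=True)
class Predicate:
    """Hypothetical predicate: left <op> right, where right=None is a literal."""
    left: Column
    op: str
    right: Optional[Column] = None


def can_eliminate(parent: str, pk_cols: Set[str],
                  selected: List[Column], predicates: List[Predicate]) -> bool:
    # Restriction 1: no columns from the parent table are retrieved.
    if any(table == parent for table, _ in selected):
        return False
    for p in predicates:
        cols = [p.left] + ([p.right] if p.right is not None else [])
        parent_cols = [c for c in cols if c[0] == parent]
        if not parent_cols:
            continue  # predicate does not touch the parent table
        # Restriction 2: only PK columns of the parent may be referenced.
        if any(name not in pk_cols for _, name in parent_cols):
            return False
        # Restriction 3: only local predicates on PK columns, or equi-joins
        # between PK columns and another table.
        is_local = p.right is None
        is_equi_join = (p.op == "=" and p.right is not None
                        and p.left[0] != p.right[0])
        if not (is_local or is_equi_join):
            return False
    return True


fact_c1 = ("fact", "c1")
dim1_pk = ("dim1", "pk")
join = Predicate(fact_c1, "=", dim1_pk)  # fact.c1 = dim1.pk
local = Predicate(dim1_pk, "=")          # dim1.pk = <literal>
eligible = can_eliminate("dim1", {"pk"}, [fact_c1], [join, local])
print(eligible)
```

Selecting a non-PK parent column such as `dim1.name`, filtering on it, or joining with a non-equality operator makes the check return `False`.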
    
    ## How was this patch tested?
    
    A new test suite HiveRIJElimSuite.scala was introduced.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ioana-delaney/spark rijelim

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20868.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20868
    
----
commit 9d1f0f1841b4f7828534036c1b2cef4ef7f1d84a
Author: Ioana Delaney <ioanamdelaney@...>
Date:   2018-03-20T19:55:19Z

    [SPARK-23750] Add dependent DDL changes from SPARK-21784.

commit 0d189ab49b2dcb748b51f875f1a04e6b2fb9f69b
Author: Ioana Delaney <ioanamdelaney@...>
Date:   2018-03-20T23:29:11Z

    [SPARK-23750] Join elimination rewrite based on RI constraints.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
