GitHub user rdblue opened a pull request:

    https://github.com/apache/spark/pull/21503

    [SPARK-24478][SQL] Move projection and filter push down to physical conversion

    ## What changes were proposed in this pull request?
    
    This removes the v2 optimizer rule for push-down and instead pushes filters
    and required columns when converting to a physical plan, as suggested by
    @marmbrus. This makes the v2 relation cleaner because the output and filters
    do not change in the logical plan.
    
    Because push-down now happens after optimization, the optimizer would
    otherwise no longer see accurate statistics. To solve this, the PR adds a new
    trait, SupportsPhysicalStats, that calculates LeafNode stats using the
    filters and projection. This trait can also be implemented by v1 data sources
    to get more accurate stats for CBO.
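
As a rough illustration of the idea (not the code in this PR), the sketch below shows how a leaf relation could report statistics that account for a projection and filters before push-down has actually happened. Everything in it is a simplified stand-in: `Statistics`, `Filter`, `Column`, `ToyRelation`, the `computeStats` signature, and the fixed per-filter selectivities are assumptions made for the example, and it does not depend on Spark.

```scala
// Simplified stand-ins for Catalyst types; every name here is hypothetical.
final case class Statistics(sizeInBytes: BigInt, rowCount: Option[BigInt])

sealed trait Filter
final case class EqualTo(column: String, value: Any) extends Filter
final case class GreaterThan(column: String, value: Long) extends Filter

// Assumed shape of the trait: stats are computed from the projection and
// filters that will be pushed down later, during physical conversion.
trait SupportsPhysicalStats {
  def computeStats(projection: Seq[String], filters: Seq[Filter]): Statistics
}

final case class Column(name: String, avgWidthBytes: Int)

final class ToyRelation(columns: Seq[Column], totalRows: BigInt)
    extends SupportsPhysicalStats {

  // Assumed fixed selectivity per filter kind, in percent of rows kept.
  private def keptPercent(filter: Filter): Int = filter match {
    case _: EqualTo     => 10
    case _: GreaterThan => 50
  }

  override def computeStats(
      projection: Seq[String],
      filters: Seq[Filter]): Statistics = {
    // Each filter scales the estimated row count down by its selectivity.
    val rows = filters.foldLeft(totalRows) { (n, f) => n * keptPercent(f) / 100 }
    // Only the projected columns contribute to the estimated row width.
    val rowWidth =
      columns.filter(c => projection.contains(c.name)).map(_.avgWidthBytes).sum
    Statistics(sizeInBytes = rows * rowWidth, rowCount = Some(rows))
  }
}
```

With a relation of 1,000,000 rows and columns `id` (8 bytes), `name` (24 bytes), and `payload` (200 bytes), asking for only `id` and `name` under one `EqualTo` and one `GreaterThan` filter yields a much smaller estimate than the unfiltered, unprojected relation, which is the kind of difference CBO needs to see.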
    
    The first commit was proposed in #21262. This PR replaces #21262.
    
    ## How was this patch tested?
    
    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rdblue/spark SPARK-24478-move-push-down-to-physical-conversion

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21503.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21503
    
----
commit c8517e145b1a460a8be07164c17ce20b1db86659
Author: Ryan Blue <blue@...>
Date:   2018-05-07T20:08:02Z

    DataSourceV2: push projection, filters when converting to physical plan.

commit 9d3a11e68bca6c5a56a2be47fb09395350362ac5
Author: Ryan Blue <blue@...>
Date:   2018-06-06T20:17:16Z

    SPARK-24478: Add trait to report stats with filters and projection.

----

