GitHub user sabanas opened a pull request:

    https://github.com/apache/spark/pull/20915

    [SPARK-23803][SQL] Filter prune buckets

    ## What changes were proposed in this pull request?
    Support bucket pruning when filtering on a single bucketed column with the
    following predicates: EqualTo, EqualNullSafe, In, and And/Or compositions
    of these.
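
To illustrate the idea (this is a rough sketch, not the code in this patch): each supported predicate on the bucketed column can be mapped to the set of buckets that could contain matching rows, and only those buckets need to be scanned. The tuple-based predicate encoding and the names `bucket_id` and `matching_buckets` are hypothetical, and the CRC32 hash is only a stand-in, so the bucket numbers it produces will not match the ones Spark assigns.

```python
import zlib


def bucket_id(value, num_buckets):
    """Map a value to a bucket index.

    Stand-in hash for illustration only; it is NOT the hash Spark uses,
    so the resulting bucket ids differ from Spark's.
    """
    return zlib.crc32(repr(value).encode("utf-8")) % num_buckets


def matching_buckets(pred, num_buckets):
    """Return the set of buckets that may hold rows satisfying `pred`.

    `pred` is a hypothetical tuple encoding of a filter on the bucketed column:
      ("eq", v), ("eq_null_safe", v), ("in", [v1, v2, ...]),
      ("and", left, right), ("or", left, right), ("other",)
    """
    kind = pred[0]
    if kind in ("eq", "eq_null_safe"):
        # An equality filter can only match rows in one bucket.
        return frozenset({bucket_id(pred[1], num_buckets)})
    if kind == "in":
        # IN is the union of the buckets of each listed value.
        return frozenset(bucket_id(v, num_buckets) for v in pred[1])
    if kind == "and":
        # Both sides must hold, so intersect the candidate buckets.
        return matching_buckets(pred[1], num_buckets) & matching_buckets(pred[2], num_buckets)
    if kind == "or":
        # Either side may hold, so take the union.
        return matching_buckets(pred[1], num_buckets) | matching_buckets(pred[2], num_buckets)
    # Unsupported predicate: nothing can be pruned.
    return frozenset(range(num_buckets))
```

With this encoding, `matching_buckets(("and", ("eq", 5), ("other",)), 8)` still narrows the scan to a single bucket, while `("or", ("eq", 5), ("other",))` falls back to all eight, because the unsupported side could match rows anywhere.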
    
    ## How was this patch tested?
    Refactored the unit tests to cover the predicates above.
    
    Based on @gatorsmile's work in
    https://github.com/apache/spark/commit/e3c75c6398b1241500343ff237e9bcf78b5396f9


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sabanas/spark filter-prune-buckets

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20915.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20915
    
----
commit 5be4dff693afe5379e4fe0ece081b91b98cc7c9a
Author: Asher Saban <asaban@...>
Date:   2018-03-23T23:17:29Z

    add bucket pruning functionality

commit 4ab1583d26ab039a8458784d0a93d6aa25077603
Author: Asher Saban <asaban@...>
Date:   2018-03-23T23:33:25Z

    add composite filters test cases and refactor pruning test

commit 3bb7a2eecf8c2d15e1abc728e8432be9e346e22a
Author: Asher Saban <asaban@...>
Date:   2018-03-26T18:52:56Z

    remove redundant code and move shared getBucketId to BucketingUtils

commit c45da4b6dd523fd5f2eaa041a5f684ab80db02f4
Author: Asher Saban <asaban@...>
Date:   2018-03-27T16:31:34Z

    fix variable name and add select buckets count to metadata

commit f0b84bd9231915f30f9643c083b2ab35cbbce472
Author: Asher Saban <asaban@...>
Date:   2018-03-27T17:10:25Z

    optimize imports

----


---

