GitHub user sabanas opened a pull request: https://github.com/apache/spark/pull/20915
[SPARK-23803][SQL] Filter prune buckets

## What changes were proposed in this pull request?

Support bucket pruning when filtering on a single bucketed column with the following predicates:

- EqualTo
- EqualNullSafe
- In
- And/Or predicates

## How was this patch tested?

Refactored the unit tests to cover the predicates above.

Based on @gatorsmile's work in https://github.com/apache/spark/commit/e3c75c6398b1241500343ff237e9bcf78b5396f9

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sabanas/spark filter-prune-buckets

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20915.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20915

----
commit 5be4dff693afe5379e4fe0ece081b91b98cc7c9a
Author: Asher Saban <asaban@...>
Date:   2018-03-23T23:17:29Z

    add bucket pruning functionality

commit 4ab1583d26ab039a8458784d0a93d6aa25077603
Author: Asher Saban <asaban@...>
Date:   2018-03-23T23:33:25Z

    add composite filters test cases and refactor pruning test

commit 3bb7a2eecf8c2d15e1abc728e8432be9e346e22a
Author: Asher Saban <asaban@...>
Date:   2018-03-26T18:52:56Z

    remove redundant code and move shared getBucketId to BucketingUtils

commit c45da4b6dd523fd5f2eaa041a5f684ab80db02f4
Author: Asher Saban <asaban@...>
Date:   2018-03-27T16:31:34Z

    fix variable name and add select buckets count to metadata

commit f0b84bd9231915f30f9643c083b2ab35cbbce472
Author: Asher Saban <asaban@...>
Date:   2018-03-27T17:10:25Z

    optimize imports

----
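
For readers unfamiliar with bucket pruning, the idea behind the predicates listed in the description can be sketched roughly as follows. This is a simplified pure-Python model and not the code in this patch: the predicate classes, `bucket_id`, and `selected_buckets` are hypothetical names, and `zlib.crc32` is only a stand-in for the hash Spark actually applies to the bucket column. EqualNullSafe and null handling are left out for brevity.

```python
import zlib
from dataclasses import dataclass
from typing import Any, FrozenSet, Optional, Tuple


# Hypothetical predicate types, loosely mirroring the filter names in the PR.
@dataclass(frozen=True)
class EqualTo:
    column: str
    value: Any


@dataclass(frozen=True)
class In:
    column: str
    values: Tuple[Any, ...]


@dataclass(frozen=True)
class And:
    left: Any
    right: Any


@dataclass(frozen=True)
class Or:
    left: Any
    right: Any


def bucket_id(value: Any, num_buckets: int) -> int:
    # Stand-in hash for illustration only; Spark uses its own hash function.
    return zlib.crc32(repr(value).encode("utf-8")) % num_buckets


def selected_buckets(pred: Any, bucket_column: str,
                     num_buckets: int) -> Optional[FrozenSet[int]]:
    """Return the bucket ids that may hold matching rows.

    None means "no pruning possible", i.e. every bucket must be scanned.
    """
    if isinstance(pred, EqualTo):
        if pred.column == bucket_column:
            return frozenset({bucket_id(pred.value, num_buckets)})
        return None
    if isinstance(pred, In):
        if pred.column == bucket_column:
            return frozenset(bucket_id(v, num_buckets) for v in pred.values)
        return None
    if isinstance(pred, And):
        left = selected_buckets(pred.left, bucket_column, num_buckets)
        right = selected_buckets(pred.right, bucket_column, num_buckets)
        # A side that cannot prune places no restriction on the other side.
        if left is None:
            return right
        if right is None:
            return left
        return left & right
    if isinstance(pred, Or):
        left = selected_buckets(pred.left, bucket_column, num_buckets)
        right = selected_buckets(pred.right, bucket_column, num_buckets)
        # If either side may match any bucket, the disjunction may too.
        if left is None or right is None:
            return None
        return left | right
    # Unknown predicate: be conservative and scan everything.
    return None
```

In this model, a filter such as `Or(EqualTo("id", 5), EqualTo("id", 7))` on a table bucketed by `id` yields at most two bucket ids, so only the files for those buckets would need to be read. A filter that touches only non-bucketed columns yields `None`, and the scan falls back to all buckets.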