GitHub user pwoody opened a pull request:

    https://github.com/apache/spark/pull/15835

    SPARK-17059: Allow FileFormat to specify partition pruning strategy

    ## What changes were proposed in this pull request?
    
    This is a follow-up to #14649, updated to match recent changes in the
    codebase. It differs slightly in that it does not filter files explicitly;
    instead, it allows a FileFormat to prune splits (if applicable). This is
    implemented in ParquetFileFormat; every other format keeps its existing
    behavior.
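
To illustrate the idea only, here is a minimal, self-contained sketch of format-specific split pruning. It is not the code in this patch: `FileFormatSketch`, `pruneSplits`, `Split`, `ColumnStats`, `GreaterOrEqual`, `ParquetLikeFormat` and `TextLikeFormat` are all hypothetical names invented for this example, not Spark APIs. The assumption is that a format can consult per-split min/max statistics and that any split with missing or inconsistent statistics must be kept.

```scala
// Hypothetical sketch: none of these names are Spark APIs.

// Min/max statistics for one column within a split (for example, a row group).
final case class ColumnStats(min: Long, max: Long)

// A unit of work over part of a file. `stats` is None when metadata is missing.
final case class Split(path: String, start: Long, length: Long, stats: Option[ColumnStats])

// A simple predicate of the form `column >= lowerBound`.
final case class GreaterOrEqual(lowerBound: Long)

trait FileFormatSketch {
  // Default: no pruning, so formats that do not override this keep their behavior.
  def pruneSplits(splits: Seq[Split], filter: GreaterOrEqual): Seq[Split] = splits
}

// Stands in for any format that does not opt in to pruning.
object TextLikeFormat extends FileFormatSketch

// Stands in for a format that can prune using per-split statistics.
object ParquetLikeFormat extends FileFormatSketch {
  override def pruneSplits(splits: Seq[Split], filter: GreaterOrEqual): Seq[Split] =
    splits.filter { split =>
      split.stats match {
        // Consistent stats: keep the split only if some value could match.
        case Some(ColumnStats(min, max)) if min <= max => max >= filter.lowerBound
        // Missing or inconsistent stats (min > max) cannot be trusted: keep the split.
        case _ => true
      }
    }
}

object PruningDemo {
  val splits: Seq[Split] = Seq(
    Split("part-0.parquet", 0L, 128L, Some(ColumnStats(0L, 9L))),     // cannot match x >= 10
    Split("part-0.parquet", 128L, 128L, Some(ColumnStats(10L, 19L))), // may match
    Split("part-1.parquet", 0L, 128L, None),                          // no stats: keep
    Split("part-2.parquet", 0L, 128L, Some(ColumnStats(5L, 1L)))      // inconsistent: keep
  )

  def main(args: Array[String]): Unit = {
    val filter = GreaterOrEqual(10L)
    println(ParquetLikeFormat.pruneSplits(splits, filter).map(s => s"${s.path}@${s.start}"))
    println(TextLikeFormat.pruneSplits(splits, filter).size)
  }
}
```

In this sketch only the first split is dropped by `ParquetLikeFormat`, while `TextLikeFormat` returns all four splits unchanged. Keeping splits whose statistics are absent or inconsistent is the conservative choice: pruning may only skip data that provably cannot match.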
    
    ## How was this patch tested?
    
    Existing tests pass, and two new tests were added to validate the pruning
    and to ensure that excessive filtering does not occur on malformed
    metadata.
    
    Thanks!


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pwoody/spark pw/parquetPruning

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15835.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15835
    
----
commit 1d947021b6b7662b64d40039bad94b6b5b59122a
Author: Patrick Woody <pwo...@palantir.com>
Date:   2016-11-09T19:14:17Z

    SPARK-17059: Allow FileFormat to specify partition pruning strategy

----

