alamb opened a new pull request #500:
URL: https://github.com/apache/arrow-datafusion/pull/500


   Closes https://github.com/apache/arrow-datafusion/issues/490
   
   This PR adds support for pruning with boolean predicates such as `flag_col` 
and `not flag_col`, so that they can be used, on their own or as part of other 
predicates, to prune row groups from parquet files.
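
To illustrate the idea, here is a minimal, self-contained sketch of how a bare boolean predicate can be checked against per-row-group min/max statistics. It assumes the usual ordering `false < true`. The names `BoolStats`, `BoolPredicate`, and `may_match` are hypothetical and do not come from this PR or from DataFusion; the actual rewrite rules in the PR may differ in detail.

```rust
/// Min/max statistics for one boolean column in one row group.
/// Hypothetical type for illustration only; not a DataFusion type.
#[derive(Debug, Clone, Copy)]
pub struct BoolStats {
    pub min: bool,
    pub max: bool,
}

/// A bare boolean predicate on a single column.
#[derive(Debug, Clone, Copy)]
pub enum BoolPredicate {
    /// `flag_col`
    Column,
    /// `not flag_col`
    NotColumn,
}

/// Returns `true` if the row group might contain a matching row and so
/// must be scanned; returns `false` if it can safely be pruned.
pub fn may_match(pred: BoolPredicate, stats: BoolStats) -> bool {
    match pred {
        // With false < true, some row is true only if the max is true.
        BoolPredicate::Column => stats.max,
        // Some row is false only if the min is false.
        BoolPredicate::NotColumn => !stats.min,
    }
}
```

Under these assumptions, a row group whose statistics are `min = false, max = false` can be skipped for `flag_col`, and one with `min = true, max = true` can be skipped for `not flag_col`.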
   
   It does *not* add code to handle `flag_col = true` and `flag_col != false` 
(which currently error and continue to do so) as those are simplified in the 
ConstantEvaluation pass. 
   
   This ended up being a larger change than I wanted because the logic to 
create `col_min` and `col_max` references was intertwined in 
`PruningExpressionBuilder`.
   
   # Rationale for this change
   See https://github.com/apache/arrow-datafusion/issues/490
   
   
   # What changes are included in this PR?
   
   Major changes:
   1. Encapsulate `stat_column_req` into a new `RequiredStatColumns` struct
   2. Move expression reference and rewriting logic to `StatisticsColumns`
   3. Add rules for boolean columns
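
As a rough sketch of what the encapsulation in items 1 and 2 could look like, the snippet below keeps the list of required statistics and the creation of `col_min` / `col_max` style references in one place. The names `StatKind` and `StatColumnRegistry` and the method signatures are hypothetical, chosen for illustration; they are not the API added by this PR.

```rust
/// Which statistic of a column is needed. Hypothetical enum for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StatKind {
    Min,
    Max,
}

/// Tracks which (column, statistic) pairs a pruning expression refers to.
/// Hypothetical stand-in for the kind of struct described in the list above.
#[derive(Debug, Default)]
pub struct StatColumnRegistry {
    required: Vec<(String, StatKind)>,
}

impl StatColumnRegistry {
    /// Records that `column`'s statistic is needed (once per pair) and
    /// returns the name of the statistics column to reference instead,
    /// e.g. `flag_col_max`.
    pub fn reference(&mut self, column: &str, kind: StatKind) -> String {
        let key = (column.to_string(), kind);
        if !self.required.contains(&key) {
            self.required.push(key);
        }
        let suffix = match kind {
            StatKind::Min => "min",
            StatKind::Max => "max",
        };
        format!("{}_{}", column, suffix)
    }

    /// The statistics that must be supplied to evaluate the rewritten expression.
    pub fn required(&self) -> &[(String, StatKind)] {
        &self.required
    }
}
```

Keeping both responsibilities in one struct means a rewrite rule only has to ask for a reference, and the bookkeeping of which statistics to load follows automatically.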
   
   # Are there any user-facing changes?
   Additional predicates (bare boolean columns such as `flag_col` and `not flag_col`) can now be used to prune row groups.
   

