SubhamSinghal opened a new pull request, #21647:
URL: https://github.com/apache/datafusion/pull/21647

   ## Which issue does this PR close?
   
   - Closes #.

   ## Rationale for this change

   `PruningPredicate` currently cannot prune Parquet row groups for predicates
   that contain arithmetic expressions such as `col + 5 > 10` or
   `date_col + INTERVAL '30 days' > '2024-01-01'`. The `rewrite_expr_to_prunable`
   function handles only plain columns, `CAST`, `TRY_CAST`, negation, and `NOT`;
   an arithmetic `BinaryExpr` falls through to the "can't prune" path, so every
   row group is scanned.

   This matters most for date/timestamp arithmetic in WHERE clauses (for example,
   `WHERE order_date + INTERVAL '30 days' > CURRENT_DATE`), which is common in
   analytics queries on Parquet tables.
                                                                                
                                                    
   ## What changes are included in this PR?

   Added support for arithmetic expressions (`+`, `-`) in
   `rewrite_expr_to_prunable`. The approach is "evaluate on min/max": the
   arithmetic expression is passed through as the `column_expr`, and the existing
   `rewrite_column_expr` machinery substitutes `col` with `col_max` or `col_min`
   inside the arithmetic, producing predicates like `(col_max + 5) > 10`.
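
   To illustrate the "evaluate on min/max" idea, here is a minimal standalone sketch in plain Rust. It does not use DataFusion's API: `RowGroupStats`, `Cmp`, and `may_match` are hypothetical names introduced only for this example, and it assumes an `i64` column with a constant offset. Treating overflow as "keep the row group" is likewise an assumption of the sketch, not something the PR description states.

```rust
/// Hypothetical per-row-group statistics for one i64 column
/// (illustration only, not a DataFusion type).
#[derive(Debug, Clone, Copy)]
struct RowGroupStats {
    min: i64,
    max: i64,
}

/// Comparison operators covered by this sketch.
#[derive(Debug, Clone, Copy)]
enum Cmp {
    Gt,
    Lt,
}

/// Returns `true` if the row group might contain a row satisfying
/// `col + offset <cmp> literal`, and `false` if it can be skipped.
///
/// `col + offset` is monotonically increasing in `col`, so it is enough
/// to evaluate the arithmetic on `max` for `>` and on `min` for `<`.
/// A subtraction `col - c` is expressed as a negative `offset`.
fn may_match(stats: RowGroupStats, offset: i64, cmp: Cmp, literal: i64) -> bool {
    // Substitute col -> col_max (for `>`) or col_min (for `<`).
    let bound = match cmp {
        Cmp::Gt => stats.max.checked_add(offset),
        Cmp::Lt => stats.min.checked_add(offset),
    };
    match (bound, cmp) {
        (Some(b), Cmp::Gt) => b > literal,
        (Some(b), Cmp::Lt) => b < literal,
        // Assumption: if the arithmetic overflows, stay conservative
        // and keep the row group rather than risk a wrong prune.
        (None, _) => true,
    }
}
```

   For `col + 5 > 10`, a row group with `max = 5` yields `(5 + 5) > 10`, which is false, so the row group can be skipped; with `max = 6` the rewritten predicate is true and the row group must be read.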
   
   ## Are these changes tested?

   Yes, with unit tests.
   
   ## Are there any user-facing changes?                                        
                                                  
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

