alamb opened a new issue, #18320:
URL: https://github.com/apache/datafusion/issues/18320

   ### Is your feature request related to a problem or challenge?
   
   As mentioned in https://github.com/apache/datafusion/issues/18319, it is 
much easier for optimizers to reason about predicates of the form 
`<col> op <constant>`. They often can't optimize nearly as well when the 
column is wrapped in a scalar function.
   
   This includes DataFusion's 
[PruningPredicate](https://docs.rs/datafusion/latest/datafusion/physical_optimizer/pruning/struct.PruningPredicate.html#contains-analysis-and-minmax-rewrite)
   
   For example, the predicate looking for a particular year
   ```sql
   WHERE EXTRACT (YEAR FROM k) = 2024
   ```
   
   Can be rewritten as
   ```sql
   k >= '2024-01-01' AND k < '2025-01-01'
   ```
   
   Then `k` is easier to push down and is subject to range analysis, etc.
   
   The ClickHouse paper, https://www.vldb.org/pvldb/vol17/p3731-schulze.pdf, 
uses the term "preimage" (from the [mathematical 
term](https://en.wikipedia.org/wiki/Image_(mathematics))) for this rewrite. I 
think `toYear(k)` is the equivalent of `EXTRACT(YEAR FROM k)`.
   
   
   >  Second, some functions can compute the preimage of a given function 
result. This is used to replace comparisons of constants with function calls on 
the key columns by comparing the key column value with the preimage. For 
example, toYear(k) = 2024 can be replaced by k >= 2024-01-01 && k < 2025-01-01.
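
To make the rewrite concrete, here is a minimal sketch of the preimage computation in plain Rust. It is not DataFusion code: `days_from_civil`, `year_preimage`, `rewrite_year_cmp`, `Op`, and `Rewritten` are hypothetical names, and it assumes `k` is a date stored as days since 1970-01-01 (similar to Arrow's Date32) with no time zone handling. It also shows how the other comparison operators could map onto the same half-open interval.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm). Hypothetical helper.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    // Treat January and February as the last months of the previous year.
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, in [0, 399]
    let mp = (m + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + d - 1; // day of year, in [0, 365]
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Preimage of `EXTRACT(YEAR FROM k) = year` as a half-open range
/// [lo, hi) of day numbers.
fn year_preimage(year: i64) -> (i64, i64) {
    (days_from_civil(year, 1, 1), days_from_civil(year + 1, 1, 1))
}

/// Comparison operator in `EXTRACT(YEAR FROM k) <op> <year>`.
enum Op {
    Eq,
    Lt,
    LtEq,
    Gt,
    GtEq,
}

/// The rewritten predicate, expressed directly on the column `k`.
#[derive(Debug, PartialEq)]
enum Rewritten {
    /// lo <= k AND k < hi
    Range { lo: i64, hi: i64 },
    /// k < bound
    Lt(i64),
    /// k >= bound
    GtEq(i64),
}

/// Rewrite `EXTRACT(YEAR FROM k) <op> year` into a predicate on `k`.
fn rewrite_year_cmp(op: Op, year: i64) -> Rewritten {
    let (lo, hi) = year_preimage(year);
    match op {
        Op::Eq => Rewritten::Range { lo, hi },
        Op::Lt => Rewritten::Lt(lo),
        Op::LtEq => Rewritten::Lt(hi),
        Op::Gt => Rewritten::GtEq(hi),
        Op::GtEq => Rewritten::GtEq(lo),
    }
}
```

Under these assumptions, `year_preimage(2024)` covers 2024-01-01 (inclusive) up to 2025-01-01 (exclusive), which is the range in the SQL rewrite above. Using a half-open interval means the upper bound is simply the first day of the next year, so leap years need no special handling.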
   
   ### Describe the solution you'd like
   
   I would like DataFusion to do this rewrite too.
   
   
   ### Describe alternatives you've considered
   
   This might be possible by implementing 
[ScalarUDFImpl::simplify](https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.ScalarUDFImpl.html#method.simplify)
 for `date_part`:
   
https://github.com/apache/datafusion/blob/main/datafusion/functions/src/datetime/date_part.rs
   
   ### Additional context
   
   _No response_

