drin commented on PR #18648:
URL: https://github.com/apache/datafusion/pull/18648#issuecomment-3993073808
Well, `date_trunc('month', date_col) < '2026-02-23'` can be optimized by
this framework, and if you fall back on not using `preimage` then you can't
optimize it at all. The literal, `'2026-02-23'`, could come from a variable;
then, it's not an edge case at all, it's just a typical engineering scenario:
`df = ctx.sql(f"SELECT date_trunc('month', date_col) < {date_literal}")`. It
seems like an oversight to me to require a user to normalize `date_literal` in
this case as a prerequisite to getting faster query performance. It's also
likely to lead to tribal knowledge: "if you are using `x` function, then you
can get better optimization if you do `y` to align it to a boundary."
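
To make the unaligned case concrete, here is a minimal sketch of the bound such a rewrite would need for `<`. It is not code from this PR: `Date`, `trunc_month`, `next_month_start`, and `lt_bound` are hypothetical names, and dates are modeled as plain `(year, month, day)` tuples rather than Arrow types.

```rust
/// A calendar date as (year, month, day). Tuples compare lexicographically,
/// which matches chronological order. Hypothetical stand-in for a real date type.
type Date = (i32, u32, u32);

/// Stand-in for `date_trunc('month', d)`: the first day of `d`'s month.
fn trunc_month(d: Date) -> Date {
    (d.0, d.1, 1)
}

/// First day of the month that follows `d`'s month.
fn next_month_start(d: Date) -> Date {
    if d.1 == 12 {
        (d.0 + 1, 1, 1)
    } else {
        (d.0, d.1 + 1, 1)
    }
}

/// Returns the bound `b` such that `trunc_month(d) < lit` holds exactly when `d < b`.
/// An aligned literal keeps its own value; an unaligned one rounds up to the
/// start of the next month.
fn lt_bound(lit: Date) -> Date {
    let lower = trunc_month(lit);
    if lit == lower {
        lower
    } else {
        next_month_start(lower)
    }
}
```

Under these assumptions, `lt_bound((2026, 2, 23))` is `(2026, 3, 1)`, so `date_trunc('month', date_col) < '2026-02-23'` would become `date_col < '2026-03-01'` without the user having to normalize the literal first.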
The extra complexity is:
1. An extra comparison (`is_boundary = value == lower`)
2. An extra bool attribute in the `PreimageResult::Range` enum
3. A few extra match arms (`Operator::Eq && is_boundary`)
and the optimization potentially:
1. removes the predicate altogether
2. removes the operator (and potentially the plan sub-tree)
Considering the complexity of a time type vs timestamp type amplified by the
date trunc granularity, I think the extra complexity cost is basically 0 (for
`date_part`). So, to me it's clear that it's worth the cost.
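
A rough sketch of what those three additions could look like, using integer bucketing (`trunc(x, width)`) in place of `date_trunc` to keep it self-contained. All of it is an assumption for illustration: the `Operator`, `PreimageResult`, and `Rewritten` types here are hypothetical and are not DataFusion's actual definitions, NULL semantics are ignored, and overflow of `lower + width` is not handled.

```rust
/// Comparison operators covered by the sketch (hypothetical subset).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Operator { Eq, NotEq, Lt, LtEq, Gt, GtEq }

/// Hypothetical preimage of `trunc(x) == value`: the half-open range
/// [lower, upper), plus a flag saying whether `value` sits on a boundary.
#[derive(Debug, PartialEq)]
enum PreimageResult {
    None,
    Range { lower: i64, upper: i64, is_boundary: bool },
}

/// Predicate on the bare column `x` after the rewrite.
#[derive(Debug, PartialEq)]
enum Rewritten {
    Always(bool),         // predicate folds to a constant (NULLs ignored here)
    Lt(i64),              // x < bound
    GtEq(i64),            // x >= bound
    InRange(i64, i64),    // lower <= x < upper
    NotInRange(i64, i64), // x < lower OR x >= upper
}

/// Rounds `x` down to a multiple of `width` (stand-in for a truncating function).
fn trunc(x: i64, width: i64) -> i64 {
    x - x.rem_euclid(width)
}

fn preimage(value: i64, width: i64) -> PreimageResult {
    if width <= 0 {
        return PreimageResult::None;
    }
    let lower = trunc(value, width);
    // The one extra comparison: is the literal already aligned?
    PreimageResult::Range { lower, upper: lower + width, is_boundary: value == lower }
}

/// Rewrites `trunc(x) <op> value` into a predicate on `x` alone.
fn rewrite(op: Operator, p: &PreimageResult) -> Option<Rewritten> {
    let (lower, upper, is_boundary) = match *p {
        PreimageResult::Range { lower, upper, is_boundary } => (lower, upper, is_boundary),
        PreimageResult::None => return None,
    };
    Some(match (op, is_boundary) {
        (Operator::Eq, true) => Rewritten::InRange(lower, upper),
        // An unaligned literal can never equal a truncated value.
        (Operator::Eq, false) => Rewritten::Always(false),
        (Operator::NotEq, true) => Rewritten::NotInRange(lower, upper),
        (Operator::NotEq, false) => Rewritten::Always(true),
        (Operator::Lt, true) => Rewritten::Lt(lower),
        (Operator::Lt, false) | (Operator::LtEq, _) => Rewritten::Lt(upper),
        (Operator::GtEq, true) => Rewritten::GtEq(lower),
        (Operator::GtEq, false) | (Operator::Gt, _) => Rewritten::GtEq(upper),
    })
}

/// Evaluates a rewritten predicate for a concrete `x`.
fn holds(r: &Rewritten, x: i64) -> bool {
    match *r {
        Rewritten::Always(b) => b,
        Rewritten::Lt(b) => x < b,
        Rewritten::GtEq(b) => x >= b,
        Rewritten::InRange(lo, hi) => lo <= x && x < hi,
        Rewritten::NotInRange(lo, hi) => x < lo || x >= hi,
    }
}
```

In this sketch the `Always(..)` arms are where the predicate disappears entirely, and only four of the arms actually depend on `is_boundary`; the rest are shared between the aligned and unaligned cases.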
Related:
- duckdb accommodates this (https://github.com/duckdb/duckdb/pull/18457)
- This PR accommodates the same change for
[floor](https://github.com/apache/datafusion/pull/18648/changes#diff-077176fcf22cb36a0a51631a43739f5f015f46305be4f49142a450e25b152b84),
although its logic would have to be modified to return a proper value
instead of `None` to accommodate these cases