pepijnve commented on code in PR #18821:
URL: https://github.com/apache/datafusion/pull/18821#discussion_r2542482873
##########
datafusion/expr/src/expr_rewriter/guarantees.rs:
##########
@@ -48,199 +34,282 @@ impl<'a> GuaranteeRewriter<'a> {
guarantees: impl IntoIterator<Item = &'a (Expr, NullableInterval)>,
) -> Self {
Self {
- // TODO: Clippy wants the "map" call removed, but doing so generates
- // a compilation error. Remove the clippy directive once this
- // issue is fixed.
- #[allow(clippy::map_identity)]
guarantees: guarantees.into_iter().map(|(k, v)| (k, v)).collect(),
}
}
}
+/// Rewrite expressions to incorporate guarantees.
+///
+/// Guarantees are a mapping from an expression (which currently is always a
+/// column reference) to a [NullableInterval] that represents the known possible
+/// values of the expression.
+///
+/// Rewriting expressions using this type of guarantee can make the work of other expression
+/// simplifications, like const evaluation, easier.
+///
+/// For example, if we know that a column is not null and has values in the
+/// range [1, 10), we can rewrite `x IS NULL` to `false` or `x < 10` to `true`.
+///
+/// If the set of guarantees will be used to rewrite more than one expression, consider using
+/// [rewrite_with_guarantees_map] instead.
+///
+/// A full example of using this rewrite rule can be found in
+/// [`ExprSimplifier::with_guarantees()`](https://docs.rs/datafusion/latest/datafusion/optimizer/simplify_expressions/struct.ExprSimplifier.html#method.with_guarantees).
+pub fn rewrite_with_guarantees<'a>(
+ expr: Expr,
+ guarantees: impl IntoIterator<Item = &'a (Expr, NullableInterval)>,
+) -> Result<Transformed<Expr>> {
+ let guarantees_map: HashMap<&Expr, &NullableInterval> =
+ guarantees.into_iter().map(|(k, v)| (k, v)).collect();
+ rewrite_with_guarantees_map(expr, &guarantees_map)
+}
+
+/// Rewrite expressions to incorporate guarantees.
+///
+/// Guarantees are a mapping from an expression (which currently is always a
+/// column reference) to a [NullableInterval]. The interval represents the known
+/// possible values of the column.
+///
+/// For example, if we know that a column is not null and has values in the
+/// range [1, 10), we can rewrite `x IS NULL` to `false` or `x < 10` to `true`.
+pub fn rewrite_with_guarantees_map<'a>(
Review Comment:
Copy/paste of comment by @alamb at https://github.com/apache/datafusion/pull/17813#discussion_r2542032635 here.
> Unless there is a good reason, I think we should avoid removing this API as it will cause unnecessary churn on downstream crates
>
> If you find rewrite_with_guarantees easier to work with, maybe you leave GuaranteeRewriter in place but implement rewrite_with_guarantees in terms of that
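
For illustration only, here is a rough sketch of the shape that suggestion could take: the rewriter type stays public and the free function becomes a thin wrapper around it. The `Expr` enum and the `NotNullRange` struct below are simplified stand-ins invented for this sketch, not DataFusion's real `Expr` and `NullableInterval`, and the `rewrite` method is a hypothetical toy that only looks at the top-level node instead of implementing `TreeNodeRewriter`.

```rust
use std::collections::HashMap;

// Simplified stand-in for DataFusion's `Expr` (assumption for this sketch).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Expr {
    Column(String),
    Literal(i64),
    Bool(bool),
    IsNull(Box<Expr>),
    Lt(Box<Expr>, Box<Expr>),
}

// Hypothetical stand-in for `NullableInterval`: the value is known to be
// non-null and to lie in the half-open range [low, high).
#[derive(Debug, Clone, PartialEq)]
struct NotNullRange {
    low: i64,
    high: i64,
}

// The existing public type is kept, so downstream code that names it
// continues to compile.
struct GuaranteeRewriter<'a> {
    guarantees: HashMap<&'a Expr, &'a NotNullRange>,
}

impl<'a> GuaranteeRewriter<'a> {
    fn new(guarantees: impl IntoIterator<Item = &'a (Expr, NotNullRange)>) -> Self {
        Self {
            guarantees: guarantees.into_iter().map(|(k, v)| (k, v)).collect(),
        }
    }

    // Toy rewrite of the top-level node only (no recursion into children).
    fn rewrite(&self, expr: Expr) -> Expr {
        match expr {
            // A guaranteed expression is never null.
            Expr::IsNull(inner) if self.guarantees.contains_key(&*inner) => Expr::Bool(false),
            Expr::Lt(lhs, rhs) => {
                // Decide first, then rebuild, so the borrows end before the move.
                let known = match (self.guarantees.get(&*lhs), &*rhs) {
                    // Every value is below `high`, so `x < v` holds when v >= high.
                    (Some(range), Expr::Literal(v)) if *v >= range.high => Some(true),
                    // Every value is at least `low`, so `x < v` fails when v <= low.
                    (Some(range), Expr::Literal(v)) if *v <= range.low => Some(false),
                    _ => None,
                };
                match known {
                    Some(b) => Expr::Bool(b),
                    None => Expr::Lt(lhs, rhs),
                }
            }
            other => other,
        }
    }
}

// The convenience function delegates to the rewriter instead of replacing it.
fn rewrite_with_guarantees<'a>(
    expr: Expr,
    guarantees: impl IntoIterator<Item = &'a (Expr, NotNullRange)>,
) -> Expr {
    GuaranteeRewriter::new(guarantees).rewrite(expr)
}
```

With this layout, callers who only need a one-off rewrite can use the function, while existing users of `GuaranteeRewriter` see no API change.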