adriangb commented on code in PR #19188:
URL: https://github.com/apache/datafusion/pull/19188#discussion_r2596554664
##########
datafusion/optimizer/src/eliminate_group_by_constant.rs:
##########
@@ -91,25 +91,20 @@ impl OptimizerRule for EliminateGroupByConstant {
}
}
-/// Checks if expression is constant, and can be eliminated from group by.
+/// Checks if expression is constant within a query, and can be eliminated
from group by.
///
-/// Intended to be used only within this rule, helper function, which heavily
-/// relies on `SimplifyExpressions` result.
+/// An expression is considered constant for GROUP BY purposes if it has the
same
+/// value for all rows within a single query execution. This includes:
+/// - [`ExprVolatility::Constant`]: Expressions that can be evaluated at
planning time
+/// - [`ExprVolatility::Stable`]: Expressions like `now()` that are constant
within a query
+///
+/// Note: `Immutable` expressions are NOT included because they depend on
input data
+/// and may have different values for different rows.
fn is_constant_expression(expr: &Expr) -> bool {
- match expr {
- Expr::Alias(e) => is_constant_expression(&e.expr),
- Expr::BinaryExpr(e) => {
- is_constant_expression(&e.left) && is_constant_expression(&e.right)
- }
- Expr::Literal(_, _) => true,
- Expr::ScalarFunction(e) => {
- matches!(
- e.func.signature().volatility,
- Volatility::Immutable | Volatility::Stable
- ) && e.args.iter().all(is_constant_expression)
- }
- _ => false,
- }
+ matches!(
+ expr.volatility(),
+ ExprVolatility::Constant | ExprVolatility::Stable
+ )
Review Comment:
I don't love this, it's a bit confusing. I wonder if the right distinction
is `Stable with no inputs` (e.g. `now()`?
##########
datafusion/expr-common/src/signature.rs:
##########
@@ -84,6 +84,220 @@ pub enum Volatility {
Volatile,
}
+/// Describes the volatility of an expression, considering both the expression
+/// type and its children.
+///
+/// This is distinct from [`Volatility`] which describes function behavior.
+/// `ExprVolatility` describes the overall expression behavior after
considering
+/// the expression structure and all its inputs.
+///
+/// The variants are ordered from least volatile to most volatile:
+/// `Constant < Immutable < Stable < Volatile`
+///
+/// # Examples
+///
+/// ```
Review Comment:
```suggestion
/// ```rust
```
##########
datafusion/expr-common/src/signature.rs:
##########
@@ -84,6 +84,220 @@ pub enum Volatility {
Volatile,
}
+/// Describes the volatility of an expression, considering both the expression
+/// type and its children.
+///
+/// This is distinct from [`Volatility`] which describes function behavior.
+/// `ExprVolatility` describes the overall expression behavior after
considering
+/// the expression structure and all its inputs.
+///
+/// The variants are ordered from least volatile to most volatile:
+/// `Constant < Immutable < Stable < Volatile`
+///
+/// # Examples
+///
+/// ```
+/// use datafusion_expr_common::signature::ExprVolatility;
+///
+/// // Ordering comparison
+/// assert!(ExprVolatility::Constant < ExprVolatility::Immutable);
+/// assert!(ExprVolatility::Immutable < ExprVolatility::Stable);
+/// assert!(ExprVolatility::Stable < ExprVolatility::Volatile);
+///
+/// // Combining volatilities takes the maximum
+/// let vol = ExprVolatility::Constant.max(ExprVolatility::Immutable);
+/// assert_eq!(vol, ExprVolatility::Immutable);
+/// ```
+#[derive(Debug, PartialEq, Eq, PartialOrd, Ord, Clone, Copy, Hash)]
+pub enum ExprVolatility {
+ /// The expression always evaluates to the same value regardless of input.
+ ///
+ /// Examples:
+ /// - Literals: `1`, `'hello'`, `NULL`
+ /// - Constant expressions: `1 + 2`, `abs(-5)`
+ /// - Immutable functions with all constant arguments: `concat('a', 'b')`
+ ///
+ /// Constant expressions can be evaluated at planning time.
+ Constant,
+
+ /// The expression returns the same output for the same input values.
+ ///
+ /// Examples:
+ /// - Column references: `col("a")`
+ /// - Immutable functions with non-constant arguments: `abs(col("a"))`
+ /// - Binary operations on columns: `col("a") + col("b")`
+ ///
+ /// Immutable expressions cannot be evaluated at planning time but are
+ /// deterministic given the input data.
+ Immutable,
+
+ /// The expression returns the same value within a single query execution
+ /// but may return different values across queries.
+ ///
+ /// Examples:
+ /// - `now()`, `current_date()`, `current_timestamp()`
+ /// - Stable functions with any arguments
+ ///
+ /// Stable expressions can be evaluated once per query execution.
+ Stable,
+
+ /// The expression may return different values on each evaluation.
+ ///
+ /// Examples:
+ /// - `random()`, `uuid()`
+ /// - Any expression containing a volatile function
+ ///
+ /// Volatile expressions must be evaluated for each row.
+ Volatile,
+}
+
+impl ExprVolatility {
+ /// Combines two volatilities, returning the "highest" (most volatile).
+ ///
+ /// This is used when computing the volatility of expressions with
+ /// multiple children (e.g., binary expressions, function arguments).
+ ///
+ /// # Examples
+ ///
+ /// ```
Review Comment:
```suggestion
/// ```rust
```
##########
datafusion/ffi/src/physical_expr/mod.rs:
##########
@@ -123,6 +124,8 @@ pub struct FFI_PhysicalExpr {
pub is_volatile_node: unsafe extern "C" fn(&Self) -> bool,
+ pub node_volatility: unsafe extern "C" fn(&Self) -> FFI_ExprVolatility,
Review Comment:
This looks reasonable but is my first time doing FFI work in DataFusion
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]