alamb opened a new issue, #23264:
URL: https://github.com/apache/datafusion/issues/23264

   ### Is your feature request related to a problem or challenge?
   
   DataFusion represents arbitrary binary Operator trees like 
   ```sql
   (a OR (b OR (c OR (d OR ...)))
   ```
   
   Using a binary tree structure called `BinaryExpr`:
   
   
https://github.com/apache/datafusion/blob/2da88876f6e764ca3052c92b9d1783929d13d7bf/datafusion/expr/src/expr.rs#L770-L779
   
   There are also many places in the codebase where recursive algorithms are 
used to walk over Expr trees, 
   
   An example of this is `Treenode::apply`: 
https://github.com/apache/datafusion/blob/0838a4ddb902535b0e95a1c5a254be7e9c7fe9bf/datafusion/common/src/tree_node.rs#L196-L209
   
   
   The stack depth is a function of the nesting depth of the expression, which 
means that
   
   > [!IMPORTANT]
   > In some cases processing deeply nested expressions can lean to a stack 
overflow (and thus a process `abort`)
   
   This shows up in practice from Machine-generated SQL/filters with thousands 
of OR/AND terms (e.g. IN lists rewritten to OR, or large predicate pushdown
   
   This representation has served us quite well, but we do have several 
mitigations
   
   1. [`recursive` crate](https://crates.io/crates/recursive) + 
recursive_protection feature (segmented stack growth)
   
   The primary mitigation. Recursive TreeNode drivers are annotated 
`#[cfg_attr(feature = "recursive_protection", recursive::recursive)]`, which 
uses the recursive crate  to grow the stack on the heap when it runs low, 
instead of overflowing. See the TreeNode walkers (6 sites — 
apply/transform_down/transform_up/rewrite/etc.): 
https://github.com/apache/datafusion/blob/0838a4ddb902535b0e95a1c5a254be7e9c7fe9bf/datafusion/common/src/tree_node.rs#L196-L209
   
   2. SQL parser recursion limit (bounded recursion → error, not crash)
   
   The [sqlparser crate ](https://crates.io/crates/sqlparser) itself caps 
nesting depth (default 50) and returns RecursionLimitExceeded rather than 
overflowing the stack.
   
   3. Explicit-stack (iterative) SQL→Expr conversion
   - See issue #1444
   
   We also hit this a bunch when converting a 
[sqlparser](https://crates.io/crates/sqlparser) binary-operator tree into 
`Expr` and have a stack based conversion approach (see 
https://github.com/apache/datafusion/blob/0838a4ddb902535b0e95a1c5a254be7e9c7fe9bf/datafusion/sql/src/expr/mod.rs#L76-L120)
   
   4. Out-of-line match arms in the expression simplifier
   
   Large match arms are extracted into separate functions specifically to 
shrink per-frame stack usage and avoid overflow during simplification.
   
   - 
https://github.com/apache/datafusion/blob/0838a4ddb902535b0e95a1c5a254be7e9c7fe9bf/datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs#L2335-L2341
 
   
   
   Recently people seem to be hitting recursion limits again and trying to put 
in patches, such as 
   * https://github.com/apache/datafusion/pull/23198  from @mdashti
   * https://github.com/apache/datafusion/pull/22943 from @nathanb
   
   ### Describe the solution you'd like
   
   I would like a more fool proof way to avoid stack overflow issues
   
   
   
   ### Describe alternatives you've considered
   
   ## Option 1: n-ary expressions
   
   The approach taken by DuckDb Mysql and Postgres is to represent `AND` and 
`OR` as n-ary rather than binary (2)
   
   So that would be something like this (could also have an enum for And/Or)
   ```rust
   enum ConjunctionType { And, Or }
   /// AND/OR of some number of expressions
   struct Conjunction {
     conjunction_type: ConjunctionType,
     /// Logically `exprs[0] AND (exprs[1] AND (exprs[2] AND ...`
     exprs : Vec<Expr>.
   }
   ```
   
   And then  
   ```rust
   /// extend the Expr enum
   enum Expr { 
     ...
     // AND
     Conjunction(Conjunction),
    ...
   }
   ```
   
   ### Examples:
   
   DuckDB uses ConjunctionExpression: 
https://github.com/duckdb/duckdb/blob/b8a06e4a22672e254cd0baa68a3dbed2eb51c56e/src/include/duckdb/parser/expression/conjunction_expression.hpp#L17-L27
   
   Postrgres uses BoolExpr 
https://github.com/postgres/postgres/blob/REL_17_0/src/include/nodes/primnodes.h#L929-L942
   
   Mysql uses [`Item_cond_and` / 
`Item_cond_or`](https://github.com/mysql/mysql-server/blob/mysql-8.0.40/sql/item_cmpfunc.h#L2711-L2768)
  (based on [Item_cond base which stores args as a  List<Item> 
list](https://github.com/mysql/mysql-server/blob/mysql-8.0.40/sql/item_cmpfunc.h#L2421-L2447);
   
   ### Cons
   This would be a major downstream API change 
   Also unless we removed Operator::And / Operator::Or we would potentially 
have multiple ways to express the same logical expression
   
   
   ## Keep Binary Expr, have some sort of "balance" operation
   Another approach taken by spark is to use Binary expressions: 
https://github.com/apache/spark/blob/fa33ea000a0bda9e5a3fa1af98e8e85b8cc5e4d4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L828
   
   
[`buildBalancedPredicate`](https://github.com/apache/spark/blob/fa33ea000a0bda9e5a3fa1af98e8e85b8cc5e4d4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L160-L184)
 rebuilds a predicate list into a height-balanced tree in 
    - https://github.com/apache/spark/pull/31724
   
   ### Additional context
   
   Recent PRs 
   * https://github.com/apache/datafusion/pull/23198  from @mdashti
   * https://github.com/apache/datafusion/pull/22943 from @nathanb
   
   🎯 Umbrella / strategic (open)
   
   - #16788 — the epic to explore iterative rewrites / Box-ing / n-ary to 
remove the need for #[recursive] and regain the 1–2% planning overhead. 
   - #14857 
   - #19787 
   - #14118 
   
   🐛 Open stack-overflow bugs (motivating cases)
   
   - #22229 
   - #22936
   - #8900 
   - #23246
   
   📜 Closed — the history that motivated the current mitigations
   
   - #1444 
   - #1434 
   - #9375 
   - #11102 
   - #16030
   - #4066
   - #3968
   - #23056
   - #9573
   - #9373
   - #15087
   - TPC-DS planning overflows: #6277, #6264, #6040, #4786, #4065, #8837


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to