Emily Sun created SPARK-57194:
---------------------------------
Summary: Add earlyOperatorOptimizationRules extension point to
Optimizer
Key: SPARK-57194
URL: https://issues.apache.org/jira/browse/SPARK-57194
Project: Spark
Issue Type: New Feature
Components: SQL
Affects Versions: 3.4.1
Reporter: Emily Sun
h2. Problem
Custom optimizer rules injected via *SparkSessionExtensions.injectOptimizerRule*
run inside the fixed-point {*}Operator Optimization batch{*}, alongside built-in
rewriters like {_}FoldablePropagation, ConstantFolding, and
PushDownPredicates{_}{*}.{*}
A custom rule that needs to observe the original plan shape (e.g. cross-side
join
predicates before they are folded into single-side constants) can be silently
defeated when a built-in rule transforms the plan first within the same fixed
point.
Existing extension points don't cover this:
* *extendedOperatorOptimizationRules:* runs inside the same fixed-point batch
* *extendedResolutionRules / postHocResolutionRules:* analyzer phase, too early
* *earlyScanPushDownRules:* runs after optimization, scoped to scan pushdown
h2. Proposed change
Add *earlyOperatorOptimizationRules* on {*}Optimizer{*}, executed in a *Once*
batch named "Early Operator Optimization" placed between *Replace Operators*
and *Aggregate,* before the fixed-point Operator Optimization batch.
Wired through *SparkSessionExtensions.injectEarlyOptimizerRule* and
{*}BaseSessionStateBuilder{*}. The batch is a no-op when no rule is registered.
h2. Compatibility
Purely additive: no existing API or batch ordering changes; default is \{{Nil}}.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]