Emily Sun created SPARK-57194:
---------------------------------

             Summary: Add earlyOperatorOptimizationRules extension point to Optimizer
                 Key: SPARK-57194
                 URL: https://issues.apache.org/jira/browse/SPARK-57194
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.4.1
            Reporter: Emily Sun


h2. Problem

Custom optimizer rules injected via *SparkSessionExtensions.injectOptimizerRule*
run inside the fixed-point *Operator Optimization* batch, alongside built-in
rewriters such as _FoldablePropagation_, _ConstantFolding_, and _PushDownPredicates_.
A custom rule that needs to observe the original plan shape (e.g. cross-side join
predicates before they are folded into single-side constants) can be silently
defeated when a built-in rule transforms the plan first within the same
fixed-point batch.
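
The failure mode can be illustrated with a toy model. This is only a sketch in plain Python, not Spark code: a plan is reduced to a list of equality predicates, and {{propagate_constants}}, {{make_observer}} and {{run_fixed_point}} are hypothetical stand-ins for a built-in rewriter, a custom rule and the fixed-point batch. It assumes the built-in rule is applied before the custom rule within each iteration.

```python
# Toy model, not Spark code. A "plan" is a list of (left, right) equality
# predicates; strings like "a.x" are column references, ints are literals.

def side(term):
    """Return the table alias of a column reference, or None for a literal."""
    return term.split(".")[0] if isinstance(term, str) else None

def is_cross_side(pred):
    """True when both operands are columns that belong to different tables."""
    left, right = pred
    return None not in (side(left), side(right)) and side(left) != side(right)

def propagate_constants(preds):
    """Hypothetical stand-in for a built-in rewriter: inline col = literal."""
    known = {l: r for l, r in preds if side(l) is not None and side(r) is None}
    out = []
    for l, r in preds:
        if side(r) is None:
            out.append((l, r))  # keep the defining predicate col = literal
        else:
            out.append((known.get(l, l), known.get(r, r)))
    return out

def make_observer(seen):
    """Hypothetical custom rule: record cross-side predicates, change nothing."""
    def observe(preds):
        seen.extend(p for p in preds if is_cross_side(p))
        return preds
    return observe

def run_fixed_point(rules, preds, max_iterations=100):
    """Apply all rules in order until the plan stops changing."""
    for _ in range(max_iterations):
        before = preds
        for rule in rules:
            preds = rule(preds)
        if preds == before:
            break
    return preds

seen = []
plan = [("a.x", "b.y"), ("a.x", 1)]
result = run_fixed_point([propagate_constants, make_observer(seen)], plan)
```

In this model the cross-side predicate {{a.x = b.y}} has already been rewritten to {{1 = b.y}} by the time the observer runs, so {{seen}} stays empty and the custom rule never gets a chance to act.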

Existing extension points don't cover this:
 * *extendedOperatorOptimizationRules:* runs inside the same fixed-point batch
 * *extendedResolutionRules / postHocResolutionRules:* analyzer phase, too early
 * *earlyScanPushDownRules:* runs after optimization, scoped to scan pushdown

h2. Proposed change


Add *earlyOperatorOptimizationRules* on {*}Optimizer{*}, executed in a *Once*
batch named "Early Operator Optimization" placed between *Replace Operators*
and *Aggregate*, before the fixed-point Operator Optimization batch.

Wired through *SparkSessionExtensions.injectEarlyOptimizerRule* and
{*}BaseSessionStateBuilder{*}. The batch is a no-op when no rule is registered.
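
The intended ordering can be sketched with a toy executor. This is plain Python under stated assumptions, not the Spark implementation: {{Batch}}, {{execute}}, {{substitute_literals}}, {{record_column_pairs}} and {{build_batches}} are hypothetical names, and only the two batch names are taken from this proposal.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Toy model, not Spark code. A plan is a list of (left, right) equality
# predicates; strings are column references, anything else is a literal.
Plan = List[Tuple[object, object]]
Rule = Callable[[Plan], Plan]

@dataclass
class Batch:
    name: str
    once: bool                      # True: single pass; False: fixed point
    rules: List[Rule] = field(default_factory=list)

def execute(batches, plan, max_iterations=100):
    """Run batches in order; fixed-point batches repeat until no change."""
    for batch in batches:
        for _ in range(1 if batch.once else max_iterations):
            before = plan
            for rule in batch.rules:
                plan = rule(plan)
            if plan == before:
                break
    return plan

def substitute_literals(plan):
    """Hypothetical stand-in for a built-in rewriter: inline col = literal."""
    known = {l: r for l, r in plan
             if isinstance(l, str) and not isinstance(r, str)}
    return [(l, r) if not isinstance(r, str)
            else (known.get(l, l), known.get(r, r))
            for l, r in plan]

seen = []

def record_column_pairs(plan):
    """Hypothetical early rule: record column-to-column predicates."""
    seen.extend(p for p in plan if all(isinstance(t, str) for t in p))
    return plan

def build_batches(early_rules=()):
    """Place the Once batch ahead of the fixed-point batch."""
    return [
        Batch("Early Operator Optimization", once=True, rules=list(early_rules)),
        Batch("Operator Optimization", once=False, rules=[substitute_literals]),
    ]

plan = [("a.x", "b.y"), ("a.x", 1)]
with_early = execute(build_batches([record_column_pairs]), plan)
without_early = execute(build_batches(), plan)
```

In this model the early rule sees {{a.x = b.y}} before the fixed-point batch rewrites it, and an empty early batch leaves the final plan unchanged, which matches the no-op behaviour described above.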

h2. Compatibility

Purely additive: no existing API or batch ordering changes, and the new rule list defaults to {{Nil}}.



