[PR] Avoid rerunning no-op logical optimizer rules [datafusion]

via GitHub Thu, 21 May 2026 02:44:54 -0700


Dandandan opened a new pull request, #22412:
URL: https://github.com/apache/datafusion/pull/22412


   ## Which issue does this PR close?
   
   - Closes #22411.
   
   ## Rationale for this change
   
   The logical optimizer currently reruns every rule on each pass. If a rule 
already returned `Transformed::no` and no later rule has changed the plan since 
then, rerunning that rule does not add new information and adds planning cost.
   
   Local ClickBench logical planning measurements over all 51 queries, 30 
iterations:
   
   - Baseline optimizer-only: 607.259 us/query
   - Patched optimizer-only: 409.473 us/query and 423.034 us/query
   - Baseline parse/analyze/optimize: 1001.498 us/query
   - Patched parse/analyze/optimize: 822.877 us/query and 844.032 us/query
   
   Across the two patched measurements, that is roughly a 30-33% reduction in 
optimizer-only time and a 16-18% reduction for parse/analyze/optimize.
   
   ## What changes are included in this PR?
   
   This tracks a cheap plan version inside `Optimizer::optimize`. The version 
increments whenever a rule reports `transformed = true`. If a rule previously 
returned `Transformed::no` for the current plan version, the optimizer skips 
rerunning it until some rule changes the plan.
   
   The observer is still invoked for skipped rules so explain/observer behavior 
keeps the same per-rule shape.
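
The scheduling idea described above can be sketched in isolation. This is a minimal illustration and not the code in this PR: `Plan`, `Rule`, `optimize`, and the two example rules are hypothetical stand-ins, the plan is modeled as a plain integer, and the `observed` vector only imitates an observer that is still notified for skipped rules.

```rust
/// Hypothetical stand-in for a logical plan (an integer, for illustration).
type Plan = i64;

/// Hypothetical rule: returns the (possibly new) plan and a `transformed` flag.
struct Rule {
    name: &'static str,
    apply: fn(Plan) -> (Plan, bool),
}

/// Example rule: halves a non-zero even plan, otherwise reports no change.
fn halve_if_even(plan: Plan) -> (Plan, bool) {
    if plan != 0 && plan % 2 == 0 {
        (plan / 2, true)
    } else {
        (plan, false)
    }
}

/// Example rule: never changes the plan.
fn never_changes(plan: Plan) -> (Plan, bool) {
    (plan, false)
}

/// Runs `rules` for at most `max_passes` passes. A rule is skipped when it
/// already reported "no change" at the current plan version. Returns the
/// final plan plus one `(rule name, skipped)` entry per rule slot, so
/// skipped rules are still reported.
fn optimize(
    mut plan: Plan,
    rules: &[Rule],
    max_passes: usize,
) -> (Plan, Vec<(&'static str, bool)>) {
    // Cheap version counter: bumped every time a rule transforms the plan.
    let mut version: u64 = 0;
    // Per rule: the plan version at which it last returned "no change".
    let mut last_noop: Vec<Option<u64>> = vec![None; rules.len()];
    let mut observed = Vec::new();

    for _ in 0..max_passes {
        let version_at_pass_start = version;
        for (i, rule) in rules.iter().enumerate() {
            if last_noop[i] == Some(version) {
                // Plan is unchanged since this rule's last no-op: skip it,
                // but still record the slot.
                observed.push((rule.name, true));
                continue;
            }
            let (new_plan, transformed) = (rule.apply)(plan);
            plan = new_plan;
            if transformed {
                version += 1;
            } else {
                last_noop[i] = Some(version);
            }
            observed.push((rule.name, false));
        }
        // Assumption of this sketch: stop once a full pass changes nothing.
        if version == version_at_pass_start {
            break;
        }
    }
    (plan, observed)
}
```

With the rules `[halve_if_even, never_changes]` and a starting plan of `2`, the second pass reruns `halve_if_even` (the plan changed after its last run) but skips `never_changes`, because nothing has changed the plan since it last reported no change.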
   
   ## Are these changes tested?
   
   Existing optimizer behavior is covered by the `datafusion-optimizer` library 
tests. I ran the following checks:
   
   - `cargo fmt --all`
   - `cargo test -p datafusion-optimizer --lib`
   - `cargo clippy --all-targets --all-features -- -D warnings`
   
   I also measured ClickBench logical planning locally using a temporary probe.
   
   ## Are there any user-facing changes?
   
   No API or output changes are intended. This is an optimizer scheduling 
performance improvement.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

