Dandandan opened a new pull request, #22412: URL: https://github.com/apache/datafusion/pull/22412
## Which issue does this PR close?

- Closes #22411.

## Rationale for this change

The logical optimizer currently reruns every rule on each pass. If a rule already returned `Transformed::no` and no later rule has changed the plan since then, rerunning that rule does not add new information and adds planning cost.

Local ClickBench logical planning measurements over all 51 queries, 30 iterations:

- Baseline optimizer-only: 607.259 us/query
- Patched optimizer-only: 409.473 us/query and 423.034 us/query
- Baseline parse/analyze/optimize: 1001.498 us/query
- Patched parse/analyze/optimize: 822.877 us/query and 844.032 us/query

## What changes are included in this PR?

This tracks a cheap plan version inside `Optimizer::optimize`. The version increments whenever a rule reports `transformed = true`. If a rule previously returned `Transformed::no` for the current plan version, the optimizer skips rerunning it until some rule changes the plan.

The observer is still invoked for skipped rules so explain/observer behavior keeps the same per-rule shape.

## Are these changes tested?

Existing optimizer behavior is covered by:

- `cargo fmt --all`
- `cargo test -p datafusion-optimizer --lib`
- `cargo clippy --all-targets --all-features -- -D warnings`

I also measured ClickBench logical planning locally using a temporary probe.

## Are there any user-facing changes?

No API or output changes are intended. This is an optimizer scheduling performance improvement.
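For illustration only, here is a minimal, self-contained sketch of the version-tracking idea described under "What changes are included in this PR?". It is not the code in this PR: `Plan`, `Rule`, `optimize`, `clamp_to_100`, and `never_changes` are hypothetical stand-ins, the plan is modelled as a plain integer rather than a `LogicalPlan`, and the observer call for skipped rules is left out.

```rust
/// Hypothetical stand-in for a logical plan (the real optimizer works on `LogicalPlan`).
type Plan = i64;

/// Hypothetical rule shape: returns the resulting plan and whether it was transformed.
type Rule = fn(Plan) -> (Plan, bool);

/// Example rule (assumption): rewrites any plan above 100 down to 100.
fn clamp_to_100(plan: Plan) -> (Plan, bool) {
    if plan > 100 { (100, true) } else { (plan, false) }
}

/// Example rule (assumption): never changes the plan.
fn never_changes(plan: Plan) -> (Plan, bool) {
    (plan, false)
}

/// Runs `rules` for at most `max_passes` passes.
/// Returns (final plan, rules executed, rules skipped).
fn optimize(mut plan: Plan, rules: &[Rule], max_passes: usize) -> (Plan, usize, usize) {
    // Bumped every time a rule reports a transformation.
    let mut plan_version: u64 = 0;
    // For each rule: the plan version at which it last reported "not transformed".
    let mut no_op_at: Vec<Option<u64>> = vec![None; rules.len()];
    let (mut executed, mut skipped) = (0, 0);

    for _ in 0..max_passes {
        let version_at_pass_start = plan_version;
        for (i, rule) in rules.iter().enumerate() {
            if no_op_at[i] == Some(plan_version) {
                // The plan has not changed since this rule last declined to change it.
                skipped += 1;
                continue;
            }
            let (new_plan, transformed) = rule(plan);
            executed += 1;
            plan = new_plan;
            if transformed {
                plan_version += 1;
            } else {
                no_op_at[i] = Some(plan_version);
            }
        }
        // Stop once a whole pass leaves the plan unchanged.
        if plan_version == version_at_pass_start {
            break;
        }
    }
    (plan, executed, skipped)
}
```

In this sketch, running `optimize(150, &[clamp_to_100, never_changes, never_changes, never_changes], 3)` executes five rule applications and skips three: the second pass reruns only `clamp_to_100`, because the other rules already declined to change the plan at the current version. Comparing a single integer per rule keeps the skip check much cheaper than comparing plans.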
