Dandandan opened a new issue, #22411: URL: https://github.com/apache/datafusion/issues/22411
### Describe the problem

The logical optimizer reruns every rule on each optimizer pass, even when a rule already returned `Transformed::no` for the current logical plan and no later rule has changed that plan before the next invocation of the same rule.

This adds avoidable planning cost on queries that need multiple optimizer passes. ClickBench queries are a good example: many converge only after a second pass, but most rules in that pass see the exact same input they already reported as unchanged.

### Proposed improvement

Track a cheap logical plan version during optimization. Increment the version whenever a rule reports `transformed = true`. If a rule previously returned `Transformed::no` at the current plan version, skip rerunning it until some rule changes the plan.

This keeps the existing optimizer fixed-point behavior while avoiding repeated no-op work for unchanged rule inputs.

### Local measurement

Using a temporary local ClickBench logical planning probe over all 51 ClickBench queries, 30 iterations:

- Baseline optimizer-only: 607.259 us/query
- Patched optimizer-only: 409.473 us/query and 423.034 us/query
- Baseline parse/analyze/optimize: 1001.498 us/query
- Patched parse/analyze/optimize: 822.877 us/query and 844.032 us/query

The patch preserves optimizer behavior and only skips rules after a prior `Transformed::no` for the same unchanged plan version.
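
### Illustrative sketch

The plan-version idea from the proposal could look roughly like the toy below. This is not the actual patch and does not use DataFusion's types: `Plan`, `RuleFn`, `optimize`, and the three sample rules are hypothetical stand-ins, and each rule returns a `bool` in place of the `Transformed::yes` / `Transformed::no` distinction.

```rust
/// Hypothetical stand-in for a logical plan (NOT DataFusion's `LogicalPlan`).
type Plan = Vec<i64>;

/// Hypothetical rule signature: returns the plan plus a "transformed" flag.
type RuleFn = fn(Plan) -> (Plan, bool);

/// Sample rule: removes zeros, reports whether anything was removed.
fn drop_zeros(plan: Plan) -> (Plan, bool) {
    let before = plan.len();
    let out: Plan = plan.into_iter().filter(|x| *x != 0).collect();
    let changed = out.len() != before;
    (out, changed)
}

/// Sample rule: collapses adjacent duplicates.
fn dedup_adjacent(mut plan: Plan) -> (Plan, bool) {
    let before = plan.len();
    plan.dedup();
    let changed = plan.len() != before;
    (plan, changed)
}

/// Sample rule that never changes the plan.
fn never_fires(plan: Plan) -> (Plan, bool) {
    (plan, false)
}

/// Runs `rules` to a fixed point (at most `max_passes` passes) and returns
/// the final plan together with the number of rule invocations.
/// With `skip_unchanged`, a rule that reported "not transformed" at the
/// current plan version is not rerun until some rule bumps the version.
fn optimize(
    mut plan: Plan,
    rules: &[RuleFn],
    max_passes: usize,
    skip_unchanged: bool,
) -> (Plan, usize) {
    let mut version: u64 = 0;
    // For each rule: the plan version at which it last reported no change.
    let mut unchanged_at: Vec<Option<u64>> = vec![None; rules.len()];
    let mut invocations = 0;

    for _ in 0..max_passes {
        let version_at_pass_start = version;
        for (i, rule) in rules.iter().enumerate() {
            if skip_unchanged && unchanged_at[i] == Some(version) {
                continue; // same input as last time, known no-op
            }
            invocations += 1;
            let (new_plan, changed) = rule(plan);
            plan = new_plan;
            if changed {
                version += 1; // invalidates every recorded no-op
            } else {
                unchanged_at[i] = Some(version);
            }
        }
        if version == version_at_pass_start {
            break; // a full pass changed nothing: fixed point reached
        }
    }
    (plan, invocations)
}
```

With the rules ordered `[drop_zeros, never_fires, dedup_adjacent]` and the input `vec![0, 1, 2]`, both modes produce `[1, 2]`; the baseline makes 6 rule invocations and the skipping variant makes 4, because in the second pass only `drop_zeros` has not yet seen the current version. A single integer comparison per rule is the only bookkeeping cost, which is why the proposal calls the version "cheap" compared with comparing plans structurally.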
