zhuqi-lucas commented on PR #21976:
URL: https://github.com/apache/datafusion/pull/21976#issuecomment-4438274721

   Quick update — pushed all three steps. Branch is now `ensure-requirements` 
(commits `d5bd105e8` and `47cd38363` on top of the original).
   
   **Steps 1+2 (`d5bd105e8`)** — removed the disconnected `EnforceDistribution` 
/ `EnforceSorting` rule structs, and retargeted the existing tests in 
`core/tests/physical_optimizer/`:
   - `enforce_distribution.rs` — `Run::Distribution` / `Run::Sorting` branches 
both call `EnsureRequirements::new()`. Legacy run sequences kept verbatim so 
existing assertions retain their shape; the merged rule is idempotent so the 
previously-different orderings now converge.
   - `enforce_sorting.rs` — the historical `[Dist, Sort]` vs `[Sort, Dist, 
Sort]` idempotency comparison is rewritten as "running EnsureRequirements N 
times == running it once".
   - 78 snapshots refreshed; consistent pattern is `SortExec + 
CoalescePartitionsExec` (blocking) → `SortPreservingMergeExec` (streaming), 
since EnsureRequirements now applies `parallelize_sorts` + 
`replace_with_order_preserving_variants` on inputs the single-rule path didn't 
reach. Improvements rather than regressions.
   
   **Step 3 (`47cd38363`)** — went ahead and made the call: dropped 
`ensure_requirements/new_tests.rs` entirely.
   
   For the "which new-behavior cases to keep" question I raised earlier: turns 
out every test in both `new_tests.rs` and `ensure_requirements/mod.rs` inline 
tests uses either `optimize_and_sanity_check` (production-502 regression) or 
`assert_idempotent` (merged-rule core property). None are pure smoke tests. The 
real redundancy was organizational — two files with duplicated helpers and 
substantial plan-shape overlap (hash join, aggregate, window, projection, 
fetch, filter all covered in both).
   
   So I deleted `new_tests.rs` and kept the `mod.rs` inline tests as-is. All 
the named bug regressions (#14150, PR53, PR54) live there, plus a comprehensive 
sweep of plan shapes with idempotency + SanityCheckPlan-validity assertions. If 
you'd prefer a more aggressive cull (keep only `#14150` / PR53 / PR54 named 
regressions, drop the rest), happy to do that as a follow-up commit — just say 
the word.
   
   Test results after the cleanup:
   - `core_integration physical_optimizer::` 454 passed
   - `datafusion-physical-optimizer` (lib) 59 passed (down from 79 with 
`new_tests.rs` dropped)
   - `clippy --all-targets -- -D warnings` clean
   - `cargo fmt --check` clean
   
   CI is running on the push now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to