adriangb commented on PR #22144:
URL: https://github.com/apache/datafusion/pull/22144#issuecomment-4464041143

   ## Split into a reviewable stack
   
   This experiment has been broken into 4 stacked PRs so each piece can be 
reviewed (and where possible, merged) on its own. Each builds on the previous; 
the diffs are cumulative against `main`, but every PR adds exactly **one new 
commit** — review that commit.
   
   1. **#22234 — `OptionalFilterPhysicalExpr` + proto** (+400)
      A transparent `PhysicalExpr` wrapper marking a filter as 
droppable-without-affecting-correctness, plus proto round-trip support. Purely 
additive, no caller — inert until something reads the marker.
   
   2. **#22235 — Per-conjunct pruning statistics** (+500/-18)
      `PruningPredicate::try_new_tagged_conjuncts` / `prune_per_conjunct` and 
the row-group / page-index variants surface per-conjunct effectiveness as a 
free side effect of the pruning pass. Existing untagged paths unchanged.
   
   3. **#22236 — `SelectivityTracker` cost model** (+2973)
      The cross-file cost model that partitions filter conjuncts into row-level 
/ post-scan / dropped buckets, with ~45 unit tests and a benchmark. Not yet 
wired into the scan.
   
   4. **#22237 — Adaptive parquet scan integration** (+1823/-624)
      Wires it all together: `AdaptiveParquetStream`, re-partitioning at 
row-group boundaries, integration with the fully-matched run splitting from 
#21637, the hash-join `OptionalFilterPhysicalExpr` wrap, and config knobs.
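
   To make the three-bucket split in item 3 concrete, here is a minimal sketch of how observed per-conjunct pass rates could drive the partition. Everything in it is an assumption for illustration: `Bucket`, `ConjunctStats`, `assign_bucket`, and both thresholds are hypothetical names and values, not the API or the cost model in #22236. The `optional` flag stands in for the marker that `OptionalFilterPhysicalExpr` provides.

   ```rust
   // Illustrative only: these types, names, and thresholds are hypothetical
   // and are NOT taken from the DataFusion PRs.

   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   enum Bucket {
       /// Evaluate while decoding, so non-matching rows are skipped early.
       RowLevel,
       /// Evaluate after the scan has materialized the batch.
       PostScan,
       /// Skip entirely; only legal for filters marked optional.
       Dropped,
   }

   #[derive(Debug, Clone)]
   struct ConjunctStats {
       name: &'static str,
       /// Stand-in for the "droppable without affecting correctness" marker.
       optional: bool,
       rows_seen: u64,
       rows_passed: u64,
   }

   impl ConjunctStats {
       /// Fraction of rows that passed; with no observations, assume the
       /// filter removes nothing.
       fn pass_rate(&self) -> f64 {
           if self.rows_seen == 0 {
               1.0
           } else {
               self.rows_passed as f64 / self.rows_seen as f64
           }
       }
   }

   /// Assumed rule: drop optional filters that pass almost everything, push
   /// highly selective filters to row level, and leave the rest post-scan.
   fn assign_bucket(c: &ConjunctStats, row_level_max: f64, drop_min: f64) -> Bucket {
       let rate = c.pass_rate();
       if c.optional && rate >= drop_min {
           Bucket::Dropped
       } else if rate <= row_level_max {
           Bucket::RowLevel
       } else {
           Bucket::PostScan
       }
   }

   fn main() {
       let conjuncts = vec![
           ConjunctStats { name: "a = 1", optional: false, rows_seen: 1000, rows_passed: 50 },
           ConjunctStats { name: "b > 0", optional: false, rows_seen: 1000, rows_passed: 990 },
           ConjunctStats { name: "join filter", optional: true, rows_seen: 1000, rows_passed: 990 },
       ];
       for c in &conjuncts {
           println!("{} -> {:?}", c.name, assign_bucket(c, 0.2, 0.95));
       }
   }
   ```

   Note that a non-optional conjunct never lands in `Dropped` in this sketch, however ineffective it is, which is the property the optional marker exists to protect.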
   
   ### Notes
   
   - Each layer compiles and passes clippy (`-D warnings`) independently.
   - **PRs 1–3 have no external dependency** and can merge on their own merits.
   - **PR 4** pins a custom `arrow-rs` branch for the push-decoder 
`StrategySwap` APIs — it cannot merge upstream until those APIs land in a 
released `arrow-rs`.
   
   This PR remains as the integration reference / discussion thread.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

