zhuqi-lucas opened a new issue, #23216:
URL: https://github.com/apache/datafusion/issues/23216

   ## Summary
   
   Today page-level pruning in Parquet (`opener/mod.rs:1314` → 
`PagePruningPredicate::prune_plan_with_page_index_and_metrics`) runs **once at 
file open** with the static query predicate. #22450 added dynamic RG-level 
pruning at every RG boundary (`should_prune` in `push_decoder.rs:183`), but its 
rebuild path never re-evaluates the page-level predicate.
   
   This issue extends #22450's "refresh at RG boundary" pattern to **also 
refresh the `PagePruningPredicate`**, so the page-level `RowSelection` of 
upcoming RGs is tightened by the latest TopK threshold.
   
   ## Current state (source-confirmed)
   
   | Prune type | Where | Data | Dynamic? |
   |---|---|---|---|
   | RG-level (#22450) | `push_decoder.rs:183 should_prune` (RG boundary) | RG metadata min/max | ✅ rebuilt every RG boundary |
   | **Page-level** | `opener/mod.rs:1314` (**file open only**) | page index | ❌ snapshot at file open |
   | Row-level (RowFilter) | per batch | filter column values | ✅ reads latest threshold |
   
   Gap: after #22450, RG-level pruning is dynamic but page-level pruning is 
still static. If the TopK heap tightens after file open, surviving RGs keep 
their initial (loose) page-level `RowSelection`: pages whose min/max no longer 
survive the new threshold are still fetched, decompressed, and decoded for 
filter-column evaluation.
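   
   To make the gap concrete, here is a minimal, self-contained sketch. It 
assumes a TopK filter of the form `col < threshold`; `PageStats`, 
`example_pages`, and `surviving_pages` are hypothetical names for illustration 
only, not DataFusion or arrow-rs APIs. It compares the pages kept under the 
threshold snapshotted at file open with the pages kept under a tighter, later 
threshold.

```rust
/// Hypothetical stand-in for one page's min/max taken from the page index.
#[derive(Debug, Clone, Copy)]
struct PageStats {
    min: i64,
    max: i64,
}

/// Illustrative page statistics for a single row group (made-up values).
fn example_pages() -> Vec<PageStats> {
    vec![
        PageStats { min: 0, max: 30 },
        PageStats { min: 35, max: 60 },
        PageStats { min: 50, max: 90 },
        PageStats { min: 120, max: 150 },
    ]
}

/// Indices of pages that may still contain rows with `col < threshold`.
/// A page can only match if its minimum is below the threshold.
fn surviving_pages(pages: &[PageStats], threshold: i64) -> Vec<usize> {
    pages
        .iter()
        .enumerate()
        .filter(|(_, p)| p.min < threshold)
        .map(|(i, _)| i)
        .collect()
}

fn main() {
    let pages = example_pages();
    // Selection computed once at file open, with a loose threshold.
    let at_open = surviving_pages(&pages, 100);
    // Selection that the latest (tighter) threshold would allow.
    let latest = surviving_pages(&pages, 40);

    for &i in &at_open {
        if !latest.contains(&i) {
            println!(
                "page {i} [{}, {}] is dead under the new threshold but still selected",
                pages[i].min, pages[i].max
            );
        }
    }
}
```

   In this toy model the page with range [50, 90] passes the file-open 
snapshot but cannot match once the threshold drops to 40, which is the kind of 
page the static selection keeps reading today.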
   
   ## Proposal
   
   At every RG boundary (`PushDecoderStreamState::transition`):
   
   1. `tracker.changed()` — same single atomic load #22450 uses
   2. If changed: rebuild a fresh `PagePruningPredicate` from the latest filter
   3. Walk the remaining RGs in the access plan; refine each `RowSelection` via 
`prune_plan_with_page_index_and_metrics`
   4. Apply via existing `into_builder() → with_row_groups(...) → build()`
   
   Errors fall back to "keep current selection" (mirrors `should_prune`).
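   
   The steps above can be sketched as a toy model. Everything here is an 
assumption made for illustration: `DynamicThreshold`, `Tracker`, 
`RowGroupPlan`, `refine`, and `on_row_group_boundary` are hypothetical names, 
the filter is simplified to `col < threshold`, and a per-page `bool` stands in 
for a real `RowSelection`. It does not call the actual DataFusion or arrow-rs 
APIs.

```rust
use std::sync::atomic::{AtomicI64, AtomicU64, Ordering};

/// Hypothetical dynamic filter `col < threshold`, updated by TopK.
struct DynamicThreshold {
    threshold: AtomicI64,
    generation: AtomicU64,
}

impl DynamicThreshold {
    fn new(threshold: i64) -> Self {
        Self { threshold: AtomicI64::new(threshold), generation: AtomicU64::new(0) }
    }

    /// Publish a tighter threshold and bump the generation counter.
    fn tighten(&self, threshold: i64) {
        self.threshold.store(threshold, Ordering::Release);
        self.generation.fetch_add(1, Ordering::Release);
    }
}

/// Hypothetical change tracker: one atomic load per check.
#[derive(Default)]
struct Tracker {
    last_seen: u64,
}

impl Tracker {
    fn changed(&mut self, filter: &DynamicThreshold) -> bool {
        let current = filter.generation.load(Ordering::Acquire);
        let changed = current != self.last_seen;
        self.last_seen = current;
        changed
    }
}

/// Simplified per-RG plan: page minima plus a per-page "selected" flag.
struct RowGroupPlan {
    page_mins: Vec<i64>,
    selected: Vec<bool>,
}

/// Refine one plan. Pages can only be deselected, never re-selected.
/// Validation happens before any mutation, so an error leaves the plan as-is.
fn refine(plan: &mut RowGroupPlan, threshold: i64) -> Result<(), String> {
    if plan.page_mins.len() != plan.selected.len() {
        return Err("page index does not match selection".to_string());
    }
    for (sel, min) in plan.selected.iter_mut().zip(&plan.page_mins) {
        *sel = *sel && *min < threshold;
    }
    Ok(())
}

/// Called at each RG boundary; returns how many remaining RGs were refined.
fn on_row_group_boundary(
    tracker: &mut Tracker,
    filter: &DynamicThreshold,
    remaining: &mut [RowGroupPlan],
) -> usize {
    // Step 1: skip all work if the filter has not changed.
    if !tracker.changed(filter) {
        return 0;
    }
    // Step 2: snapshot the latest threshold (stands in for rebuilding the predicate).
    let threshold = filter.threshold.load(Ordering::Acquire);
    // Step 3: refine every remaining RG; on error, keep the current selection.
    remaining
        .iter_mut()
        .filter_map(|plan| refine(plan, threshold).ok())
        .count()
}

fn main() {
    let filter = DynamicThreshold::new(100);
    let mut tracker = Tracker::default();
    let mut remaining = vec![RowGroupPlan {
        page_mins: vec![0, 35, 50],
        selected: vec![true, true, true],
    }];

    filter.tighten(40);
    let refined = on_row_group_boundary(&mut tracker, &filter, &mut remaining);
    println!("refined {refined} RG(s), selection = {:?}", remaining[0].selected);
}
```

   Step 4 (applying the refined selection through the decoder builder) is 
left out of the sketch because it depends on the arrow-rs API question raised 
under the open design questions.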
   
   ## Expected wins
   
   Saves filter-column **IO + decompress + decode** for individual dead pages — 
extends #22450's "chip away Layer B residue" philosophy from RG to page 
granularity.
   
   Most useful when:
   - RGs are large (many pages each)
   - Threshold tightens significantly mid-scan (e.g. after first few RGs fill 
the heap)
   - Page index is enabled (prerequisite — without it, no-op)
   
   ## Prerequisites
   
   - `datafusion.execution.parquet.enable_page_index = true`
   - Filter column present in file schema
   - Predicate chain contains a `DynamicFilter` (TopK source)
   
   ## Open design questions
   
   1. **Refresh frequency**: every RG boundary, or only when 
`tracker.changed()` returns true?
   2. **Granularity**: refresh access plan for *all* surviving RGs, or only the 
next one to be touched?
   3. **arrow-rs API gap**: does the existing `with_row_groups(...)` path 
accept an updated per-RG `RowSelection`, or do we need a new arrow-rs API hook? 
(May overlap with arrow-rs#10158 territory.)
   4. **Stretch goal · mid-RG refresh**: refresh *between* pages of the same 
RG, not just at RG boundary. Needs a brand-new arrow-rs "mid-RG predicate 
adapt" callback hook.
   
   ## Related
   
   - #22450 — RG-level dynamic prune (the foundation this extends)
   - #23067 — Per-RG `fully_matched` RowFilter skip
   - arrow-rs#10158 — `peek_next_row_group` (related rebuild surface)
   - arrow-rs#9937 — Page-level reverse iteration (independent but adjacent)
   
   Part of the Sort Pushdown EPIC #23036, future direction.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

