jverhoeks commented on issue #2376:
URL: https://github.com/apache/iceberg-rust/issues/2376#issuecomment-4330042368

   Update — tried two follow-up "fixes" on top of the per-task DynamicPredicate 
impl, both backfired and got reverted. Sharing the lessons so anyone working on 
the actual API in iceberg-rust doesn't have to relearn:
   
   **Attempt 1 — cap on `Predicate::Set` size in the converter**
   A `HashJoinExec` build side can produce a runtime `InListExpr` with hundreds 
of thousands of values (e.g. TPC-H q04 at SF10: ~580K orderkeys probed back 
into lineitem). Setting a cap (4096) so oversized lists fall through to a 
per-batch evaluator looked safe but was net-negative: queries with multiple 
runtime filters keep paying construction cost on the unaffected filters while 
losing the row-group pruning that the big filter was actually delivering. 
Worst-of-both-worlds.
   
   **Attempt 2 — `Arc::ptr_eq` cache keyed on 
`DynamicFilterPhysicalExpr::current()`**
   The intention was to skip retranslation when the filter hadn't changed 
between tasks. In DataFusion 53 `current()` calls `remap_children`, which in 
the join-pushdown path returns a freshly-built `Arc<dyn PhysicalExpr>` each 
call (because column indices are remapped to the probe-side schema). So 
`Arc::ptr_eq` essentially never matched, the cache never hit, and I just added 
Mutex contention across the scan's concurrent FileScanTask processors. Net SF10 
wallclock: ~+5% vs no cache.
   
   **Practical takeaway for the API design**
   
   Per-task `current()` calls hit a cost floor of the underlying `Predicate` 
construction + `bind()` + row-group eval. For multi-join queries with many 
filters above one big scan, that floor is non-trivial.
   
   Two ways to lower it without breaking caching:
   1. Expose `DynamicFilterPhysicalExpr` `generation: u64` (or any cheap 
monotonic) so consumers can skip retranslation on the no-change path.
   2. Or have the `DynamicPredicate` trait return an opaque "version" alongside 
the predicate, letting the reader make its own cache key.
   
   Either is a small extension to what's proposed in this issue, and both are 
forward-compatible. Happy to fold this into the PR draft when we get to it.
   
   (Bench evidence for the original Path B-2 working as designed and these 
failed follow-ups is in our downstream branch; can share if useful.)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to