zhuqi-lucas commented on code in PR #21580:
URL: https://github.com/apache/datafusion/pull/21580#discussion_r3123555165
##########
datafusion/physical-plan/src/sorts/sort.rs:
##########
@@ -952,14 +962,16 @@ impl SortExec {
if fetch.is_some() && is_pipeline_friendly {
cache = cache.with_boundedness(Boundedness::Bounded);
}
- let filter = fetch.is_some().then(|| {
- // If we already have a filter, keep it. Otherwise, create a new
one.
- self.filter.clone().unwrap_or_else(|| self.create_filter())
- });
let mut new_sort = self.cloned();
new_sort.fetch = fetch;
new_sort.cache = cache.into();
- new_sort.filter = filter;
+ new_sort.filter = fetch.is_some().then(|| {
+ // If we already have a filter, keep it. Otherwise, create a new
one.
+ // Must be called after setting fetch so DynamicFilter gets the K
value.
+ self.filter
Review Comment:
This is actually safe in the current implementation because cumulative prune
is guarded by is_pure_dynamic_filter — it only fires when the predicate is
purely the DynamicFilterPhysicalExpr (no WHERE
clause). And cumulative prune runs before any data is read, so row_filter
hasn't filtered any rows yet — num_rows() is accurate at this point.
If we extend this to WHERE queries in the future (e.g. with dynamic RG
pruning at runtime via #21399), we'd need to switch from static row counting to
runtime early termination.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]