askalt commented on PR #19462:
URL: https://github.com/apache/datafusion/pull/19462#issuecomment-3737420272
> It might not be as bad as it sounds. For example, what if we added some
> sort of fingerprint (maybe a hash) for EquivalenceProperties that is very fast
> to compute. That would make it simple and straightforward to check for equality
> 🤔
That makes sense to me. I think we can add a fast path for
`with_new_children(...)` when the children's properties have not changed.
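
To make the idea concrete, here is a rough, standalone sketch of such a fast path. It does not use the DataFusion types: `Props`, `CachedProps`, `Node` and `derive` are hypothetical stand-ins, and the fingerprint is just a `DefaultHasher` hash computed once at construction. Note that equal fingerprints do not strictly prove equality, so a real implementation might want a full comparison when the hashes match.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::Arc;

/// Hypothetical stand-in for `EquivalenceProperties` (the real type is richer).
#[derive(Debug, Hash)]
struct Props {
    orderings: Vec<String>,
}

/// Properties together with a fingerprint computed once, at construction.
#[derive(Debug)]
struct CachedProps {
    props: Props,
    fingerprint: u64,
}

impl CachedProps {
    fn new(props: Props) -> Self {
        let mut hasher = DefaultHasher::new();
        props.hash(&mut hasher);
        let fingerprint = hasher.finish();
        Self { props, fingerprint }
    }
}

/// Stand-in for the (potentially expensive) recomputation of a node's
/// properties from the properties of its children.
fn derive(children: &[Arc<CachedProps>]) -> Arc<CachedProps> {
    let orderings = children
        .iter()
        .flat_map(|c| c.props.orderings.clone())
        .collect();
    Arc::new(CachedProps::new(Props { orderings }))
}

/// Hypothetical plan node that caches its derived properties.
struct Node {
    children: Vec<Arc<CachedProps>>,
    own: Arc<CachedProps>,
}

impl Node {
    fn new(children: Vec<Arc<CachedProps>>) -> Self {
        let own = derive(&children);
        Self { children, own }
    }

    /// Fast path: if every child fingerprint is unchanged, reuse the cached
    /// properties instead of calling `derive` again.
    fn with_new_children(&self, new_children: Vec<Arc<CachedProps>>) -> Self {
        let unchanged = self.children.len() == new_children.len()
            && self
                .children
                .iter()
                .zip(&new_children)
                .all(|(old, new)| old.fingerprint == new.fingerprint);
        if unchanged {
            Self { children: new_children, own: Arc::clone(&self.own) }
        } else {
            Self::new(new_children)
        }
    }
}
```

In this sketch the comparison costs one `u64` check per child, so the common case of replacing children with equivalent ones skips the recomputation entirely.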
There is another problem we should address. Unfortunately, plans are currently
not reusable, partly because of the dynamic filters stored in them: a new
version of each filter should be created for every execution. For example,
suppose we use the following function to re-use plans:
```rust
fn reset_plan_states(plan: Arc<dyn ExecutionPlan>) -> Arc<dyn ExecutionPlan> {
    // Walk the plan bottom-up and reset the state of every node.
    plan.transform_up(|plan| {
        let new_plan = Arc::clone(&plan).reset_state()?;
        Ok(Transformed::yes(new_plan))
    })
    .unwrap()
    .data
}
```
Imagine that there is an `AggregateExec` in the plan. It owns a dynamic
filter that is updated when aggregation is executed. For each plan execution, a
separate instance of such a filter should be created and somehow pushed down to
the child operator so that it can filter its inputs accordingly.
In the suggested approach (this patch), this is solved by splitting filters
into two types: planning-time filters and execution-time filters. When
`execute(...)` is called, an independent version of the filter is created and
then pushed into the children through the execution state. Perhaps we could
improve `reset_state(...)` to support such pushes. It seems we cannot do it
without modifying `reset_state`, because the node that wants to poll a filter
has to obtain its new version from somewhere.
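
As a rough illustration of the split (not the code from this patch), the standalone sketch below keeps an immutable planning-time description in the plan and creates a fresh execution-time filter on every `execute` call, handing it to the child through a per-execution state. All names here (`PlannedFilter`, `ExecFilter`, `ExecState`, `Scan`, `TopAggregate`) are hypothetical, and the "aggregation" is reduced to setting a threshold.

```rust
use std::sync::{Arc, Mutex};

/// Planning-time filter: an immutable description that lives in the plan.
struct PlannedFilter {
    column: String,
}

/// Execution-time filter: mutable state, created anew for each execution.
struct ExecFilter {
    column: String,
    threshold: Mutex<Option<i64>>,
}

impl PlannedFilter {
    /// Create an independent execution-time instance of this filter.
    fn instantiate(&self) -> Arc<ExecFilter> {
        Arc::new(ExecFilter {
            column: self.column.clone(),
            threshold: Mutex::new(None),
        })
    }
}

impl ExecFilter {
    /// Called by the producer (the aggregation) as it learns a bound.
    fn update(&self, value: i64) {
        *self.threshold.lock().unwrap() = Some(value);
    }

    /// Called by the consumer (the scan) to decide whether to keep a value.
    fn passes(&self, value: i64) -> bool {
        match *self.threshold.lock().unwrap() {
            Some(t) => value >= t,
            None => true,
        }
    }
}

/// Per-execution state passed from a parent to its children.
struct ExecState {
    filters: Vec<Arc<ExecFilter>>,
}

struct Scan {
    rows: Vec<i64>,
}

impl Scan {
    fn execute(&self, state: &ExecState) -> Vec<i64> {
        self.rows
            .iter()
            .copied()
            .filter(|v| state.filters.iter().all(|f| f.passes(*v)))
            .collect()
    }
}

struct TopAggregate {
    filter: PlannedFilter,
    child: Scan,
}

impl TopAggregate {
    /// `bound` simulates what the aggregation would discover while running.
    fn execute(&self, bound: i64) -> (Arc<ExecFilter>, Vec<i64>) {
        let filter = self.filter.instantiate(); // fresh per execution
        filter.update(bound);
        let state = ExecState { filters: vec![Arc::clone(&filter)] };
        let rows = self.child.execute(&state);
        (filter, rows)
    }
}
```

Because the plan itself only holds `PlannedFilter`, two executions of the same `TopAggregate` never share filter state, which is the property the re-use scenario needs.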
The same applies to the working table of a recursive query plan: it must
somehow be re-created and pushed into the children for each plan re-execution.