askalt commented on issue #19351: URL: https://github.com/apache/datafusion/issues/19351#issuecomment-3858217250
> > > I agree that it would be great to have a clear split between a physical execution plan tree and the execution state. I also agree that calling `execute()` essentially produces an implicit tree of execution state via streams, but I don't think the API is good or clear, and e.g. the interaction with dynamic filters is poor. I'm not sure what the best way to solve this is.
> >
> > Could you please check this method?
> > https://github.com/askalt/datafusion/blob/askalt/stateless-plan/datafusion/physical-plan/src/state.rs#L175-L207
> > The idea here is the following: the separate state tree stores execution versions of the dynamic filters (they are split into planning and execution versions) and the node owning the filter (e.g., `HashJoinExec`) creates an execution version in `ExecutionPlan::execute(...)` attaching it to the passed node state. Then, dynamic filter poller (e.g., parquet reader) could take it from the state, inspecting the state tree. What do you think about this API?
>
> Hmm that seems a bit kludgy and brittle - especially traversing the parent tree for each filter and matching by equality. I suspect it would be easier to add the right interior mutability to `DynamicFilterPhysicalExpr` to go from planning -> execution while the outer Arc pointer stays the same so both the producer and consumer can just keep their copy / clone it if needed for their transition from planning -> execution.

It would be good to be able to call `.execute(...)` concurrently on the same plan and acquire streams that use different instances of the dynamic filter. In that case we cannot keep the same `Arc`; we need to somehow share a **particular execution version** among the nodes.
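To make the concern concrete, here is a minimal std-only sketch of what "sharing a particular execution version" could look like. It is not DataFusion code: `PlanFilter`, `ExecFilter`, `ExecutionState`, `run_producer`, and `run_consumer` are hypothetical names invented for this illustration, and the real state tree in the linked branch may be organized quite differently. The only point is that the planning-time filter stays immutable and shared, while each execution looks up its own filter instance through a per-execution state object.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical planning-time filter: immutable, shared by every execution.
struct PlanFilter {
    id: usize,
}

/// Hypothetical execution-time filter: holds what one run has learned so far.
#[derive(Default)]
struct ExecFilter {
    bound: Mutex<Option<i64>>,
}

/// Hypothetical per-execution state: maps a planning filter to this run's instance.
#[derive(Default)]
struct ExecutionState {
    filters: Mutex<HashMap<usize, Arc<ExecFilter>>>,
}

impl ExecutionState {
    /// Returns this execution's version of `plan`, creating it on first use.
    fn filter_for(&self, plan: &PlanFilter) -> Arc<ExecFilter> {
        let mut filters = self.filters.lock().unwrap();
        filters.entry(plan.id).or_default().clone()
    }
}

/// Stands in for the node that owns the filter (a join build side, say).
fn run_producer(state: &ExecutionState, plan: &PlanFilter, bound: i64) {
    let filter = state.filter_for(plan);
    *filter.bound.lock().unwrap() = Some(bound);
}

/// Stands in for the node that polls the filter (a scan, say).
fn run_consumer(state: &ExecutionState, plan: &PlanFilter) -> Option<i64> {
    let filter = state.filter_for(plan);
    let bound = *filter.bound.lock().unwrap();
    bound
}

/// Executes the same plan twice concurrently, each run with its own state.
fn execute_twice() -> (Option<i64>, Option<i64>) {
    let plan = PlanFilter { id: 0 };
    thread::scope(|s| {
        let first = s.spawn(|| {
            let state = ExecutionState::default();
            run_producer(&state, &plan, 10);
            run_consumer(&state, &plan)
        });
        let second = s.spawn(|| {
            let state = ExecutionState::default();
            run_producer(&state, &plan, 20);
            run_consumer(&state, &plan)
        });
        (first.join().unwrap(), second.join().unwrap())
    })
}

fn main() {
    let (first, second) = execute_twice();
    println!("first run saw {first:?}, second run saw {second:?}");
}
```

In this sketch each run only ever observes the bound written by its own producer. With a single `Arc` mutated in place, both runs would write into the same cell and the consumer could see the other run's bound.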
