askalt commented on issue #19351:
URL: https://github.com/apache/datafusion/issues/19351#issuecomment-3796239651

   > [@alamb](https://github.com/alamb) As I understand it, 
[this](https://github.com/askalt/datafusion/commit/07ebbd4d321e4753cf566c40192b124654ff7455)
 implementation doesn't seem convincing because it introduces too many changes 
to the trait's public API. I would still like to extract runtime state from 
`ExecutionPlan` one way or another. In our project, we actively reuse pre-built 
plans, and the current `reset_plan_states` implementation doesn't allow this if 
the plan contains dynamic filters or a recursive subquery.
   > 
   > To solve these problems, I thought about 
[updating](https://github.com/apache/datafusion/issues/19905) the dynamic 
filters in all nodes before reusing a plan. I don't like this solution because 
it doesn't scale: what if new expressions requiring updates before re-execution 
are added? In that case, similar ad-hoc manipulations would be needed for each 
of them. For example, we also actively use placeholders (query params) in 
physical plans; in the current design, updating placeholders forces a rebuild 
of all plan nodes.
   > 
   > I believe that current design (store state directly in plans) is not 
friendly for re-using. If we consider how this is done, for example in 
PostgreSQL, the state of plans and expressions needed for execution is stored 
separately from the planned expressions themselves. This allows re-using plans 
and their expressions without the need for constant recreation. For a 
reference, see the 
[function](https://github.com/postgres/postgres/blob/9b9eaf08ab2dc22c691b22e59f1574e0f1bcc822/src/backend/executor/execExpr.c#L108-L143).
   > 
   > ## Possible solution
   > How do you feel about the following plan?
   > 
   > 1. Add a new method `ExecutionPlan::execute_with_state(...)`. If there is 
a downstream which is not interested in stateless-plans it should not implement 
it, a default behavior will call `ExecutionPlan::execute(...)` and by default 
plan will store its state in self as it is works now.
   > 2. Support stateless execution for plans across DF crates. Stateless means 
that metrics, futures, work tables, actual dynamic filters are taken from some 
context passed into `execute_with_state` instead of taking them from `self`.
   > 3. In the `execute_with_state` we can convert plan expressions into 
expressions that could be actually evaluated. For example, convert 
`PlaceholderExpr` into `Literal` using query params taking from the query 
execution context. Or convert planned dynamic filter to its executable version 
created by the owner (and passed also through context).
   
   I made a tiny example how it would work for `ProjectionExec`:
   
   https://github.com/askalt/datafusion/pull/4
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to