askalt commented on issue #19351:
URL: https://github.com/apache/datafusion/issues/19351#issuecomment-3794520074
@alamb As I understand it, [this][1] implementation doesn't seem convincing
because it introduces too many changes to the trait's public API. I would still
like to extract runtime state from `ExecutionPlan` one way or another. In our
project, we actively reuse pre-built plans, and the current `reset_plan_states`
implementation doesn't allow this if the plan contains dynamic filters or a
recursive subquery.
To solve these problems, I thought about updating the dynamic filters in all
nodes before reusing a plan. I don't like this solution because it doesn't
scale: what if new expressions requiring updates before re-execution are added?
In that case, similar ad-hoc manipulations would be needed for each of them.
For example, we also actively use placeholders (query params) in physical
plans; in the current design, updating placeholders forces a rebuild of all
plan nodes.
I believe that current design (store state directly in plans) is not
friendly for re-using. If we consider how this is done, for example in
PostgreSQL, the state of plans and expressions needed for execution is stored
separately from the planned expressions themselves. This allows re-using plans
and their expressions without the need for constant recreation. For a
reference, see the [function][2].
## Possible solution
How do you feel about the following plan?
1. Add a new method `ExecutionPlan::execute_with_state(...)`. If there is a
downstream which is not interested in stateless-plans it should not implement
it, a default behavior will call `ExecutionPlan::execute(...)` and by default
plan will store its state in self as it is works now.
2. Support stateless execution for plans across DF crates. Stateless means
that metrics, futures, work tables, actual dynamic filters are taken from some
context passed into `execute_with_state` instead of taking them from `self`.
3. In the `execute_with_state` we can convert plan expressions into
expressions that could be actually evaluated. For example, convert
`PlaceholderExpr` into `Literal` using query params taking from the query
execution context. Or convert planned dynamic filter to its executable version
created by the owner (and passed also through context).
[1]:
https://github.com/askalt/datafusion/commit/07ebbd4d321e4753cf566c40192b124654ff7455
[2]:
https://github.com/postgres/postgres/blob/9b9eaf08ab2dc22c691b22e59f1574e0f1bcc822/src/backend/executor/execExpr.c#L108-L143
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]