adriangb commented on PR #18192:
URL: https://github.com/apache/datafusion/pull/18192#issuecomment-3447731663
Yeah that makes sense! I'm not sure how to encode that into the APIs, but
making it very explicit in the docs, etc. should be enough?
One thing I'm thinking is if there's a way to satisfy all of the input here
by making a new high level API.
An annoyance I've had with the Codec system is that it's not clear how to
inject extra behavior, it only functions as a fallback.
Something along the lines of:
```rust
pub trait Decoder {
fn decode_plan(&self, plan: PhysicalPlanNode) -> Result<Arc<dyn
ExecutionPlan>>;
fn decode_expression(&self, expression: PhysicalExprNode) ->
Result<Arc<dyn ExpressionNode>>;
}
pub struct DefaultDecoder {
ctx: TaskContext,
}
impl DefaultDecoder {
pub fn new(ctx: TaskContext) -> Self {
Self { ctx }
}
impl Decoder for DefaultDecoder {
fn decode_plan(decoder: & dyn Decoder, plan: PhysicalPlanNode) ->
Result<Arc<dyn ExecutionPlan>> {
// Essentially the code inside of
`PhysicalPlanNode::try_from_physical_plan`
// but passing around a reference to ourselves as `&dyn Decoder` so
that e.g. if we have to decode
// predicates inside of a plan it calls back into `decode_expression`
// Maybe delegates to
}
fn decode_expression(decoder: &dyn Decoder expression: ExpressionNode,
input_schema: &Schema) -> Result<Arc<dyn ExpressionNode>> {
// essentially the code inside of `parse_physical_expr` but again
passing around a reference to ourselves
}
}
```
Then it's easy to make a custom `DefaultDecoder` that injects before/after
behavior, e.g. caching or re-attaching custom bits to plans. I'd imagine we get
rid of the `Codec` bit and have defaults just error if you don't match and
handle extension/custom types yourself.
Anyway that's a half baked idea and that discussion may be a blocker for
this PR but I think it is largely unrelated to "is the deduplication worth
doing by default", I'll address that in my next comment.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]