thinkharderdev commented on pull request #1677:
URL: 
https://github.com/apache/arrow-datafusion/pull/1677#issuecomment-1022500419


   > Thank you for this proposal @thinkharderdev
   > 
   > While I am not an expert in Ballista and I will defer to others for this 
PR, I did have a few comments:
   > 
   > Using a trait that handled serializing an entire logical plan would allow 
use cases like using alternate serializers for the plan (like one could imagine 
using JSON for example), but I am not sure how it directly helps serializing 
extension points.
   > 
   > What would you think about something like the following:
   > 
   > ```rust
   > /// Describes something that knows how to serialize / deserialize the
   > /// contents of DataFusion user defined extension points
   > pub trait ExtensionSerializer {
   > 
   >     /// Serializes a [UserDefinedLogicalNode] into an opaque set of
   >     /// bytes for transport over the network
   >     fn serialize_extension_node(node: &dyn UserDefinedLogicalNode) -> 
Result<Bytes, BallistaError>;
   > 
   >     /// Deserializes a set of bytes created by
   >     /// [serialize_extension_node] into a new instance of a
   >     /// [UserDefinedLogicalNode]
   >     fn deserialize_extension_node(bytes: Bytes) -> Result<Arc<dyn 
UserDefinedLogicalNode>, BallistaError>;
   > 
   >     // TBD Similar functions for the other extension points
   > }
   > 
   > struct DefaultExtensionSerializer {
   > }
   > 
   > impl ExtensionSerializer for DefaultExtensionSerializer {
   >     fn serialize_extension_node(node: &dyn UserDefinedLogicalNode) -> 
Result<Bytes> {
   >         Err(BallistaError::NotImplemented(format!("Default serializer does 
not know how to serialize {:?}", node)))
   >     }
   > 
   >     fn deserialize_extension_node(_bytes: Bytes) -> Result<Arc<dyn 
UserDefinedLogicalNode>> {
   >         Err(BallistaError::NotImplemented("Default serializer does not 
know how to deserialize user defined extensions".to_string()))
   >     }
   > }
   > ```
   > 
   > And an instance of `&dyn ExtensionSerializer` could be passed when 
serializing plans?
   
   Yeah, this is what I tried to do originally but it didn't work out very 
well. The basic problem I was running into is that since we decoding the 
message recursively from the leaves, as soon as you "break out" of the specific 
types defined in Ballista, you can no longer use any of the decoding machinery. 
For example, if I define a logical node like:
   
   ```
   struct ReverseAllString {}
   
   impl UserDefinedLogicalNode for ReverseAllString {
     ...
   }
   ```
   
   I have to encode my `ReverseAllString` struct but also need to encode all of 
it's inputs. But I can't define the inputs using the protobuf types from 
Ballista and still hook into the machinery which converts from my serialized 
type to an actual `LogicalPlan`. At least not any way that I'm aware of (short 
of hand-rolling the binary codec). 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to