LantaoJin opened a new issue, #75:
URL: https://github.com/apache/datafusion-java/issues/75

   ### Is your feature request related to a problem or challenge?
   
   `SessionContext.fromProto(byte[])` accepts only DataFusion's *own* 
`LogicalPlanNode` proto. [Substrait](https://substrait.io/), the cross-engine 
logical-plan standard that DataFusion already supports through the 
[`datafusion-substrait`](https://crates.io/crates/datafusion-substrait) crate, 
has no Java-side entry point, so Java callers cannot execute a plan produced 
by a Substrait-emitting front end.
   
   ### Describe the solution you'd like
   
   Add a single new entry point on `SessionContext`:
   
   ```java
   public DataFrame fromSubstrait(byte[] planBytes);
   ```
   
   `planBytes` is a serialized `substrait.proto.Plan` message. The 
implementation deserializes it with `prost`, hands the resulting `Plan` to 
`datafusion_substrait::logical_plan::consumer::from_substrait_plan(state, 
&plan).await`, and wraps the resulting `LogicalPlan` in a `DataFrame` exactly 
the way `fromProto` does for DataFusion's native plan format. From there 
callers compose the usual `DataFrame` API:
   
   ```java
   byte[] plan = compileWithCalcite(...);   // any Substrait-emitting compiler
   try (DataFrame df = ctx.fromSubstrait(plan);
        ArrowReader reader = df.executeStream(allocator)) {
     while (reader.loadNextBatch()) { /* ... */ }
   }
   ```
   
   To avoid bloating builds for users who don't need it, the 
`datafusion-substrait` Cargo dependency lives behind a `substrait` Cargo 
feature on the `datafusion-jni` crate. The feature is on by default, so Maven 
users get Substrait support out of the box, but downstream Rust embedders who 
strip features can opt out. In a build compiled without the feature, the JNI 
handler returns a clear error when `fromSubstrait` is invoked.
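
A minimal sketch of how the feature gate could look on the Rust side. It shows only the gating pattern, not the actual `datafusion-jni` code: `handle_from_substrait`, `PlanHandle`, and `SubstraitError` are hypothetical names, and the Cargo manifest lines in the leading comment are an assumed layout. The feature-enabled branch is a stand-in for the real decode-and-plan step, which needs the DataFusion crates.

```rust
// Assumed Cargo.toml layout (illustrative, not copied from the repository):
//
//   [features]
//   default = ["substrait"]
//   substrait = ["dep:datafusion-substrait"]
//
//   [dependencies]
//   datafusion-substrait = { version = "...", optional = true }

use std::fmt;

/// Hypothetical error type that the JNI layer would turn into a Java exception.
#[derive(Debug, PartialEq)]
pub enum SubstraitError {
    /// The native library was built without the `substrait` Cargo feature.
    FeatureDisabled,
    /// Decoding or planning failed; carries the underlying message.
    Plan(String),
}

impl fmt::Display for SubstraitError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SubstraitError::FeatureDisabled => write!(
                f,
                "Substrait support is unavailable: this native library was built without the `substrait` feature"
            ),
            SubstraitError::Plan(msg) => write!(f, "failed to plan Substrait input: {msg}"),
        }
    }
}

/// Hypothetical stand-in for the planned result handed back to Java.
#[derive(Debug, PartialEq)]
pub struct PlanHandle(pub usize);

/// Compiled only when the `substrait` feature is on. A real build would decode
/// `plan_bytes` into a Substrait `Plan` and pass it to `from_substrait_plan`;
/// this placeholder only rejects empty input and records the byte length.
#[cfg(feature = "substrait")]
pub fn handle_from_substrait(plan_bytes: &[u8]) -> Result<PlanHandle, SubstraitError> {
    if plan_bytes.is_empty() {
        return Err(SubstraitError::Plan("empty plan".to_string()));
    }
    Ok(PlanHandle(plan_bytes.len()))
}

/// Compiled when the feature is off. The symbol still exists, so the Java
/// native method links, but every call fails with a descriptive error.
#[cfg(not(feature = "substrait"))]
pub fn handle_from_substrait(_plan_bytes: &[u8]) -> Result<PlanHandle, SubstraitError> {
    Err(SubstraitError::FeatureDisabled)
}
```

Keeping both variants under the same function name means the JNI export is present in either build, so a feature-less library fails with a readable message rather than a missing native symbol at call time.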
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   - DataFusion 53.1 pulls `substrait = 0.62.2` (proto schema version). The 
Rust API is stable for the consumer side: `from_substrait_plan(state: 
&SessionState, plan: &Plan) -> Result<LogicalPlan>` plus 
`serializer::deserialize_bytes(Vec<u8>) -> Result<Box<Plan>>` for the 
wire-format decode.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

