devinjdangelo commented on issue #7744:
URL: https://github.com/apache/arrow-datafusion/issues/7744#issuecomment-1748622800

   For 2, rather than peeking, I am thinking about extending the 
`RecordBatchStream` trait like so:
   
   ```rust
   /// Trait for types that stream [arrow::record_batch::RecordBatch]
   pub trait RecordBatchStream: Stream<Item = Result<RecordBatch>> {
       /// Returns the schema of this `RecordBatchStream`.
       ///
       /// Implementations of this trait should guarantee that all
       /// `RecordBatch`es returned by this stream have the same schema as
       /// the one returned from this method.
       fn schema(&self) -> SchemaRef;
   
       /// Returns information identifying the partition this stream belongs to.
       fn partition_info(&self) -> &PartitionInfo;
   }
   ```
   
   `PartitionInfo` would contain the information needed to identify which 
Hive-style partition the stream belongs to, and should also be general enough 
to work for non-Hive-style partitioning. Whichever execution plan partitions 
its output should provide each stream with its partitioning information.
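   
   To make the idea concrete, here is a rough sketch of what `PartitionInfo` 
might look like. Nothing below exists in DataFusion: the enum variants, the 
`relative_path` helper, the `HasPartitionInfo` stand-in trait (used so the 
sketch compiles without the `Stream` supertrait and arrow types), and 
`DummyStream` are all hypothetical names chosen for illustration.
   
   ```rust
   // Hypothetical sketch only: `PartitionInfo` is not defined anywhere yet,
   // so every variant and method here is an assumption.
   
   /// Identifies which output partition a stream belongs to.
   #[derive(Debug, Clone, PartialEq, Eq)]
   pub enum PartitionInfo {
       /// The stream is not tied to any particular partition.
       Unpartitioned,
       /// Hive-style partition: ordered (column, value) pairs.
       Hive(Vec<(String, String)>),
       /// Any other partitioning scheme, identified by an opaque key.
       Other(String),
   }
   
   impl PartitionInfo {
       /// Relative path for this partition, or `None` if unpartitioned.
       pub fn relative_path(&self) -> Option<String> {
           match self {
               PartitionInfo::Unpartitioned => None,
               PartitionInfo::Hive(cols) => Some(
                   cols.iter()
                       .map(|(k, v)| format!("{}={}", k, v))
                       .collect::<Vec<_>>()
                       .join("/"),
               ),
               PartitionInfo::Other(key) => Some(key.clone()),
           }
       }
   }
   
   /// Stand-in for the extended `RecordBatchStream` trait, reduced to the
   /// new method so the sketch has no external dependencies.
   pub trait HasPartitionInfo {
       fn partition_info(&self) -> &PartitionInfo;
   }
   
   /// Placeholder stream that only carries its partition information.
   pub struct DummyStream {
       pub info: PartitionInfo,
   }
   
   impl HasPartitionInfo for DummyStream {
       fn partition_info(&self) -> &PartitionInfo {
           &self.info
       }
   }
   ```
   
   With something like this, a writer could call `partition_info()` on each 
stream and derive the output location from it instead of inspecting the 
batches themselves.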

