liurenjie1024 commented on code in PR #207:
URL: https://github.com/apache/iceberg-rust/pull/207#discussion_r1507451268


##########
crates/iceberg/src/scan.rs:
##########
@@ -163,6 +178,54 @@ impl TableScan {
 
         Ok(iter(file_scan_tasks).boxed())
     }
+
+    /// Transforms a stream of FileScanTasks from plan_files into a stream of
+    /// Arrow RecordBatches.
+    pub fn open(&self, mut tasks: FileScanTaskStream) -> crate::Result<ArrowRecordBatchStream> {

Review Comment:
   > > This method body's implementation is correct for one parquet file. For a stream of input files, we can use [futures::StreamExt::flat_map](https://docs.rs/futures/latest/futures/stream/trait.StreamExt.html#method.flat_map) to combine them into a stream of record batches.
   > 
   > I'm not sure what you mean. I'm achieving the same thing flat_map does by nesting the yielding of record batches inside the while loop that consumes the file scan tasks. Doing it this way with try_stream was more straightforward than getting things working with flat_map. I tried that but ran into issues 😁
   
   Oh, I see. Your approach also works, so please ignore my suggestion 🤣
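
For anyone following along, here is a minimal synchronous sketch of why the two approaches produce the same output. It uses std iterators instead of async streams, so it is only an analogy: `FileTask`, `read_batches`, `open_flat_map` and `open_nested` are hypothetical names, not iceberg-rust APIs, and the real method works on a `FileScanTaskStream` with `try_stream`.

```rust
/// Hypothetical stand-in for a file scan task: a path plus the
/// batch ids that reading the file would produce.
#[derive(Debug)]
struct FileTask {
    path: String,
    batches: Vec<u32>,
}

/// Hypothetical stand-in for reading one file's record batches.
/// Each "batch" is just a label so the ordering is easy to inspect.
fn read_batches(task: &FileTask) -> Vec<String> {
    task.batches
        .iter()
        .map(|b| format!("{}#{}", task.path, b))
        .collect()
}

/// Approach 1: flat_map each task into its batches.
fn open_flat_map(tasks: &[FileTask]) -> Vec<String> {
    tasks.iter().flat_map(|t| read_batches(t)).collect()
}

/// Approach 2: an outer loop over tasks with an inner loop that
/// emits each batch, analogous to yielding inside a while loop.
fn open_nested(tasks: &[FileTask]) -> Vec<String> {
    let mut out = Vec::new();
    for task in tasks {
        for batch in read_batches(task) {
            out.push(batch);
        }
    }
    out
}
```

Both functions visit the tasks in order and emit each task's batches before moving to the next one, so they return identical sequences. The async versions differ mainly in how errors and `.await` points are threaded through, which is where the nested-loop form tends to be easier to write.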



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

