itsjunetime commented on code in PR #6690:
URL: https://github.com/apache/arrow-rs/pull/6690#discussion_r1838650259


##########
arrow-flight/src/encode.rs:
##########
@@ -327,6 +327,10 @@ impl FlightDataEncoder {
 
     /// Encodes batch into one or more `FlightData` messages in self.queue
     fn encode_batch(&mut self, batch: RecordBatch) -> Result<()> {
+        if batch.num_rows() == 0 {

Review Comment:
   So it looks like there are two contradicting comments regarding this 
behavior:
   1. The doc comment for this function says `one or more` FlightData messages 
- implying that if we send an empty batch, it should still load one 
`FlightData` message into the queue. This is at odds with the previous 
behavior, which just used the response from `split_batch_for_grpc_response` as 
an iterator for batches to load in, but `split_batch...` returned an empty 
vector when given a `RecordBatch` with 0 rows.
   2. A comment in the arrow-flight tests 
([here](https://github.com/apache/arrow-rs/blob/3ee5048c8ea3aa531d111afe33d0a3551eabcd84/arrow-flight/tests/encode_decode.rs#L454))
 that are relevant to this change says that empty batches should not be 
encoded. If we comment out the `filter` that removes empty batches AND comment 
out this check for empty batches, all tests pass.
   
   Because there are multiple conflicting records of what is the expected 
behavior, I'm not sure what to do. I feel like we should respect the 
doc-comment, as that's what's documented to the users, but the behavior seems 
to be contrary to that, and it's also contrary to some other tests in arrow-ipc 
that always expect encoding of a Batch to return some encoded data (even if 
it's just a header).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to