Re: [PR] Refactor arrow-avro `Decoder` to support partial decoding [arrow-rs]

via GitHub Mon, 11 Aug 2025 10:31:16 -0700


jecsand838 commented on code in PR #8100:
URL: https://github.com/apache/arrow-rs/pull/8100#discussion_r2267513571



##########
arrow-avro/src/reader/mod.rs:
##########
@@ -270,6 +271,24 @@ impl Decoder {
                 self.active_decoder = new_decoder;
             }
         }
+    }
+
+    fn apply_pending_schema_if_batch_empty(&mut self) {
+        if self.remaining_capacity != self.batch_size {
+            return;
+        }
+        self.apply_pending_schema();
+    }
+
+    /// Produce a `RecordBatch` if at least one row is fully decoded, returning
+    /// `Ok(None)` if no new rows are available.
+    pub fn flush(&mut self) -> Result<Option<RecordBatch>, ArrowError> {
+        if self.remaining_capacity == self.batch_size {
+            return Ok(None);
+        }
+        let batch = self.active_decoder.flush()?;
+        self.remaining_capacity = self.batch_size;
+        self.apply_pending_schema();

Review Comment:
   That's a good call out. I'll create a helper for:
   
   ```rust
           let batch = self.active_decoder.flush()?;
           self.remaining_capacity = self.batch_size;
   ```
   
   I'd like to keep the schema change logic completely decoupled from the block decoder/flush path for now, both to avoid confusion for future contributors and to set us up for any future `Decoder` decomposition efforts.
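
   A rough sketch of how that split could look, for illustration only. The helper name `flush_and_reset`, the `pending_schema` field, and the stand-in `RecordBatch`, `ArrowError`, and `RecordDecoder` types are assumptions made so the snippet compiles on its own; they are not the actual arrow-avro definitions. The final `Ok(Some(batch))` is also assumed, since the hunk above is cut off before the return.

   ```rust
   // Stand-in types so the sketch is self-contained (NOT the real arrow types).
   #[derive(Debug, PartialEq)]
   struct RecordBatch {
       rows: usize,
   }

   #[derive(Debug)]
   struct ArrowError;

   // Stand-in for the inner decoder that buffers fully decoded rows.
   struct RecordDecoder {
       buffered_rows: usize,
   }

   impl RecordDecoder {
       // Hand back everything buffered so far and leave the buffer empty.
       fn flush(&mut self) -> Result<RecordBatch, ArrowError> {
           let rows = std::mem::take(&mut self.buffered_rows);
           Ok(RecordBatch { rows })
       }
   }

   struct Decoder {
       active_decoder: RecordDecoder,
       // Assumed representation of a schema switch waiting to be applied.
       pending_schema: Option<RecordDecoder>,
       remaining_capacity: usize,
       batch_size: usize,
   }

   impl Decoder {
       // Hypothetical helper: flush the active decoder and reset capacity.
       // It knows nothing about schema changes.
       fn flush_and_reset(&mut self) -> Result<RecordBatch, ArrowError> {
           let batch = self.active_decoder.flush()?;
           self.remaining_capacity = self.batch_size;
           Ok(batch)
       }

       // Schema switching stays in its own method, separate from the flush path.
       fn apply_pending_schema(&mut self) {
           if let Some(new_decoder) = self.pending_schema.take() {
               self.active_decoder = new_decoder;
           }
       }

       // Public entry point: composes the two independent steps.
       pub fn flush(&mut self) -> Result<Option<RecordBatch>, ArrowError> {
           if self.remaining_capacity == self.batch_size {
               return Ok(None); // no rows decoded since the last flush
           }
           let batch = self.flush_and_reset()?;
           self.apply_pending_schema();
           Ok(Some(batch))
       }
   }
   ```

   With this shape, only `flush` knows that a schema switch follows a flush; the helper could be reused by another flush path without pulling in the schema logic.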



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
