flxo opened a new issue, #431:
URL: https://github.com/apache/avro-rs/issues/431

   ## Context
   
   I'm using `SpecificSingleObjectWriter` for Avro single object encoding and 
noticed an inconsistency in behavior between 
`SpecificSingleObjectWriter::write_ref()` and 
`GenericSingleObjectWriter::write_value_ref()`.
   
   ## Current Behavior
   
   `GenericSingleObjectWriter::write_value_ref()` always writes the header on 
every call:
   
   ```rust
   pub fn write_value_ref<W: Write>(&mut self, v: &Value, writer: &mut W) -> AvroResult<usize> {
       let original_length = self.buffer.len();  // header is pre-stored in buffer
       // ...
       write_value_ref_owned_resolved(&self.resolved, v, &mut self.buffer)?;
       writer.write_all(&self.buffer)?;  // writes header + data
       self.buffer.truncate(original_length);  // resets to just header
       Ok(len)
   }
   ```
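
To make the buffer-reuse pattern above concrete, here is a minimal, self-contained sketch. It is not the avro-rs implementation: `HeaderEveryTimeWriter` and `write_payload` are hypothetical names, the payload is assumed to be already Avro-encoded, and the fingerprint is passed in rather than computed. The only part taken from the Avro specification is the header layout, which is the two-byte marker `C3 01` followed by the 8-byte little-endian CRC-64-AVRO schema fingerprint.

```rust
use std::io::{self, Write};

/// Two-byte marker that starts every Avro single-object message.
const MARKER: [u8; 2] = [0xC3, 0x01];

/// Hypothetical stand-in for a writer whose `buffer` always starts with the header.
struct HeaderEveryTimeWriter {
    buffer: Vec<u8>,
}

impl HeaderEveryTimeWriter {
    /// `fingerprint` stands for the schema's 8-byte fingerprint (assumed precomputed).
    fn new(fingerprint: [u8; 8]) -> Self {
        let mut buffer = Vec::with_capacity(MARKER.len() + fingerprint.len());
        buffer.extend_from_slice(&MARKER);
        buffer.extend_from_slice(&fingerprint);
        Self { buffer }
    }

    /// Appends the payload after the pre-stored header, writes header + payload
    /// in one `write_all`, then truncates the buffer back to just the header.
    fn write_payload<W: Write>(&mut self, payload: &[u8], writer: &mut W) -> io::Result<usize> {
        let header_len = self.buffer.len();
        self.buffer.extend_from_slice(payload);
        let len = self.buffer.len();
        let result = writer.write_all(&self.buffer);
        // Reset to the header even if the write failed, so the writer stays reusable.
        self.buffer.truncate(header_len);
        result.map(|_| len)
    }
}
```

With this shape, every call emits a complete header followed by the payload, so each message written through the same writer starts with `C3 01`.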
   
   However, `SpecificSingleObjectWriter` has a `header_written: bool` field and 
previously only wrote the header on the first call to `write_ref()`:
   
   ```rust
   pub fn write_ref<W: Write>(&mut self, data: &T, writer: &mut W) -> AvroResult<usize> {
       if !self.header_written {
           bytes_written += writer.write(self.inner.buffer.as_slice())?;
           self.header_written = true;
       }
       // ... serialize data without header on subsequent calls
   }
   ```
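
A small sketch of what the header-once behavior means for a consumer that receives each message separately. Again this is an illustration under assumptions, not avro-rs code: `HeaderOnceWriter`, `write_payload`, and `has_single_object_header` are hypothetical names, and the payload bytes are assumed to be already encoded.

```rust
use std::io::{self, Write};

/// Marker bytes defined by the Avro single-object encoding.
const MARKER: [u8; 2] = [0xC3, 0x01];
/// Marker (2 bytes) plus schema fingerprint (8 bytes).
const HEADER_LEN: usize = 10;

/// Hypothetical writer that mimics the `header_written` flag described above.
struct HeaderOnceWriter {
    header: [u8; HEADER_LEN],
    header_written: bool,
}

impl HeaderOnceWriter {
    fn new(fingerprint: [u8; 8]) -> Self {
        let mut header = [0u8; HEADER_LEN];
        header[..2].copy_from_slice(&MARKER);
        header[2..].copy_from_slice(&fingerprint);
        Self { header, header_written: false }
    }

    /// Writes the header only on the first call, then the payload.
    fn write_payload<W: Write>(&mut self, payload: &[u8], writer: &mut W) -> io::Result<usize> {
        let mut bytes_written = 0;
        if !self.header_written {
            writer.write_all(&self.header)?;
            bytes_written += self.header.len();
            self.header_written = true;
        }
        writer.write_all(payload)?;
        Ok(bytes_written + payload.len())
    }
}

/// Returns true if `message` starts with the marker and the expected fingerprint.
fn has_single_object_header(message: &[u8], fingerprint: [u8; 8]) -> bool {
    message.len() >= HEADER_LEN
        && message[..2] == MARKER
        && message[2..HEADER_LEN] == fingerprint
}
```

If the first and second messages go to different sinks (for example, two separate queue messages), only the first passes `has_single_object_header`; the second carries just the payload and cannot be matched to a schema on its own.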
   
   ## Questions
   
   1. **What is the intended use case for `header_written`?** 
      - Is `SpecificSingleObjectWriter` meant for streaming multiple records 
after a single header (like the object container format)?
      - Or is it meant for single-object encoding where each message should be 
independently decodable (requiring header on each message)?
   
   2. **Why the inconsistency with `GenericSingleObjectWriter`?**
      - `GenericSingleObjectWriter::write_value_ref()` writes the header on every call
      - `SpecificSingleObjectWriter::write_ref()` only wrote the header on the first call
   
   
   ## Current State
   
   I've modified `write_ref()` to always write the header (matching 
`GenericSingleObjectWriter` behavior) and removed the `header_written` field. 
This allows reusing *one* writer for multiple independent messages without the 
cost of reconstructing the writer each time.
   This is kind of a band-aid; this issue is just for clarification.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
