Re: [PR] Added List and Struct Encoding to arrow-avro Writer [arrow-rs]

via GitHub Thu, 04 Sep 2025 11:20:01 -0700


jecsand838 commented on code in PR #8274:
URL: https://github.com/apache/arrow-rs/pull/8274#discussion_r2322647719



##########
arrow-avro/src/writer/format.rs:
##########
@@ -44,24 +43,6 @@ pub trait AvroFormat: Debug + Default {
 #[derive(Debug, Default)]
 pub struct AvroOcfFormat {
     sync_marker: [u8; 16],
-    /// Optional encoder behavior hints to keep file header schema ordering
-    /// consistent with value encoding (e.g. Impala null-second).
-    encoder_options: EncoderOptions,
-}
-
-impl AvroOcfFormat {
-    /// Optional helper to attach encoder options (i.e., Impala null-second) to the format.
-    #[allow(dead_code)]
-    pub fn with_encoder_options(mut self, opts: EncoderOptions) -> Self {
-        self.encoder_options = opts;
-        self
-    }
-
-    /// Access the options used by this format.
-    #[allow(dead_code)]
-    pub fn encoder_options(&self) -> &EncoderOptions {
-        &self.encoder_options
-    }

Review Comment:
   That's correct. I realized that having encoder options created two sources 
of truth for encoder behavior: the schema and the encoder options. The schema 
should be the only source of truth; otherwise we risk writing records that 
deviate from the associated writer schema, i.e. rows in an Avro file that 
cannot be decoded using the writer schema provided in the file's header.
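
   To illustrate the point, here is a minimal sketch, not the arrow-avro 
implementation. `NullOrder`, `header_schema_json`, and `encode_nullable_long` 
are hypothetical names made up for this example. It assumes a two-branch 
nullable union and relies only on the Avro rule that a union value is written 
as the zero-based branch index (a zigzag varint long) followed by the value. 
Because both the header schema and the branch index come from the same value, 
the written bytes cannot disagree with the header.

```rust
/// Hypothetical stand-in for the branch order of a nullable union
/// in the writer schema. Not an arrow-avro type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NullOrder {
    NullFirst,  // ["null", "long"]
    NullSecond, // ["long", "null"], the Impala-style ordering
}

/// Schema text that would go into the file header (illustrative only).
fn header_schema_json(order: NullOrder) -> &'static str {
    match order {
        NullOrder::NullFirst => r#"["null","long"]"#,
        NullOrder::NullSecond => r#"["long","null"]"#,
    }
}

/// Avro `long` encoding: zigzag, then base-128 varint, low bits first.
fn write_long(out: &mut Vec<u8>, n: i64) {
    let mut v = ((n << 1) ^ (n >> 63)) as u64;
    while v >= 0x80 {
        out.push(((v & 0x7f) as u8) | 0x80);
        v >>= 7;
    }
    out.push(v as u8);
}

/// Encode an optional long. The branch index is derived from the schema's
/// union order, so there is no separate option that could contradict it.
fn encode_nullable_long(order: NullOrder, value: Option<i64>, out: &mut Vec<u8>) {
    let (null_idx, value_idx) = match order {
        NullOrder::NullFirst => (0, 1),
        NullOrder::NullSecond => (1, 0),
    };
    match value {
        None => write_long(out, null_idx),
        Some(v) => {
            write_long(out, value_idx);
            write_long(out, v);
        }
    }
}
```

   With a separate options struct, the header could declare `["null","long"]` 
while the encoder wrote null-second branch indices. A reader following the 
header would then misinterpret every nullable value.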


