Re: [PR] Added List and Struct Encoding to arrow-avro Writer [arrow-rs]

via GitHub Thu, 04 Sep 2025 09:32:11 -0700


jecsand838 commented on code in PR #8274:
URL: https://github.com/apache/arrow-rs/pull/8274#discussion_r2322647719



##########
arrow-avro/src/writer/format.rs:
##########
@@ -44,24 +43,6 @@ pub trait AvroFormat: Debug + Default {
 #[derive(Debug, Default)]
 pub struct AvroOcfFormat {
     sync_marker: [u8; 16],
-    /// Optional encoder behavior hints to keep file header schema ordering
-    /// consistent with value encoding (e.g. Impala null-second).
-    encoder_options: EncoderOptions,
-}
-
-impl AvroOcfFormat {
-    /// Optional helper to attach encoder options (i.e., Impala null-second) to the format.
-    #[allow(dead_code)]
-    pub fn with_encoder_options(mut self, opts: EncoderOptions) -> Self {
-        self.encoder_options = opts;
-        self
-    }
-
-    /// Access the options used by this format.
-    #[allow(dead_code)]
-    pub fn encoder_options(&self) -> &EncoderOptions {
-        &self.encoder_options
-    }

Review Comment:
   That's correct. I realized that having encoder options created two sources 
of truth for encoder behavior: the schema and the encoder options. The correct 
approach is to make the schema the only source of truth; otherwise we risk 
writing records that deviate from the associated writer schema, i.e. rows 
written to an Avro file that cannot be decoded using the writer schema 
provided in the file's header.
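
   To illustrate the point, here is a minimal sketch in which both the header 
schema and the value encoding are derived from the same schema value, so the 
two cannot disagree. All of the types and functions below (`NullOrder`, 
`NullableLongSchema`, `encode`, `write_long`) are hypothetical and simplified 
for illustration; they are not the arrow-avro API.

```rust
// Hypothetical, simplified types for illustration only; not the arrow-avro API.

/// Position of the `null` branch in a two-branch Avro union.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NullOrder {
    NullFirst,  // ["null", "long"]
    NullSecond, // ["long", "null"], the ordering described as Impala null-second
}

/// Stand-in for a writer schema: a single nullable long.
#[derive(Debug, Clone, Copy)]
struct NullableLongSchema {
    order: NullOrder,
}

impl NullableLongSchema {
    /// Schema JSON that would be written to the file header.
    fn header_json(&self) -> &'static str {
        match self.order {
            NullOrder::NullFirst => r#"["null","long"]"#,
            NullOrder::NullSecond => r#"["long","null"]"#,
        }
    }
}

/// Avro `long` encoding: zigzag, then variable-length base-128.
fn write_long(n: i64, out: &mut Vec<u8>) {
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    while z >= 0x80 {
        out.push(((z & 0x7f) as u8) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Encode one value. The union branch index comes from the schema,
/// not from a separate options struct, so it always matches the header.
fn encode(schema: &NullableLongSchema, value: Option<i64>, out: &mut Vec<u8>) {
    let (null_idx, value_idx) = match schema.order {
        NullOrder::NullFirst => (0, 1),
        NullOrder::NullSecond => (1, 0),
    };
    match value {
        None => write_long(null_idx, out),
        Some(v) => {
            write_long(value_idx, out);
            write_long(v, out);
        }
    }
}
```

   With a separate options struct, nothing would stop a caller from pairing a 
null-first header with null-second branch indices. In this sketch there is no 
such knob to set incorrectly.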



