jecsand838 commented on code in PR #8274:
URL: https://github.com/apache/arrow-rs/pull/8274#discussion_r2322647719
##########
arrow-avro/src/writer/format.rs:
##########
@@ -44,24 +43,6 @@ pub trait AvroFormat: Debug + Default {
 #[derive(Debug, Default)]
 pub struct AvroOcfFormat {
     sync_marker: [u8; 16],
-    /// Optional encoder behavior hints to keep file header schema ordering
-    /// consistent with value encoding (e.g. Impala null-second).
-    encoder_options: EncoderOptions,
-}
-
-impl AvroOcfFormat {
-    /// Optional helper to attach encoder options (i.e., Impala null-second) to the format.
-    #[allow(dead_code)]
-    pub fn with_encoder_options(mut self, opts: EncoderOptions) -> Self {
-        self.encoder_options = opts;
-        self
-    }
-
-    /// Access the options used by this format.
-    #[allow(dead_code)]
-    pub fn encoder_options(&self) -> &EncoderOptions {
-        &self.encoder_options
-    }

Review Comment:
   That's correct. I realized that having encoder options created two sources of truth for encoder behavior: the schema and the encoder options. The schema should be the only source of truth; otherwise we risk writing records that deviate from the associated writer schema, i.e. rows in an Avro file that cannot be decoded using the writer schema provided in the file's header.
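
To illustrate the concern in the review comment, here is a minimal, self-contained sketch of how a nullable union is encoded in Avro's binary format: a zigzag varint branch index, followed by the value. It is not the arrow-avro implementation. `NullOrder`, `encode`, and `decode` are hypothetical names introduced only for this example, and the "schema" is reduced to the branch order of a two-branch `null`/`int` union.

```rust
/// Hypothetical stand-in for the writer schema of a nullable int:
/// `["null", "int"]` (null first) or `["int", "null"]` (null second).
#[derive(Clone, Copy, Debug, PartialEq)]
enum NullOrder {
    NullFirst,
    NullSecond,
}

impl NullOrder {
    /// Index of the `null` branch within the union.
    fn null_index(self) -> i64 {
        match self {
            NullOrder::NullFirst => 0,
            NullOrder::NullSecond => 1,
        }
    }
}

/// Avro `long`: zigzag-encode, then emit as a little-endian base-128 varint.
fn write_long(n: i64, out: &mut Vec<u8>) {
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    while z >= 0x80 {
        out.push(((z & 0x7f) as u8) | 0x80);
        z >>= 7;
    }
    out.push(z as u8);
}

/// Inverse of `write_long`; returns `None` on truncated or overlong input.
fn read_long(buf: &[u8], pos: &mut usize) -> Option<i64> {
    let mut z: u64 = 0;
    let mut shift: u32 = 0;
    loop {
        let b = *buf.get(*pos)?;
        *pos += 1;
        z |= ((b & 0x7f) as u64) << shift;
        if b & 0x80 == 0 {
            break;
        }
        shift += 7;
        if shift > 63 {
            return None;
        }
    }
    Some(((z >> 1) as i64) ^ -((z & 1) as i64))
}

/// Encode a nullable int. The branch index is derived from the schema alone.
fn encode(schema: NullOrder, value: Option<i32>, out: &mut Vec<u8>) {
    let null_idx = schema.null_index();
    match value {
        None => write_long(null_idx, out),
        Some(v) => {
            write_long(1 - null_idx, out);
            write_long(v as i64, out);
        }
    }
}

/// Decode a nullable int using the schema found in the file header.
/// The outer `Option` is `None` when the bytes cannot be decoded.
fn decode(schema: NullOrder, buf: &[u8]) -> Option<Option<i32>> {
    let mut pos = 0;
    let branch = read_long(buf, &mut pos)?;
    let null_idx = schema.null_index();
    if branch == null_idx {
        Some(None)
    } else if branch == 1 - null_idx {
        read_long(buf, &mut pos).map(|v| Some(v as i32))
    } else {
        None
    }
}
```

In this sketch, if a side-channel option made the encoder write with `NullOrder::NullSecond` while the header schema says `NullOrder::NullFirst`, then `Some(7)` would be written with branch index 0 and read back as null, and a written null would be read as the int branch with no payload and fail to decode. Passing the same schema value to both `encode` and `decode` removes that failure mode, which is the motivation for keeping the schema as the only input that decides branch order.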