adamreeve commented on code in PR #7111: URL: https://github.com/apache/arrow-rs/pull/7111#discussion_r2015196618
##########
parquet/src/arrow/arrow_writer/mod.rs:
##########
@@ -727,85 +819,173 @@ pub fn get_column_writers(
 ) -> Result<Vec<ArrowColumnWriter>> {
     let mut writers = Vec::with_capacity(arrow.fields.len());
     let mut leaves = parquet.columns().iter();
+    let column_factory = ArrowColumnWriterFactory::new();
     for field in &arrow.fields {
-        get_arrow_column_writer(field.data_type(), props, &mut leaves, &mut writers)?;
+        column_factory.get_arrow_column_writer(
+            field.data_type(),
+            props,
+            &mut leaves,
+            &mut writers,
+        )?;
     }
     Ok(writers)
 }

-/// Gets the [`ArrowColumnWriter`] for the given `data_type`
-fn get_arrow_column_writer(
-    data_type: &ArrowDataType,
+/// Returns the [`ArrowColumnWriter`] for a given schema and supports columnar encryption
+#[cfg(feature = "encryption")]
+fn get_column_writers_with_encryptor(

Review Comment:
   `get_column_writers` above is a `pub fn` so that users can have low-level control over column writing, e.g. see the docs at https://github.com/apache/arrow-rs/blob/4e9e1570ef28c160492891539bbc9649ec069a53/parquet/src/arrow/arrow_writer/mod.rs#L606

   That method won't work with encryption though. This new method is private because `FileEncryptor` is only `pub(crate)`, but maybe we should make it `pub` and have its signature require only `pub` types, in order to support the same use case? Or, if we can make breaking changes, modify `get_column_writers` to take a `row_group_idx` and a `&SerializedFileWriter`?

   If we go with the first approach, it could always be done later as a non-breaking change.
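
   To make the first option more concrete, here is a rough sketch using stand-in types. None of these are the real parquet crate definitions, the signatures are simplified, and the name `get_column_writers_for_row_group` is made up for illustration. The idea is only that the public signature mentions `pub` types, while the `pub(crate)` encryptor is looked up internally from the file writer:

```rust
// Stand-in types for illustration only; these are NOT the real parquet crate definitions.
pub struct WriterProperties;

pub struct SchemaDescriptor {
    pub num_columns: usize,
}

// Crate-private in the PR, so it must not appear in any public signature.
struct FileEncryptor;

// Hypothetical stand-in for the file writer, which owns the private encryptor.
pub struct SerializedFileWriter {
    encryptor: Option<FileEncryptor>,
}

impl SerializedFileWriter {
    pub fn new(with_encryption: bool) -> Self {
        let encryptor = if with_encryption { Some(FileEncryptor) } else { None };
        Self { encryptor }
    }
}

#[derive(Debug, PartialEq)]
pub struct ArrowColumnWriter {
    pub column_idx: usize,
    pub row_group_idx: Option<usize>,
    pub encrypted: bool,
}

/// Existing public entry point, left unchanged: no encryption support.
pub fn get_column_writers(
    parquet: &SchemaDescriptor,
    _props: &WriterProperties,
) -> Vec<ArrowColumnWriter> {
    (0..parquet.num_columns)
        .map(|column_idx| ArrowColumnWriter {
            column_idx,
            row_group_idx: None,
            encrypted: false,
        })
        .collect()
}

/// Option 1 (hypothetical name): a new public function whose signature only
/// uses public types. The private encryptor is taken from the file writer.
pub fn get_column_writers_for_row_group(
    parquet: &SchemaDescriptor,
    _props: &WriterProperties,
    row_group_idx: usize,
    file_writer: &SerializedFileWriter,
) -> Vec<ArrowColumnWriter> {
    let encrypted = file_writer.encryptor.is_some();
    (0..parquet.num_columns)
        .map(|column_idx| ArrowColumnWriter {
            column_idx,
            row_group_idx: Some(row_group_idx),
            encrypted,
        })
        .collect()
}
```

   Because the existing `get_column_writers` keeps its signature here, adding the second function is purely additive, which is why this route could be deferred without a breaking change.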