Dandandan commented on code in PR #7650: URL: https://github.com/apache/arrow-rs/pull/7650#discussion_r2148839141
##########
arrow-select/src/coalesce.rs:
##########
@@ -222,122 +249,339 @@ impl BatchCoalescer {
     }
 }
 
-/// Heuristically compact `StringViewArray`s to reduce memory usage, if needed
-///
-/// Decides when to consolidate the StringView into a new buffer to reduce
-/// memory usage and improve string locality for better performance.
-///
-/// This differs from `StringViewArray::gc` because:
-/// 1. It may not compact the array depending on a heuristic.
-/// 2. It uses a precise block size to reduce the number of buffers to track.
-///
-/// # Heuristic
+/// Return a new `InProgressArray` for the given data type
+fn create_in_progress_array(data_type: &DataType, batch_size: usize) -> Box<dyn InProgressArray> {

Review Comment:
   For using it in DataFusion (https://github.com/apache/datafusion/pull/16249) this needs to be Send + Sync.
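
One possible way to satisfy the `Send + Sync` requirement in the review comment above is to declare both as supertraits, so that every `Box<dyn InProgressArray>` can cross thread boundaries. The sketch below is only an illustration: the trait methods (`push`, `len`), the `InProgressI64` type, the simplified factory signature, and the two helper functions are hypothetical stand-ins, not the PR's actual code, and it uses only the standard library.

```rust
use std::thread;

/// Hypothetical stand-in for the PR's `InProgressArray` trait (the real
/// methods are not shown in this thread). The `Send + Sync` supertraits
/// make every `dyn InProgressArray` trait object `Send + Sync`.
trait InProgressArray: Send + Sync {
    fn push(&mut self, value: i64);
    fn len(&self) -> usize;
}

/// Hypothetical implementation that buffers `i64` values.
struct InProgressI64 {
    values: Vec<i64>,
}

impl InProgressArray for InProgressI64 {
    fn push(&mut self, value: i64) {
        self.values.push(value);
    }

    fn len(&self) -> usize {
        self.values.len()
    }
}

/// Simplified factory (assumption: the real one also takes a `&DataType`
/// and dispatches on it).
fn create_in_progress_array(batch_size: usize) -> Box<dyn InProgressArray> {
    Box::new(InProgressI64 {
        values: Vec::with_capacity(batch_size),
    })
}

/// Compile-time check: this only type-checks when `T` is `Send + Sync`.
fn assert_send_sync<T: Send + Sync + ?Sized>() {}

/// Moves the boxed trait object into another thread, fills it there, and
/// returns how many values it holds. This compiles only because the trait
/// object is `Send`.
fn filled_len_on_other_thread(n: usize) -> usize {
    let mut array = create_in_progress_array(n);
    thread::spawn(move || {
        for i in 0..n {
            array.push(i as i64);
        }
        array.len()
    })
    .join()
    .expect("worker thread panicked")
}
```

An alternative with the same effect at the call site would be to leave the trait unchanged and return `Box<dyn InProgressArray + Send + Sync>` from the factory. The supertrait form has the side effect that an implementation which is not thread-safe fails to compile at its `impl` block.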