houqp commented on a change in pull request #7365: URL: https://github.com/apache/arrow/pull/7365#discussion_r436336846
########## File path: rust/arrow/src/array/builder.rs ########## @@ -577,6 +632,81 @@ where self } + /// Appends data from other arrays into the builder + /// + /// This is most useful when concatenating arrays of the same type into a builder. + fn append_data(&mut self, data: &[ArrayDataRef]) -> Result<()> { + if !check_array_data_type(&self.data_type(), data) { + return Err(ArrowError::InvalidArgumentError( + "Cannot append data to builder if data types are different".to_string(), + )); + } + // determine the latest offset on the builder + let mut cum_offset = if self.offsets_builder.len() == 0 { + 0 + } else { + // peek into buffer to get last appended offset + let buffer = self.offsets_builder.buffer.data(); + let len = self.offsets_builder.len(); + let (start, end) = ((len - 1) * 4, len * 4); + let slice = &buffer[start..end]; + i32::from_le_bytes(slice.try_into().unwrap()) + }; + for array in data { + if array.child_data().len() != 1 { Review comment: I think we can leave the memory allocation performance optimization to future PRs given that number of arrays in `data: &[ArrayDataRef]` should not be too large. But I do think we should make sure error handling behavior is consistent, i.e. invalid input should not lead to partial append to the array. If we are to use ArrayDataRef as input type, then I feel like we do need to have custom validation logic for each array type as you mentioned. The idea of using `ArrayRef` as input type is interesting. It does look like a simpler interface for end users and could simplify the error handling logic. What's the downside of using `ArrayRef` here compared to `ArrayDataRef`? ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org