nevi-me commented on a change in pull request #7365:
URL: https://github.com/apache/arrow/pull/7365#discussion_r436307832



##########
File path: rust/arrow/src/array/builder.rs
##########
@@ -577,6 +632,81 @@ where
         self
     }
 
+    /// Appends data from other arrays into the builder
+    ///
+    /// This is most useful when concatenating arrays of the same type into a builder.
+    fn append_data(&mut self, data: &[ArrayDataRef]) -> Result<()> {
+        if !check_array_data_type(&self.data_type(), data) {
+            return Err(ArrowError::InvalidArgumentError(
+                "Cannot append data to builder if data types are different".to_string(),
+            ));
+        }
+        // determine the latest offset on the builder
+        let mut cum_offset = if self.offsets_builder.len() == 0 {
+            0
+        } else {
+            // peek into buffer to get last appended offset
+            let buffer = self.offsets_builder.buffer.data();
+            let len = self.offsets_builder.len();
+            let (start, end) = ((len - 1) * 4, len * 4);
+            let slice = &buffer[start..end];
+            i32::from_le_bytes(slice.try_into().unwrap())
+        };
+        for array in data {
+            if array.child_data().len() != 1 {

Review comment:
       The validation only covers the data type, so we'd have to make a call on 
whether passing invalid array data should be undefined behaviour. If we took 
ArrayRef instead, we'd be certain that the data is valid; as it stands, nothing 
stops someone from manually constructing an ArrayDataRef incorrectly and passing 
it in. The check here at least gives the user feedback; without it, they would 
get a generic bounds error.
   
   I could alternatively customise the validation for different types, with 
potential allocation for both value and bitmap builders for primitive arrays. 
It becomes a slippery slope for lists and structs because those can be deeply 
nested.
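
   To make the trade-off concrete, here is a rough sketch of what a 
list-specific check could look like. It is not the code in this PR and does 
not use Arrow's types: `RawListData`, `offsets_to_bytes` and 
`validate_list_data` are hypothetical names, and the sketch assumes 
little-endian `i32` offsets with `len + 1` entries and no slicing offset.

```rust
/// Hypothetical, simplified view of the parts of list array data that matter
/// for this check. This is NOT Arrow's `ArrayData`; names are illustrative.
pub struct RawListData {
    /// Number of list slots.
    pub len: usize,
    /// Raw offsets buffer: little-endian i32 values, assumed `len + 1` entries.
    pub offsets: Vec<u8>,
    /// Length of each child array; a list is expected to have exactly one.
    pub child_lens: Vec<usize>,
}

/// Helper for building test input: encodes offsets as little-endian bytes.
pub fn offsets_to_bytes(offsets: &[i32]) -> Vec<u8> {
    offsets.iter().flat_map(|o| o.to_le_bytes().to_vec()).collect()
}

/// Decodes the raw buffer, or returns `None` if it is not a whole number of i32s.
fn decode_offsets(bytes: &[u8]) -> Option<Vec<i32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| i32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Returns a descriptive error instead of letting a later slice index panic.
pub fn validate_list_data(data: &RawListData) -> Result<(), String> {
    if data.child_lens.len() != 1 {
        return Err(format!(
            "list data must have exactly 1 child, found {}",
            data.child_lens.len()
        ));
    }
    let offsets = decode_offsets(&data.offsets)
        .ok_or_else(|| "offsets buffer is not a multiple of 4 bytes".to_string())?;
    if offsets.len() != data.len + 1 {
        return Err(format!(
            "expected {} offsets for {} slots, found {}",
            data.len + 1,
            data.len,
            offsets.len()
        ));
    }
    if offsets[0] < 0 {
        return Err("first offset is negative".to_string());
    }
    if offsets.windows(2).any(|w| w[0] > w[1]) {
        return Err("offsets are not monotonically non-decreasing".to_string());
    }
    // Safe cast: the first offset is >= 0 and the sequence never decreases.
    let last = offsets[offsets.len() - 1] as usize;
    if last > data.child_lens[0] {
        return Err(format!(
            "last offset {} exceeds child length {}",
            last, data.child_lens[0]
        ));
    }
    Ok(())
}
```

   Even this flat case needs five separate checks, and a nested list or struct 
would have to repeat them recursively for every child, which is the slippery 
slope mentioned above.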





