jorgecarleitao commented on a change in pull request #9235:
URL: https://github.com/apache/arrow/pull/9235#discussion_r561163855



##########
File path: rust/arrow/src/buffer.rs
##########
@@ -963,11 +968,131 @@ impl MutableBuffer {
 
     /// Extends the buffer by `additional` bytes equal to `0u8`, incrementing its capacity if needed.
     #[inline]
-    pub fn extend(&mut self, additional: usize) {
+    pub fn extend_zeros(&mut self, additional: usize) {
         self.resize(self.len + additional, 0);
     }
 }
 
+/// # Safety
+/// `ptr` must be allocated for `old_capacity`.
+#[inline]
+unsafe fn reallocate(
+    ptr: NonNull<u8>,
+    old_capacity: usize,
+    new_capacity: usize,
+) -> (NonNull<u8>, usize) {
+    let new_capacity = bit_util::round_upto_multiple_of_64(new_capacity);
+    let new_capacity = std::cmp::max(new_capacity, old_capacity * 2);
+    let ptr = memory::reallocate(ptr, old_capacity, new_capacity);
+    (ptr, new_capacity)
+}
+
+impl<A: ArrowNativeType> Extend<A> for MutableBuffer {
+    #[inline]
+    fn extend<T: IntoIterator<Item = A>>(&mut self, iter: T) {
+        let iterator = iter.into_iter();
+        self.extend_from_iter(iterator)
+    }
+}
+
+impl MutableBuffer {
+    #[inline]
+    fn extend_from_iter<T: ArrowNativeType, I: Iterator<Item = T>>(
+        &mut self,
+        mut iterator: I,
+    ) {
+        let size = std::mem::size_of::<T>();
+
+        // this is necessary because of https://github.com/rust-lang/rust/issues/32155
+        let (mut ptr, mut capacity, mut len) = (self.data, self.capacity, self.len);
+        let mut dst = unsafe { ptr.as_ptr().add(len) as *mut T };
+
+        while let Some(item) = iterator.next() {
+            if len + size >= capacity {
+                let (lower, _) = iterator.size_hint();
+                let additional = (lower + 1) * size;
+                let (new_ptr, new_capacity) =
+                    unsafe { reallocate(ptr, capacity, len + additional) };

Review comment:
       Thanks a lot. I agree that this is unsound at the moment.
   
   Note that arrow does not support complex structs in its buffers, which means 
that we never need to call `drop` on the elements. Given that, do we still need 
a valid `len`? My understanding was that `len` was only needed because we might 
have to call `drop` on the elements up to `len`.
   
   With `SetLenOnDrop`, we hold a mutable borrow of `self.len`, which won't 
allow us to call `self.reserve` inside the `if`. Could this be the reason why 
`SetLenOnDrop` is not used in `extend_desugared` (which, as I understand it, 
is what this part of the code reproduces)?
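   
   To make the borrow question concrete, here is a minimal sketch. It is not 
the PR's code: `ToyBuffer` and `extend_u32` are hypothetical stand-ins for 
`MutableBuffer`, backed by a zero-initialized `Vec<u8>` so that no `unsafe` is 
needed, and the guard is only modeled on the private `SetLenOnDrop` helper in 
the standard library's `Vec`. It reserves once up front, writes under the 
guard, and falls back to a plain loop for anything beyond the size hint.

```rust
/// Hypothetical guard, modeled on the private `SetLenOnDrop` helper used by
/// the standard library's `Vec`. It is not part of arrow's API.
struct SetLenOnDrop<'a> {
    len: &'a mut usize,
    local_len: usize,
}

impl<'a> SetLenOnDrop<'a> {
    fn new(len: &'a mut usize) -> Self {
        let local_len = *len;
        SetLenOnDrop { len, local_len }
    }
}

impl Drop for SetLenOnDrop<'_> {
    fn drop(&mut self) {
        // Runs on normal exit and during unwinding, so `len` is written back
        // even if the iterator panics halfway through.
        *self.len = self.local_len;
    }
}

/// Toy stand-in for `MutableBuffer` (assumption: `data.len()` plays the role
/// of the capacity and the storage is always zero-initialized).
struct ToyBuffer {
    data: Vec<u8>,
    len: usize,
}

impl ToyBuffer {
    fn new() -> Self {
        ToyBuffer { data: Vec::new(), len: 0 }
    }

    /// Grows the storage so that `additional` more bytes fit after `len`.
    fn reserve(&mut self, additional: usize) {
        let required = self.len + additional;
        if required > self.data.len() {
            let new_capacity = std::cmp::max(required, self.data.len() * 2);
            self.data.resize(new_capacity, 0);
        }
    }

    fn extend_u32<I: Iterator<Item = u32>>(&mut self, mut iterator: I) {
        const SIZE: usize = std::mem::size_of::<u32>();

        // Reserve the lower bound of the size hint before taking the guard.
        let (lower, _) = iterator.size_hint();
        self.reserve(lower * SIZE);
        {
            // `data` and `len` are disjoint fields, so both can be mutably
            // borrowed at the same time.
            let data = &mut self.data;
            let mut guard = SetLenOnDrop::new(&mut self.len);
            // Check the capacity before pulling an item, so nothing is lost
            // when the reserved space runs out.
            while guard.local_len + SIZE <= data.len() {
                match iterator.next() {
                    Some(item) => {
                        let end = guard.local_len + SIZE;
                        data[guard.local_len..end].copy_from_slice(&item.to_le_bytes());
                        guard.local_len = end;
                    }
                    None => break,
                }
                // Calling `self.reserve(SIZE)` here is rejected by the borrow
                // checker: `guard` still holds `&mut self.len`.
            }
        } // `guard` is dropped here and `self.len` is updated.

        // Remaining items (beyond the size hint): no guard, so `reserve`
        // can be called freely.
        for item in iterator {
            self.reserve(SIZE);
            let end = self.len + SIZE;
            self.data[self.len..end].copy_from_slice(&item.to_le_bytes());
            self.len = end;
        }
    }
}
```

   In this sketch the guard only covers the part of the loop that cannot 
reallocate, which is one way around the conflict. Whether that split is worth 
it for buffers whose elements never need `drop` is exactly the open question.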
   





