alamb commented on a change in pull request #505:
URL: https://github.com/apache/arrow-rs/pull/505#discussion_r660128501



##########
File path: arrow/src/array/array.rs
##########
@@ -198,10 +198,14 @@ pub trait Array: fmt::Debug + Send + Sync + JsonEqual {
     }
 
     /// Returns the total number of bytes of memory occupied by the buffers owned by this array.

Review comment:
       ```suggestion
       /// Returns the total number of bytes of memory pointed to by this array.
       /// The buffers store bytes in the Arrow memory format, and include the data as well as the validity map.
   ```
   
   The distinction between `buffers` and `physically occupied` has always been 
somewhat confusing to me. Perhaps we can take this opportunity to clarify what 
they mean.
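
To make the distinction concrete, here is a minimal sketch using a toy type rather than the real Arrow structs. `ToyArray`, `buffer_memory_size` and `array_memory_size` are hypothetical names invented for illustration; the sketch only assumes that "buffer size" means the heap bytes for data plus validity map, and that "physically occupied" adds the struct holding the pointers.

```rust
use std::mem;

/// Hypothetical stand-in for an Arrow array: two heap buffers plus metadata.
struct ToyArray {
    len: usize,
    /// Data buffer, 8 bytes per element (as for an Int64 array).
    values: Box<[u8]>,
    /// Validity map, one bit per element, rounded up to whole bytes.
    validity: Box<[u8]>,
}

impl ToyArray {
    fn new(len: usize) -> Self {
        ToyArray {
            len,
            values: vec![0u8; len * 8].into_boxed_slice(),
            validity: vec![0xFFu8; (len + 7) / 8].into_boxed_slice(),
        }
    }

    fn len(&self) -> usize {
        self.len
    }

    /// Bytes held in the heap buffers only (data + validity map).
    fn buffer_memory_size(&self) -> usize {
        self.values.len() + self.validity.len()
    }

    /// Buffer bytes plus the struct that stores the pointers and lengths.
    fn array_memory_size(&self) -> usize {
        self.buffer_memory_size() + mem::size_of_val(self)
    }
}
```

In this model the two numbers always differ by exactly `size_of::<ToyArray>()`, which is the "overhead" part that the buffer-only figure leaves out.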

##########
File path: arrow/src/array/array.rs
##########
@@ -661,4 +666,63 @@ mod tests {
             null_array.data().buffers()[0].len()
         );
     }
+
+    #[test]
+    fn test_memory_size_primitive() {
+        let arr = PrimitiveArray::<Int64Type>::from_iter_values(0..128);
+        let empty =
+            PrimitiveArray::<Int64Type>::from(ArrayData::new_empty(arr.data_type()));
+
+        // subtract empty array to avoid magic numbers for the size of additional fields
+        assert_eq!(
+            arr.get_array_memory_size() - empty.get_array_memory_size(),

Review comment:
       this is a cool calculation 👍 
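
For readers unfamiliar with the trick, a minimal sketch of the same idea on a toy type: measure an empty instance and subtract it, so the fixed struct overhead cancels and only the payload bytes remain. `ToyInt64Array`, `memory_size` and `payload_bytes` are hypothetical names, not part of arrow-rs, and real Arrow buffers may round their capacity up, which this model ignores.

```rust
use std::mem;

/// Hypothetical stand-in for a primitive array of i64 values.
struct ToyInt64Array {
    values: Box<[i64]>,
}

impl ToyInt64Array {
    fn from_iter_values(iter: impl IntoIterator<Item = i64>) -> Self {
        ToyInt64Array {
            values: iter.into_iter().collect::<Vec<_>>().into_boxed_slice(),
        }
    }

    fn empty() -> Self {
        Self::from_iter_values(std::iter::empty())
    }

    /// Struct overhead plus the heap bytes of the value slice.
    fn memory_size(&self) -> usize {
        mem::size_of_val(self) + mem::size_of_val(&*self.values)
    }
}

/// Payload bytes of `arr`, with the fixed struct overhead cancelled out
/// by subtracting the size of an empty array.
fn payload_bytes(arr: &ToyInt64Array) -> usize {
    arr.memory_size() - ToyInt64Array::empty().memory_size()
}
```

Calling `payload_bytes` on an array built from `0..128` yields `128 * size_of::<i64>()` without the assertion having to hard-code the size of the struct fields.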

##########
File path: arrow/src/array/array.rs
##########
@@ -198,10 +198,14 @@ pub trait Array: fmt::Debug + Send + Sync + JsonEqual {
     }
 
     /// Returns the total number of bytes of memory occupied by the buffers owned by this array.
-    fn get_buffer_memory_size(&self) -> usize;
+    fn get_buffer_memory_size(&self) -> usize {
+        self.data_ref().get_buffer_memory_size()
+    }
 
     /// Returns the total number of bytes of memory occupied physically by this array.

Review comment:
       ```suggestion
       /// Returns the total number of bytes of memory occupied physically by this array.
       /// This value will always be greater than the value returned by `get_buffer_memory_size()` and
       /// includes the overhead of the data structures that contain the pointers to the various buffers.
   ```

##########
File path: arrow/src/array/data.rs
##########
@@ -354,12 +354,7 @@ impl ArrayData {
 
     /// Returns the total number of bytes of memory occupied physically by this [ArrayData].
     pub fn get_array_memory_size(&self) -> usize {
-        let mut size = 0;
-        // Calculate size of the fields that don't have [get_array_memory_size] method internally.
-        size += mem::size_of_val(self)
-            - mem::size_of_val(&self.buffers)

Review comment:
       since `child_data` and `null_bitmap` include the size of `self` in the 
results of `bitmap.get_array_memory_size()` and `child.get_array_memory_size()` 
I think we still need to subtract them off. 
   
   Perhaps a pattern such as 
   
   ```rust
           if let Some(bitmap) = &self.null_bitmap {
               size += bitmap.get_array_memory_size();
           }
           for child in &self.child_data {
               size += child.get_array_memory_size();
           }
   ```
   would make the intent clearer

##########
File path: arrow/src/array/data.rs
##########
@@ -354,12 +354,7 @@ impl ArrayData {
 
     /// Returns the total number of bytes of memory occupied physically by this [ArrayData].
     pub fn get_array_memory_size(&self) -> usize {
-        let mut size = 0;
-        // Calculate size of the fields that don't have [get_array_memory_size] method internally.
-        size += mem::size_of_val(self)
-            - mem::size_of_val(&self.buffers)

Review comment:
       since `child_data` and `null_bitmap` include the size of `self` in the 
results of `bitmap.get_array_memory_size()` and `child.get_array_memory_size()` 
I think we still need to subtract them off. 
   
   Perhaps a pattern such as *edited*
   
   ```rust
           if let Some(bitmap) = &self.null_bitmap {
               size += bitmap.get_array_memory_size();
               // `bitmap` is already a reference, so pass it directly
               size -= mem::size_of_val(bitmap);
           }
   ```
   would make the intent clearer
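
A minimal sketch of why the subtraction matters, on toy types rather than the real `ArrayData` and `Bitmap`: when a field is stored inline, `size_of_val(self)` already covers that field's struct, so adding the field's own full size would count the struct twice. `ToyBitmap`, `ToyData` and `memory_size` are hypothetical names used only for illustration.

```rust
use std::mem;

/// Hypothetical bitmap whose reported size includes its own struct.
struct ToyBitmap {
    bits: Box<[u8]>,
}

impl ToyBitmap {
    fn memory_size(&self) -> usize {
        mem::size_of_val(self) + self.bits.len()
    }
}

/// Hypothetical array data holding the bitmap inline.
struct ToyData {
    null_bitmap: Option<ToyBitmap>,
}

impl ToyData {
    fn memory_size(&self) -> usize {
        // Covers every inline field, including the `ToyBitmap` struct itself.
        let mut size = mem::size_of_val(self);
        if let Some(bitmap) = &self.null_bitmap {
            size += bitmap.memory_size();
            // Remove the bitmap struct, already counted in `size_of_val(self)`.
            size -= mem::size_of_val(bitmap);
        }
        size
    }
}
```

With an 8-byte bitmap the result is `size_of::<ToyData>() + 8`; without the subtraction it would be larger by `size_of::<ToyBitmap>()`.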





