Jefffrey commented on code in PR #20154:
URL: https://github.com/apache/datafusion/pull/20154#discussion_r2771981300


##########
datafusion/common/src/hash_utils.rs:
##########
@@ -448,7 +462,13 @@ fn hash_struct_array(
     array: &StructArray,
     random_state: &RandomState,
     hashes_buffer: &mut [u64],
+    rehash: bool,
 ) -> Result<()> {
+    // Hashing for nested types follows the same initialize-vs-combine 
convention as

Review Comment:
   We don't need to repeat this explanation in each function; the top level 
explanation is sufficient



##########
datafusion/common/src/hash_utils.rs:
##########
@@ -462,9 +482,23 @@ fn hash_struct_array(
     let mut values_hashes = vec![0u64; row_len];
     create_hashes(array.columns(), random_state, &mut values_hashes)?;
 
-    for i in valid_row_indices {
-        let hash = &mut hashes_buffer[i];
-        *hash = combine_hashes(*hash, values_hashes[i]);
+    if rehash {
+        for i in valid_row_indices {
+            hashes_buffer[i] = combine_hashes(values_hashes[i], 
hashes_buffer[i]);
+        }
+    } else {
+        for i in valid_row_indices {
+            hashes_buffer[i] = values_hashes[i];
+        }
+    }
+
+    // Hash null struct values consistently with other array types

Review Comment:
   I thought other array types ignore nulls when hashing? Except for 
`DataType::Null`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to