advancedxy commented on code in PR #8579:
URL: https://github.com/apache/iceberg/pull/8579#discussion_r1448606533


##########
format/spec.md:
##########
@@ -322,14 +322,22 @@ The `void` transform may be used to replace the transform in an existing partiti
 
 #### Bucket Transform Details
 
-Bucket partition transforms use a 32-bit hash of the source value. The 32-bit hash implementation is the 32-bit Murmur3 hash, x86 variant, seeded with 0.
+Bucket partition transforms use a 32-bit hash of the source value or source value list. The 32-bit hash implementation is the 32-bit Murmur3 hash, x86 variant, seeded with 0.
 
 Transforms are parameterized by a number of buckets [1], `N`. The hash mod `N` must produce a positive value by first discarding the sign bit of the hash value. In pseudo-code, the function is:
 
 ```
   def bucket_N(x) = (murmur3_x86_32_hash(x) & Integer.MAX_VALUE) % N
 ```
 
+When bucket transforming a list of values(a.k.a. multi-arg bucket), the input is treated as a struct. The struct fields are hashed and the hashes are combined using the same hash function. In pseudo-code, the hash function is:
+
+```
+  def murmur3_x86_hash(struct(x1, x2, ..., xn)) = hasher.put(x1).put(x2)...put(xn).hash().asInt

Review Comment:
   Answered in another thread.
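
For anyone reading the hunk without the rest of the spec at hand, here is a minimal Python sketch of the single-value `bucket_N` pseudo-code quoted above. The names `murmur3_x86_32`, `bucket_n`, and `encode_long` are hypothetical helpers, not Iceberg APIs. Encoding an integer as 8 little-endian bytes before hashing is an assumption of this sketch. The multi-arg variant proposed in this PR is deliberately left out, since its exact byte layout is what the review is still discussing.

```python
import struct

MASK32 = 0xFFFFFFFF


def _rotl32(x: int, r: int) -> int:
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_x86_32(data: bytes, seed: int = 0) -> int:
    """32-bit Murmur3 (x86 variant); returns an unsigned 32-bit integer."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    n = len(data)
    nblocks = n // 4

    # Body: mix each complete 4-byte little-endian block into the state.
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k
        h = _rotl32(h, 13)
        h = (h * 5 + 0xE6546B64) & MASK32

    # Tail: mix the remaining 1 to 3 bytes, if any.
    tail = data[4 * nblocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & MASK32
        k = _rotl32(k, 15)
        k = (k * c2) & MASK32
        h ^= k

    # Finalization: fold in the length and avalanche the bits.
    h ^= n
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def bucket_n(data: bytes, n: int) -> int:
    """(hash & Integer.MAX_VALUE) % N, with the hash seeded with 0."""
    return (murmur3_x86_32(data, seed=0) & 0x7FFFFFFF) % n


def encode_long(value: int) -> bytes:
    """Assumed encoding for this sketch: signed 64-bit, little-endian."""
    return struct.pack("<q", value)


print(bucket_n(encode_long(34), 16))
```

Masking with `0x7FFFFFFF` on the unsigned result gives the same value as `& Integer.MAX_VALUE` on Java's signed `int`, because both keep only the low 31 bits. That is why the bucket index is never negative.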



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

