alliasgher commented on code in PR #876:
URL: https://github.com/apache/iceberg-go/pull/876#discussion_r3074607543


##########
manifest.go:
##########
@@ -1650,6 +1667,48 @@ func avroPartitionData(input map[int]any, logicalTypes 
map[int]avro.LogicalType)
        return out
 }
 
+// convertFixedValue wraps a FixedType partition value for hamba/avro's
+// union encoder. The encoder cannot dispatch a bare [N]byte array inside
+// a union (see https://github.com/hamba/avro/issues/571), but it does
+// accept the explicit {"<fixed-name>": [N]byte} form used elsewhere for
+// UUID and decimal fixed branches. Nil (the other union arm) is passed
+// through untouched.
+func convertFixedValue(v any, size int) any {
+       if v == nil {
+               return map[string]any{"null": nil}
+       }
+
+       switch b := v.(type) {
+       case []byte:
+               return map[string]any{fixedSchemaName(size): 
convertToFixedArray(padOrTruncateBytes(b, size), size)}
+       case FixedLiteral:
+               return map[string]any{fixedSchemaName(size): 
convertToFixedArray(padOrTruncateBytes([]byte(b), size), size)}

Review Comment:
   Removed — it was speculative dead code. Partition values in `map[int]any` 
reach `convertFixedValue` only via `DataFile.Partition()` or avro decode, both 
of which produce plain `[]byte`, not `FixedLiteral`. Since `FixedLiteral` is a 
named type (`type FixedLiteral []byte`), it wouldn't match `case []byte:` in 
the type switch anyway, so keeping the branch would be misleading.



##########
manifest.go:
##########
@@ -1650,6 +1667,48 @@ func avroPartitionData(input map[int]any, logicalTypes 
map[int]avro.LogicalType)
        return out
 }
 
+// convertFixedValue wraps a FixedType partition value for hamba/avro's
+// union encoder. The encoder cannot dispatch a bare [N]byte array inside
+// a union (see https://github.com/hamba/avro/issues/571), but it does
+// accept the explicit {"<fixed-name>": [N]byte} form used elsewhere for
+// UUID and decimal fixed branches. Nil (the other union arm) is passed
+// through untouched.
+func convertFixedValue(v any, size int) any {
+       if v == nil {
+               return map[string]any{"null": nil}
+       }
+
+       switch b := v.(type) {
+       case []byte:
+               return map[string]any{fixedSchemaName(size): 
convertToFixedArray(padOrTruncateBytes(b, size), size)}
+       case FixedLiteral:
+               return map[string]any{fixedSchemaName(size): 
convertToFixedArray(padOrTruncateBytes([]byte(b), size), size)}
+       }
+
+       return v
+}
+
+// unwrapFixedValue reverses [convertFixedValue]: hamba/avro surfaces a
+// nullable fixed branch as map[string]any{"<fixed-name>": [N]byte}. We
+// flatten that back into the []byte slice shape Iceberg partition data
+// uses everywhere else.
+func unwrapFixedValue(v any, size int) any {
+       if unionMap, ok := v.(map[string]any); ok {
+               if inner, ok := unionMap[fixedSchemaName(size)]; ok {
+                       v = inner
+               }
+       }
+       rv := reflect.ValueOf(v)
+       if rv.Kind() == reflect.Array && rv.Type().Elem().Kind() == 
reflect.Uint8 {

Review Comment:
   Added a comment explaining this in f896a44 — hamba/avro decodes a fixed 
union branch as a Go `[N]byte` array (not a `[]byte` slice), and `reflect` is 
the only portable way to copy out of an array whose element count is not known 
at compile time.



##########
manifest.go:
##########
@@ -1650,6 +1667,48 @@ func avroPartitionData(input map[int]any, logicalTypes 
map[int]avro.LogicalType)
        return out
 }
 
+// convertFixedValue wraps a FixedType partition value for hamba/avro's
+// union encoder. The encoder cannot dispatch a bare [N]byte array inside
+// a union (see https://github.com/hamba/avro/issues/571), but it does
+// accept the explicit {"<fixed-name>": [N]byte} form used elsewhere for
+// UUID and decimal fixed branches. Nil (the other union arm) is passed
+// through untouched.
+func convertFixedValue(v any, size int) any {
+       if v == nil {
+               return map[string]any{"null": nil}
+       }
+
+       switch b := v.(type) {
+       case []byte:
+               return map[string]any{fixedSchemaName(size): 
convertToFixedArray(padOrTruncateBytes(b, size), size)}
+       case FixedLiteral:
+               return map[string]any{fixedSchemaName(size): 
convertToFixedArray(padOrTruncateBytes([]byte(b), size), size)}
+       }
+
+       return v
+}
+
+// unwrapFixedValue reverses [convertFixedValue]: hamba/avro surfaces a
+// nullable fixed branch as map[string]any{"<fixed-name>": [N]byte}. We
+// flatten that back into the []byte slice shape Iceberg partition data
+// uses everywhere else.
+func unwrapFixedValue(v any, size int) any {

Review Comment:
   Yes — it's on the manifest read path: `initializeMapData` → 
`convertAvroValueToIcebergType` → `unwrapFixedValue` is called for every entry 
read.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to