laskoviymishka commented on code in PR #932:
URL: https://github.com/apache/iceberg-go/pull/932#discussion_r3125521106
##########
table/metadata_schema_compatibility.go:
##########
@@ -168,7 +168,9 @@ func (v *unknownTypeValidator) Field(field
iceberg.NestedField, fieldResult erro
if fieldResult != nil {
return fieldResult
}
- if _, isUnknown := field.Type.(iceberg.UnknownType); isUnknown {
+ _, isUnknown := field.Type.(iceberg.UnknownType)
+ _, isVariant := field.Type.(iceberg.VariantType)
+ if isUnknown || isVariant {
Review Comment:
Error says "unknown type field" when the type is Variant — confusing. Either
split into variantTypeValidator or generalize the message with
field.Type.String().
Also: please verify against the v3 spec whether required Variant is actually
disallowed. Unknown must be optional because it means "no value". Variant holds
real data — PyIceberg permits required.
##########
exprs.go:
##########
@@ -450,6 +450,8 @@ func createBoundRef(field NestedField, acc accessor)
BoundReference {
return &boundRef[Decimal]{field: field, acc: acc}
case UUIDType:
return &boundRef[uuid.UUID]{field: field, acc: acc}
+ case VariantType:
+ return &boundRef[[]byte]{field: field, acc: acc}
Review Comment:
Variant's runtime value is variant.Value, not []byte.
This works only because createBoundLiteralPredicate and
createBoundSetPredicate don't have a Variant case, so ordered/equality
predicates return an error before eval() is reached. Only IsNull/NotNull
actually bind, and those don't inspect the value. Add a comment explaining this
invariant — future readers (and contributors adding shredding) need to know why
[]byte is safe here.
##########
table/internal/parquet_files.go:
##########
@@ -573,7 +575,10 @@ func (p parquetFormat) DataFileStatsFromMeta(meta
Metadata, statsCols map[int]St
panic(err)
}
- fieldID := colMapping[colChunk.PathInSchema().String()]
+ fieldID, ok :=
colMapping[colChunk.PathInSchema().String()]
+ if !ok {
Review Comment:
This is needed for Variant's sub-columns (metadata/value) having no Iceberg
field ID, but it silently skips any column chunk missing from the mapping —
previously the zero-value fieldID = 0 would surface the problem downstream.
Narrow the skip to Variant sub-columns only, or add a comment stating this
is specifically for Variant's nested children.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]