mosenberg opened a new issue, #11300:
URL: https://github.com/apache/iceberg/issues/11300
### Apache Iceberg version
None
### Query engine
Spark
### Please describe the bug 🐞
The issue reproduces with the following SQL:
```sql
CREATE TABLE iceberg.NullabilityPartition (
  group STRING NOT NULL,
  val INTEGER
)
PARTITIONED BY (group)
TBLPROPERTIES (
  `format-version`=2,
  `write.parquet.compression-codec`='snappy',
  write.delete.format.default='parquet',
  write.delete.mode='merge-on-read',
  write.update.mode='merge-on-read',
  write.merge.mode='merge-on-read'
);

INSERT INTO iceberg.NullabilityPartition
SELECT * FROM VALUES ('foo', 1), ('foo', 2);
```
In the SQL above, the column `group` is declared `NOT NULL`, so it is a
`required` column in the Iceberg metadata schema. However, in the generated
Avro manifest file, the partition tuple, which stores the value of the `group`
column by which the table is identity-partitioned, declares that value with
the Avro union type `["null", "string"]`.
As I understand the Iceberg spec, this is not correct.
The result type of an identity [partition
transform](https://iceberg.apache.org/spec/#partition-transforms) is the
source type, in this case `STRING NOT NULL`.
The section on [manifest files](https://iceberg.apache.org/spec/#manifests)
further states:
> Partition data tuple, schema based on the partition spec output
> using partition field ids for the struct field ids
Hence the schema of the partition tuple should be `"string"` and not
`["null","string"]`.
### Willingness to contribute
- [ ] I can contribute a fix for this bug independently
- [X] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time