RussellSpitzer commented on code in PR #10835:
URL: https://github.com/apache/iceberg/pull/10835#discussion_r1700372842
##########
format/spec.md:
##########
@@ -241,7 +241,9 @@ Struct evolution requires the following rules for default values:
#### Column Projection
-Columns in Iceberg data files are selected by field id. The table schema's column names and order may change after a data file is written, and projection must be done using field ids. If a field id is missing from a data file, its value for each row should be `null`.
+Columns in Iceberg data files are selected by field id. The table schema's column names and order may change after a data file is written, and projection must be done using field ids.
+
+When a projected column has an [identity partition transform](#partition-transforms) applied to it for a data file, the value from the [manifest file](#manifests) must be used for that column (i.e. the column should not be read from the data file). This is to support tables that were migrated from other table formats (notably Hive) that do not write partition values to data files. Otherwise, if a field id is missing from a data file, its value for each row should be `null`.
Review Comment:
I'm not sure I understand the wording here.
Can we be more explicit? Something like:
"Values for field ids which are not present in a data file **must** be
projected as:
1. The value in the partition metadata, if an identity transform exists for
that field, or
2. The default value as defined in [Default values], or
3. `null`"
^ Just a thought though, I'm open to other wordings or presentations.
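
To make the precedence I have in mind concrete, here is a rough Python sketch. It is not Iceberg code: the function name and both dictionaries are hypothetical stand-ins (one for the identity-partition values taken from the manifest, one for the schema's default values), and it only covers the case where the field id is absent from the data file.

```python
from typing import Any, Dict


def project_missing_field(
    field_id: int,
    identity_partition_values: Dict[int, Any],
    default_values: Dict[int, Any],
) -> Any:
    """Resolve the value of a field id that is not present in a data file.

    Hypothetical helper for illustration only; both dicts are keyed by
    field id and are assumptions, not part of any Iceberg API.
    """
    # 1. An identity transform exists for the field: use the partition value.
    if field_id in identity_partition_values:
        return identity_partition_values[field_id]
    # 2. Otherwise use the field's default value, if one is defined.
    if field_id in default_values:
        return default_values[field_id]
    # 3. Otherwise the value is null.
    return None
```

The point of writing it as an ordered chain is that a reader of the spec can see which rule applies when more than one could, e.g. a field that has both an identity partition value and a default.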
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]