shardulm94 opened a new pull request, #6327:
URL: https://github.com/apache/iceberg/pull/6327
Closes #4604
This is an alternative, and arguably simpler, implementation to #4599. The
issue is that the ORC read path did not support projecting nested structs in
which only partition columns are selected. As part of the ORC read path, we
drop constant fields and metadata columns from the projected schema before
passing it to the ORC file reader.
Example:
```
Schema readSchemaWithoutConstantAndMetadataFields =
    TypeUtil.selectNot(
        readSchema,
        Sets.union(idToConstant.keySet(), MetadataColumns.metadataFieldIds()));
```
This step also drops structs that contain only partition columns, because
they become empty once those columns are removed.
#4599 tries to fix this by not dropping nested structs that contain partition
columns, so the partition values are read from the file. This PR takes a
different approach: it preserves empty structs when dropping constant fields.
This allows the existing constant handling in the ORC read path to work as
expected even for nested partition fields.
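
The difference between the two pruning behaviors can be illustrated with a
small self-contained sketch. It does not use Iceberg's classes: `Field`,
`primitive`, `struct`, the `keepEmptyStructs` flag, and this `selectNot` are
hypothetical stand-ins written for illustration only, and the actual change in
this PR may be structured differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy schema model, NOT Iceberg's API. All names here are hypothetical.
final class EmptyStructSketch {
  static final class Field {
    final int id;
    final String name;
    final List<Field> children; // null for primitive fields

    Field(int id, String name, List<Field> children) {
      this.id = id;
      this.name = name;
      this.children = children;
    }

    boolean isStruct() {
      return children != null;
    }

    @Override
    public String toString() {
      return isStruct() ? name + children : name;
    }
  }

  static Field primitive(int id, String name) {
    return new Field(id, name, null);
  }

  static Field struct(int id, String name, Field... children) {
    return new Field(id, name, Arrays.asList(children));
  }

  // Removes every field whose ID is in dropIds. A struct left without
  // children is pruned as well, unless keepEmptyStructs is true.
  static List<Field> selectNot(
      List<Field> fields, Set<Integer> dropIds, boolean keepEmptyStructs) {
    List<Field> kept = new ArrayList<>();
    for (Field field : fields) {
      if (dropIds.contains(field.id)) {
        continue;
      }
      if (!field.isStruct()) {
        kept.add(field);
        continue;
      }
      List<Field> children = selectNot(field.children, dropIds, keepEmptyStructs);
      if (!children.isEmpty() || keepEmptyStructs) {
        kept.add(new Field(field.id, field.name, children));
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // "region" (ID 3) plays the role of a partition column nested in a struct.
    List<Field> readSchema = Arrays.asList(
        primitive(1, "id"),
        struct(2, "location", primitive(3, "region")));
    Set<Integer> constantIds = new HashSet<>(Arrays.asList(3));

    System.out.println(selectNot(readSchema, constantIds, false)); // [id]
    System.out.println(selectNot(readSchema, constantIds, true));  // [id, location[]]
  }
}
```

With the empty `location` struct kept in the schema, a reader still has a
place to attach the constant value for `region`. In the first case the struct
is gone entirely and there is nothing to attach it to.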