autumnust commented on code in PR #6327:
URL: https://github.com/apache/iceberg/pull/6327#discussion_r1042526245
##########
orc/src/main/java/org/apache/iceberg/orc/ORCSchemaUtil.java:
##########
@@ -442,4 +445,23 @@ static TypeDescription applyNameMapping(TypeDescription orcSchema, NameMapping n
public static Map<Integer, String> idToOrcName(Schema schema) {
return TypeUtil.visit(schema, new IdToOrcName());
}
+
+ /**
+ * Returns a {@link Schema} which has constant fields and metadata fields removed from the
+ * provided schema. This utility can be used to create a "read schema" which can be passed to the
+ * ORC file reader and hence avoiding deserialization and memory costs associated with column
+ * values already available through Iceberg metadata.
+ *
+ * <p>NOTE: This method, unlike {@link TypeUtil#selectNot(Schema, Set)}, preserves empty structs
+ * (caused due to a struct having all constant fields) so that Iceberg ORC readers can later add
+ * constant fields in these structs
Review Comment:
nit: The Javadoc doesn't need to mention the cause of the empty structs, since there may be
other scenarios, such as a struct that is intentionally empty in the schema?
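
To illustrate the point, here is a minimal sketch of the "remove by ID but keep empty structs" behavior the Javadoc describes. It does not use Iceberg's API: `Field`, `PruneSketch`, and `removeIds` are hypothetical names invented for this example, and the real method in the PR may be structured quite differently. It only shows that a struct can come out empty either because all of its children were removed or because it was empty in the schema to begin with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a schema field; NOT Iceberg's Types.NestedField.
final class Field {
  final int id;
  final String name;
  final List<Field> children; // null for a primitive, possibly empty for a struct

  Field(int id, String name, List<Field> children) {
    this.id = id;
    this.name = name;
    this.children = children;
  }

  boolean isStruct() {
    return children != null;
  }
}

final class PruneSketch {
  // Drops every field whose ID is in idsToRemove, but keeps a struct even when
  // it ends up with no children, or had none in the first place.
  static List<Field> removeIds(List<Field> fields, Set<Integer> idsToRemove) {
    List<Field> kept = new ArrayList<>();
    for (Field f : fields) {
      if (idsToRemove.contains(f.id)) {
        continue; // e.g. a constant or metadata field (assumption for this sketch)
      }
      if (f.isStruct()) {
        kept.add(new Field(f.id, f.name, removeIds(f.children, idsToRemove)));
      } else {
        kept.add(f);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Field> schema = Arrays.asList(
        new Field(1, "id", null),
        // becomes empty because its only child is removed
        new Field(2, "meta", Arrays.asList(new Field(3, "const", null))),
        // intentionally empty in the schema
        new Field(4, "empty", Collections.<Field>emptyList()));

    Set<Integer> idsToRemove = new HashSet<>(Arrays.asList(3));
    for (Field f : removeIds(schema, idsToRemove)) {
      String shape = f.isStruct() ? "struct with " + f.children.size() + " children" : "primitive";
      System.out.println(f.name + ": " + shape);
    }
  }
}
```

In this sketch both `meta` and `empty` survive as structs with no children, which is why wording the Javadoc around a single cause may be narrower than the behavior.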
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]