HonahX commented on code in PR #11947:
URL: https://github.com/apache/iceberg/pull/11947#discussion_r1915460216
##########
core/src/main/java/org/apache/iceberg/TableMetadataParser.java:
##########
@@ -352,6 +352,7 @@ public static TableMetadata fromJson(String metadataLocation, JsonNode node) {
ImmutableList.Builder<Schema> builder = ImmutableList.builder();
for (JsonNode schemaNode : schemaArray) {
Schema current = SchemaParser.fromJson(schemaNode);
+ Schema.checkCompatibility(current, formatVersion);
Review Comment:
I’ve given this some thought.
I think the compatibility-checking requirements of the two code paths are
different. When parsing metadata from a JSON file, we need to perform
compatibility checks for every schema in the metadata. However, when building
new metadata from an existing one, we only need to check the compatibility of
newly added schemas, as the existing schemas in the `TableMetadata` object can
be trusted.
The `TableMetadata` constructor is also called directly here:
https://github.com/apache/iceberg/blob/2551587bf9d340507c2a4ca8ee355ee43c02383c/core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java#L107-L109
For this and similar use cases in the future, I think we do not need
additional compatibility checks.
In general, compatibility checks are expensive because they require
iterating through fields, and the cost will increase as we add more fields and
features to schemas in v4, v5, and beyond. Therefore, I think it’s better to
minimize the number of checks whenever possible.
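To make the distinction concrete, here is a minimal sketch of the two paths. It is only an illustration: `SimpleSchema`, `Field`, `parseAll`, `addSchema`, and the `fieldsVisited` counter are hypothetical stand-ins, not Iceberg classes, and the per-field `minFormatVersion` rule is an assumed simplification of what a real compatibility check does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; none of these are Iceberg types.
class CompatibilityCheckSketch {

  static final class Field {
    final String name;
    final int minFormatVersion; // assumed rule: lowest table version that allows this field

    Field(String name, int minFormatVersion) {
      this.name = name;
      this.minFormatVersion = minFormatVersion;
    }
  }

  static final class SimpleSchema {
    final int id;
    final List<Field> fields;

    SimpleSchema(int id, Field... fields) {
      this.id = id;
      this.fields = Arrays.asList(fields);
    }
  }

  // Counts visited fields so the cost of each path is visible.
  static int fieldsVisited = 0;

  // The check walks every field, so its cost grows with schema size.
  static void checkCompatibility(SimpleSchema schema, int formatVersion) {
    for (Field field : schema.fields) {
      fieldsVisited++;
      if (field.minFormatVersion > formatVersion) {
        throw new IllegalStateException(
            "Field " + field.name + " requires format version " + field.minFormatVersion
                + " but the table is v" + formatVersion);
      }
    }
  }

  // Parse path: nothing read from a metadata file is trusted, so check every schema.
  static List<SimpleSchema> parseAll(List<SimpleSchema> parsed, int formatVersion) {
    for (SimpleSchema schema : parsed) {
      checkCompatibility(schema, formatVersion);
    }
    return new ArrayList<>(parsed);
  }

  // Builder path: existing schemas were validated earlier, so check only the new one.
  static List<SimpleSchema> addSchema(
      List<SimpleSchema> trusted, SimpleSchema added, int formatVersion) {
    checkCompatibility(added, formatVersion);
    List<SimpleSchema> result = new ArrayList<>(trusted);
    result.add(added);
    return result;
  }

  public static void main(String[] args) {
    List<SimpleSchema> fromFile = Arrays.asList(
        new SimpleSchema(0, new Field("id", 1), new Field("data", 1)),
        new SimpleSchema(1, new Field("id", 1), new Field("data", 1), new Field("ts", 1)));

    List<SimpleSchema> trusted = parseAll(fromFile, 2);
    System.out.println("fields visited while parsing: " + fieldsVisited);

    fieldsVisited = 0;
    addSchema(trusted, new SimpleSchema(2, new Field("id", 1)), 2);
    System.out.println("fields visited while adding one schema: " + fieldsVisited);
  }
}
```

In this sketch the parse path visits all five fields across the two stored schemas, while adding a one-field schema through the builder path visits only that one field, regardless of how many schemas the table already carries.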
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]