Blazer-007 commented on code in PR #4145:
URL: https://github.com/apache/gobblin/pull/4145#discussion_r2432604911
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/iceberg/IcebergTable.java:
##########
@@ -315,4 +317,29 @@ protected void overwritePartition(List<DataFile> dataFiles, String partitionColN
log.info("~{}~ SnapshotId after overwrite: {}", tableId, accessTableMetadata().currentSnapshot().snapshotId());
}
+ /**
+ * update table's schema to the provided {@link Schema}
+ * @param updatedSchema the updated schema to be set on the table.
+ * @throws TableNotFoundException if the table does not exist.
+ */
+ public void updateSchema(Schema updatedSchema) throws TableNotFoundException {
+ TableMetadata currentTableMetadata = accessTableMetadata();
+ Schema currentSchema = currentTableMetadata.schema();
+
+ if (currentSchema.sameSchema(updatedSchema)) {
+ log.info("~{}~ schema is already up-to-date", tableId);
+ return;
+ }
+
+ log.info("~{}~ updating schema from {} to {}", tableId, currentSchema, updatedSchema);
+
+ TableMetadata updatedTableMetadata = currentTableMetadata.updateSchema(updatedSchema, updatedSchema.highestFieldId());
+ Preconditions.checkArgument(updatedTableMetadata.schema().sameSchema(updatedSchema), "Schema mismatch after update, please check destination table");
+
+ tableOps.commit(currentTableMetadata, updatedTableMetadata);
+ tableOps.refresh();
+
+ log.info("~{}~ schema updated successfully", tableId);
Review Comment:
Let's not update the schema here. We haven't copied the files yet, so
updating the schema at this point will leave the table in an unwanted state
if the file copy fails for any reason.
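
A minimal sketch of the ordering suggested above: copy first, and touch the
schema only once the copy has succeeded. Everything in it (`FakeTable`,
`copyFiles`, `copyThenCommit`) is a hypothetical stand-in for illustration,
not a Gobblin or Iceberg API, and the real change would have to fit the
existing copy and commit flow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names exist in Gobblin or Iceberg.
class DeferredSchemaUpdateSketch {

  /** Stand-in for the destination table; tracks only a schema label and registered files. */
  static class FakeTable {
    String schema;
    final List<String> files = new ArrayList<>();

    FakeTable(String schema) {
      this.schema = schema;
    }
  }

  /** Stand-in for the file copy step; throws when failCopy is true to simulate a copy error. */
  static List<String> copyFiles(List<String> sources, boolean failCopy) {
    if (failCopy) {
      throw new IllegalStateException("simulated copy failure");
    }
    return new ArrayList<>(sources);
  }

  /** Copies first; updates the schema and registers files only after the copy succeeded. */
  static void copyThenCommit(FakeTable dest, String newSchema, List<String> sources, boolean failCopy) {
    // May throw; the destination table has not been modified at this point.
    List<String> copied = copyFiles(sources, failCopy);
    // Reached only when the copy succeeded, so a failed copy leaves the schema as it was.
    dest.schema = newSchema;
    dest.files.addAll(copied);
  }

  public static void main(String[] args) {
    FakeTable dest = new FakeTable("v1");
    try {
      copyThenCommit(dest, "v2", Arrays.asList("a.parquet"), true);
    } catch (IllegalStateException e) {
      System.out.println("copy failed, schema is still " + dest.schema);
    }
    copyThenCommit(dest, "v2", Arrays.asList("a.parquet"), false);
    System.out.println("copy succeeded, schema is now " + dest.schema);
  }
}
```

With this ordering, a failed copy leaves the destination schema unchanged, so
a retry starts from a consistent table.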