dramaticlly commented on code in PR #12228:
URL: https://github.com/apache/iceberg/pull/12228#discussion_r2036236652


##########
core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java:
##########
@@ -71,23 +70,35 @@ public Table loadTable(TableIdentifier identifier) {
   }
 
   @Override
-  public Table registerTable(TableIdentifier identifier, String metadataFileLocation) {
+  public Table registerTable(
+      TableIdentifier identifier, String metadataFileLocation, boolean overwrite) {
     Preconditions.checkArgument(
         identifier != null && isValidIdentifier(identifier), "Invalid identifier: %s", identifier);
     Preconditions.checkArgument(
         metadataFileLocation != null && !metadataFileLocation.isEmpty(),
         "Cannot register an empty metadata file location as a table");
 
-    // Throw an exception if this table already exists in the catalog.
-    if (tableExists(identifier)) {
+    // If the table already exists and overwriting is disabled, throw an exception.
+    if (tableExists(identifier) && !overwrite) {
       throw new AlreadyExistsException("Table already exists: %s", identifier);
     }
 
     TableOperations ops = newTableOps(identifier);
-    InputFile metadataFile = ops.io().newInputFile(metadataFileLocation);
-    TableMetadata metadata = TableMetadataParser.read(ops.io(), metadataFile);
-    ops.commit(null, metadata);
-
+    TableMetadata newMetadata =
+        TableMetadataParser.read(ops.io(), ops.io().newInputFile(metadataFileLocation));
+
+    TableMetadata existing = ops.current();
+    if (existing != null && overwrite) {
+      if (existing.metadataFileLocation().equals(metadataFileLocation)) {
+        LOG.info(
+            "The requested metadata matches the existing metadata. No changes will be committed.");
+        return new BaseTable(ops, fullTableName(name(), identifier), metricsReporter());
+      }
+      dropTable(identifier, false /* Keep all data and metadata files */);

Review Comment:
   Thank you @guykhazma. I don't think there is a clear semantic expectation that register-table with overwrite must complete atomically in the REST API. The table specification states that
   > Table state is maintained in metadata files. All changes to table state create a new metadata file and replace the old metadata with an atomic swap.
   
   IMO, overwriting table metadata is not a valid state change but rather a state overwrite, where the new state can even come from another table with a different table UUID. I had an offline discussion with @RussellSpitzer about this, and we agree that this can be catalog-implementation specific, which leaves room for an atomic swap if the catalog can support it.
   
   As for your proposed alternative approach, I think we could write multiple metadata.json files to the file system first and rely on the catalog for the atomic swap, but we might hit the same limitation in the TableOperations API: the new metadata.json is rewritten under a file name different from the input location, which makes the result difficult to verify.
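
   To make the distinction concrete, here is a minimal sketch of the two shapes discussed above, assuming a toy catalog that is nothing more than a map from table name to current metadata file location. `ToyPointerCatalog`, `overwriteByDropAndCreate`, and `overwriteBySwap` are hypothetical names for illustration only; none of them are Iceberg APIs, and a real catalog would have its own way (if any) of swapping the pointer.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical toy catalog: table name -> current metadata file location. Not an Iceberg API. */
final class ToyPointerCatalog {
  private final ConcurrentMap<String, String> pointers = new ConcurrentHashMap<>();

  /** Returns the metadata location the catalog currently points at, if the table exists. */
  Optional<String> current(String table) {
    return Optional.ofNullable(pointers.get(table));
  }

  /**
   * Drop-then-register shape. Between the two steps the table does not exist,
   * so a concurrent reader can observe a missing table.
   */
  void overwriteByDropAndCreate(String table, String newLocation) {
    pointers.remove(table);
    // A concurrent current(table) call here would see Optional.empty().
    pointers.put(table, newLocation);
  }

  /**
   * Swap shape. The pointer moves only if it still equals the expected location,
   * in one atomic step. A null expected location means "table must not exist yet".
   * Returns false when another writer changed the pointer first.
   */
  boolean overwriteBySwap(String table, String expectedLocation, String newLocation) {
    if (expectedLocation == null) {
      return pointers.putIfAbsent(table, newLocation) == null;
    }
    return pointers.replace(table, expectedLocation, newLocation);
  }
}
```

   Note that the swap only compares pointer values. It does not check that the new metadata descends from the old one, which matches the point that the new state may come from a table with a different UUID.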



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

