guykhazma commented on code in PR #12228:
URL: https://github.com/apache/iceberg/pull/12228#discussion_r2040032952


##########
core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java:
##########
@@ -71,23 +70,35 @@ public Table loadTable(TableIdentifier identifier) {
   }
 
   @Override
-  public Table registerTable(TableIdentifier identifier, String metadataFileLocation) {
+  public Table registerTable(
+      TableIdentifier identifier, String metadataFileLocation, boolean overwrite) {
     Preconditions.checkArgument(
         identifier != null && isValidIdentifier(identifier), "Invalid identifier: %s", identifier);
     Preconditions.checkArgument(
         metadataFileLocation != null && !metadataFileLocation.isEmpty(),
         "Cannot register an empty metadata file location as a table");
 
-    // Throw an exception if this table already exists in the catalog.
-    if (tableExists(identifier)) {
+    // If the table already exists and overwriting is disabled, throw an exception.
+    if (tableExists(identifier) && !overwrite) {
       throw new AlreadyExistsException("Table already exists: %s", identifier);
     }
 
     TableOperations ops = newTableOps(identifier);
-    InputFile metadataFile = ops.io().newInputFile(metadataFileLocation);
-    TableMetadata metadata = TableMetadataParser.read(ops.io(), metadataFile);
-    ops.commit(null, metadata);
-
+    TableMetadata newMetadata =
+        TableMetadataParser.read(ops.io(), ops.io().newInputFile(metadataFileLocation));
+
+    TableMetadata existing = ops.current();
+    if (existing != null && overwrite) {
+      if (existing.metadataFileLocation().equals(metadataFileLocation)) {
+        LOG.info(
+            "The requested metadata matches the existing metadata. No changes will be committed.");
+        return new BaseTable(ops, fullTableName(name(), identifier), metricsReporter());
+      }
+      dropTable(identifier, false /* Keep all data and metadata files */);

Review Comment:
   @dramaticlly Could you elaborate on why you see this as a state overwrite?
   
   I can imagine a scenario where some state or partial state is transferred 
from another table (with a different UUID), which might be interpreted as a 
state change. However, I’m not sure the register API is the appropriate 
mechanism for that.
   
   From my understanding, the purpose of the register API is to create a named 
reference to a metadata JSON. It doesn’t inherently imply any change to the 
actual state of a table. Even if you register it against an existing table and 
the resulting metadata reflects a different state, it doesn’t mean that the 
underlying storage state has changed.
   
   For instance, the same table can be registered multiple times under 
different names, each pointing at a distinct metadata file, which effectively 
simulates branching through separate catalog entries.
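   
   To make the point concrete, here is a toy sketch of how I think about it. 
This is not Iceberg code: `ToyCatalog`, `register`, and the paths are all made 
up for illustration. It models the catalog as a map from table name to 
metadata file location, so registering (even with overwrite) only moves a 
pointer and never touches the files in storage.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only, NOT Iceberg's implementation. All names are hypothetical.
class ToyCatalog {
  // Stand-in for object storage: metadata file location -> file contents.
  final Map<String, String> storage = new HashMap<>();
  // Catalog entries: table name -> location of the referenced metadata JSON.
  final Map<String, String> entries = new HashMap<>();

  // Creates (or, with overwrite, repoints) a named reference to a metadata file.
  void register(String name, String metadataLocation, boolean overwrite) {
    if (!storage.containsKey(metadataLocation)) {
      throw new IllegalArgumentException("No metadata file at: " + metadataLocation);
    }
    if (entries.containsKey(name) && !overwrite) {
      throw new IllegalStateException("Table already exists: " + name);
    }
    // Only the pointer changes; nothing in storage is written or deleted.
    entries.put(name, metadataLocation);
  }

  public static void main(String[] args) {
    ToyCatalog catalog = new ToyCatalog();
    String v1 = "s3://bucket/events/metadata/v1.metadata.json";
    String v2 = "s3://bucket/events/metadata/v2.metadata.json";
    catalog.storage.put(v1, "{\"snapshot\": 1}");
    catalog.storage.put(v2, "{\"snapshot\": 2}");
    Map<String, String> storageBefore = new HashMap<>(catalog.storage);

    // Same table registered under two names with distinct metadata files.
    catalog.register("db.events", v2, false);
    catalog.register("db.events_experiment", v1, false);

    // Overwriting an existing entry repoints it without touching storage.
    catalog.register("db.events", v1, true);

    System.out.println("entries: " + catalog.entries);
    System.out.println("storage unchanged: " + storageBefore.equals(catalog.storage));
  }
}
```

   In this model the two names behave like branch heads over the same files, 
and the overwrite is just a pointer swap. Whether a real catalog should treat 
that swap as a state change is exactly the question here.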



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
