Re: [PR] Add SqlCatalog _commit_table support [iceberg-python]

via GitHub Thu, 11 Jan 2024 20:33:01 -0800


HonahX commented on code in PR #265:
URL: https://github.com/apache/iceberg-python/pull/265#discussion_r1449791012



##########
pyiceberg/catalog/sql.py:
##########
@@ -329,8 +330,46 @@ def _commit_table(self, table_request: CommitTableRequest) 
-> CommitTableRespons
 
         Raises:
             NoSuchTableError: If a table with the given identifier does not 
exist.
+            CommitFailedException: If the commit failed.
         """
-        raise NotImplementedError
+        identifier_tuple = self.identifier_to_tuple_without_catalog(
+            tuple(table_request.identifier.namespace.root + 
[table_request.identifier.name])
+        )
+        current_table = self.load_table(identifier_tuple)
+        from_database_name, from_table_name = 
self.identifier_to_database_and_table(identifier_tuple, NoSuchTableError)

Review Comment:
   To me, the `from_*` pattern, typically used alongside `to_*`, suggests a 
transformation, as we see in `rename_table` where the database and table name 
shift from A to B. But in `_commit_table`, where the names remain unchanged, I 
think sticking to `database_name` and `table_name` might be more 
straightforward. What do you think?



##########
pyiceberg/catalog/sql.py:
##########
@@ -329,8 +330,46 @@ def _commit_table(self, table_request: CommitTableRequest) 
-> CommitTableRespons
 
         Raises:
             NoSuchTableError: If a table with the given identifier does not 
exist.
+            CommitFailedException: If the commit failed.
         """
-        raise NotImplementedError
+        identifier_tuple = self.identifier_to_tuple_without_catalog(
+            tuple(table_request.identifier.namespace.root + 
[table_request.identifier.name])
+        )
+        current_table = self.load_table(identifier_tuple)
+        from_database_name, from_table_name = 
self.identifier_to_database_and_table(identifier_tuple, NoSuchTableError)
+        base_metadata = current_table.metadata
+        for requirement in table_request.requirements:
+            requirement.validate(base_metadata)
+
+        updated_metadata = update_table_metadata(base_metadata, 
table_request.updates)
+        if updated_metadata == base_metadata:
+            # no changes, do nothing
+            return CommitTableResponse(metadata=base_metadata, 
metadata_location=current_table.metadata_location)
+
+        # write new metadata
+        new_metadata_version = 
self._parse_metadata_version(current_table.metadata_location) + 1
+        new_metadata_location = 
self._get_metadata_location(current_table.metadata.location, 
new_metadata_version)
+        self._write_metadata(updated_metadata, current_table.io, 
new_metadata_location)
+
+        with Session(self.engine) as session:
+            stmt = (
+                update(IcebergTables)
+                .where(
+                    IcebergTables.catalog_name == self.name,
+                    IcebergTables.table_namespace == from_database_name,
+                    IcebergTables.table_name == from_table_name,
+                    IcebergTables.metadata_location == 
current_table.metadata_location,
+                )
+                .values(metadata_location=new_metadata_location, 
previous_metadata_location=current_table.metadata_location)
+            )
+            result = session.execute(stmt)
+            if result.rowcount < 1:

Review Comment:
   ```suggestion
               if result.rowcount != 1:
   ```
   Shall we use `!=` to be more restrictive? This can help reflect that we 
expect exactly one row to be updated in a successful commit.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Re: [PR] Add SqlCatalog _commit_table support [iceberg-python]

Reply via email to