HonahX commented on code in PR #265: URL: https://github.com/apache/iceberg-python/pull/265#discussion_r1449791012
########## pyiceberg/catalog/sql.py: ########## @@ -329,8 +330,46 @@ def _commit_table(self, table_request: CommitTableRequest) -> CommitTableRespons Raises: NoSuchTableError: If a table with the given identifier does not exist. + CommitFailedException: If the commit failed. """ - raise NotImplementedError + identifier_tuple = self.identifier_to_tuple_without_catalog( + tuple(table_request.identifier.namespace.root + [table_request.identifier.name]) + ) + current_table = self.load_table(identifier_tuple) + from_database_name, from_table_name = self.identifier_to_database_and_table(identifier_tuple, NoSuchTableError) Review Comment: To me, the `from_*` pattern, typically used alongside `to_*`, suggests a transformation, as we see in `rename_table` where the database and table name shift from A to B. But in `_commit_table`, where the names remain unchanged, I think sticking to `database_name` and `table_name` might be more straightforward. What do you think? ########## pyiceberg/catalog/sql.py: ########## @@ -329,8 +330,46 @@ def _commit_table(self, table_request: CommitTableRequest) -> CommitTableRespons Raises: NoSuchTableError: If a table with the given identifier does not exist. + CommitFailedException: If the commit failed. """ - raise NotImplementedError + identifier_tuple = self.identifier_to_tuple_without_catalog( + tuple(table_request.identifier.namespace.root + [table_request.identifier.name]) + ) + current_table = self.load_table(identifier_tuple) + from_database_name, from_table_name = self.identifier_to_database_and_table(identifier_tuple, NoSuchTableError) + base_metadata = current_table.metadata + for requirement in table_request.requirements: + requirement.validate(base_metadata) + + updated_metadata = update_table_metadata(base_metadata, table_request.updates) + if updated_metadata == base_metadata: + # no changes, do nothing + return CommitTableResponse(metadata=base_metadata, metadata_location=current_table.metadata_location) + + # write new metadata + new_metadata_version = self._parse_metadata_version(current_table.metadata_location) + 1 + new_metadata_location = self._get_metadata_location(current_table.metadata.location, new_metadata_version) + self._write_metadata(updated_metadata, current_table.io, new_metadata_location) + + with Session(self.engine) as session: + stmt = ( + update(IcebergTables) + .where( + IcebergTables.catalog_name == self.name, + IcebergTables.table_namespace == from_database_name, + IcebergTables.table_name == from_table_name, + IcebergTables.metadata_location == current_table.metadata_location, + ) + .values(metadata_location=new_metadata_location, previous_metadata_location=current_table.metadata_location) + ) + result = session.execute(stmt) + if result.rowcount < 1: Review Comment: ```suggestion if result.rowcount != 1: ``` Shall we use `!=` to be more restrictive? This can help reflect that we expect exactly one row to be updated in a successful commit. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org