Re: [PR] WIP: Glue catalog commit [iceberg-python]

via GitHub Tue, 12 Dec 2023 00:52:37 -0800


nicor88 commented on code in PR #140:
URL: https://github.com/apache/iceberg-python/pull/140#discussion_r1423647501



##########
pyiceberg/catalog/glue.py:
##########
@@ -177,6 +191,23 @@ def _create_glue_table(self, database_name: str, table_name: str, table_input: T
         except self.glue.exceptions.EntityNotFoundException as e:
             raise NoSuchNamespaceError(f"Database {database_name} does not exist") from e
 
+    def _update_glue_table(self, database_name: str, table_name: str, table_input: TableInputTypeDef, version_id: str) -> None:
+        try:
+            self.glue.update_table(DatabaseName=database_name, TableInput=table_input, VersionId=version_id)

Review Comment:
   Every time a Glue table is updated, a new version is created, and the
   previous versions are retained by default. The number of table versions per
   AWS account is limited, and I've seen that limit reached many times,
   specifically when using Iceberg. See this issue:
   https://github.com/dbt-athena/dbt-athena/issues/524 and this PR:
   https://github.com/dbt-athena/dbt-athena/pull/522
   
   Have you considered setting `SkipArchive` to True by default? (Refer to the
   boto3
   [docs](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/glue/client/update_table.html#Glue.Client.update_table).)
   Alternatively, you could give the end user control over this parameter.
   
   Previous table versions are only relevant for debugging, e.g. spotting what
   the old metadata location was. They are not really helpful for operations
   like snapshot rollback, for which you need to use Spark anyway.
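   
   A minimal sketch of the suggestion above, assuming `glue` is a boto3 Glue
   client (e.g. `boto3.client("glue")`). The function name `update_glue_table`,
   the `skip_archive` argument, and the `RecordingGlue` stand-in are
   hypothetical and not part of the PR; only the `update_table` keyword
   arguments come from the boto3 API.

```python
from typing import Any, Dict, List


def update_glue_table(
    glue: Any,
    database_name: str,
    table_input: Dict[str, Any],
    version_id: str,
    skip_archive: bool = True,
) -> None:
    """Hypothetical variant of the PR's _update_glue_table (illustration only)."""
    glue.update_table(
        DatabaseName=database_name,
        TableInput=table_input,
        VersionId=version_id,
        # When SkipArchive is True, Glue does not create an archived
        # version of the table before applying the update.
        SkipArchive=skip_archive,
    )


class RecordingGlue:
    """Stand-in for a Glue client; it only records the keyword arguments."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []

    def update_table(self, **kwargs: Any) -> None:
        self.calls.append(kwargs)


# Dry run against the stand-in client, so no AWS credentials are needed.
fake = RecordingGlue()
update_glue_table(fake, "my_db", {"Name": "my_table"}, version_id="1")
print(fake.calls[0]["SkipArchive"])
```

   Exposing `skip_archive` as an argument keeps the default cheap on table
   versions while still letting a caller opt back into archiving.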
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

