johnclara opened a new issue #1944:
URL: https://github.com/apache/iceberg/issues/1944


   The TableMetadata for one of our tables was somehow referencing a missing 
ManifestList (this happened well before any changes to path-related table 
properties). We are still root-causing how the corrupt state occurred.
   
   Because of the corrupt state, we thought we shouldn't use the normal 
rollback. It looks like rollback will commit snapshots, which would keep the 
corrupt snapshot in the snapshot history. (We're not 100% sure of the code path 
for this.)
   
   Since ManifestLists are lazily evaluated, our control plane was able to 
continue making property updates and purging snapshots. This meant the 
MetadataLog grew past 100 entries (the default retention size).
   
   We recursively looped back through the TableMetadata files listed in each 
TableMetadata's MetadataLog until we found the entry whose path matches the 
TableMetadata that references the missing ManifestList.
   
   Then we updated the Metastore to reference the path of the MetadataLog entry 
immediately before the broken MetadataLog entry. Afterwards, we reset upstream 
state (Kafka offsets) to before the time this entry was created.
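
   A minimal sketch of the walk described above, in Python. It assumes each 
TableMetadata file is JSON with a `metadata-log` list ordered oldest to newest, 
where every entry carries a `metadata-file` path. The function name and the 
`load` callback are hypothetical, not an Iceberg API, and the sketch only 
locates the predecessor file; it does not touch the Metastore.

```python
from typing import Callable, Dict, Optional


def previous_metadata_file(
    load: Callable[[str], Dict],  # hypothetical: returns parsed metadata JSON for a path
    current_path: str,
    broken_path: str,
) -> Optional[str]:
    """Return the metadata file written immediately before broken_path, or None."""
    path: Optional[str] = current_path
    seen = set()  # guard against cycles in a damaged log
    while path is not None and path not in seen:
        seen.add(path)
        # Assumed layout: "metadata-log" is ordered oldest -> newest.
        log = [e["metadata-file"] for e in load(path).get("metadata-log", [])]
        if path == broken_path:
            # The newest entry in the broken file's own log is its predecessor.
            return log[-1] if log else None
        if broken_path in log:
            i = log.index(broken_path)
            if i > 0:
                return log[i - 1]
            # Predecessor was trimmed from this log; read the broken file's own log.
            path = broken_path
            continue
        # Broken entry is older than this log reaches: jump to the oldest entry.
        path = log[0] if log else None
    return None
```

   With a `load` that reads the JSON from the table's storage, the returned 
path would be the candidate to point the Metastore at, after checking that the 
manifest lists it references still exist.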
   
   Should catalogs support this type of operation? Or should Iceberg assume 
state will never get corrupted and only support auditable rollback?

