[GitHub] [iceberg] flyrain commented on issue #1617: Support relative paths in Table Metadata

GitBox Mon, 12 Jul 2021 13:06:47 -0700


flyrain commented on issue #1617:
URL: https://github.com/apache/iceberg/issues/1617#issuecomment-878457209



   > Is there an easier way to create a full working replica of an Iceberg 
table where we do not use any files/data from the original table and the 2 
tables (original and the new) can live independently after the creation of the 
replica? 
   
   @pvary, ideally, table replication would involve neither a data file rewrite 
nor a metadata (manifest-list, manifest, metadata.json) rewrite. The process 
would be as simple as this: the user copies all the files needed, then changes 
the target table properties to reflect the new state. That isn't the case in 
reality, though.
   
   In this issue thread, we have been discussing two ways to replicate a table: 
1. relative paths, and 2. rebuilding the metadata files. Neither of them 
requires a data file rewrite. However, the relative-path approach requires only 
a minimal metadata file rewrite, probably only metadata.json per our 
discussion, whereas the metadata-rebuild approach involves rewriting all three 
types of metadata files: metadata.json, manifest-list, and manifest. Each type 
of file stores table information that cannot be recreated just by looking at 
the data files, for example, the partition spec in metadata.json and its id in 
the manifest file, and the snapshot-related metadata. 
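   
   To illustrate why the relative-path approach keeps the metadata rewrite 
small, here is a minimal sketch in plain Python. It is not Iceberg API: the 
helper names `relativize` and `resolve`, the bucket names, and the file paths 
are all hypothetical. The assumption is that metadata stores paths relative to 
the table location, so only the root location has to change after the copy.

```python
def relativize(absolute_path: str, table_location: str) -> str:
    """Strip the table location prefix from an absolute file path (hypothetical helper)."""
    prefix = table_location.rstrip("/") + "/"
    if not absolute_path.startswith(prefix):
        raise ValueError(f"{absolute_path} is not under {table_location}")
    return absolute_path[len(prefix):]


def resolve(relative_path: str, table_location: str) -> str:
    """Rebuild an absolute path from a relative path and a table location (hypothetical helper)."""
    return table_location.rstrip("/") + "/" + relative_path


# Hypothetical source and target table locations.
source_location = "s3://source-bucket/warehouse/db/tbl"
target_location = "s3://target-bucket/dr/db/tbl"

# A data file path as it might appear in a manifest of the source table.
data_file = source_location + "/data/part=1/00000-0.parquet"

# The stored relative path is identical for both tables, so the manifest
# entry itself would not need rewriting; only the root location differs.
relative = relativize(data_file, source_location)
replica_file = resolve(relative, target_location)
```

With absolute paths, by contrast, every entry that embeds the source location 
would have to be rewritten for the replica.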
   
   To your question, both source and target tables should be able to live 
independently after the replication. That's relatively easy to achieve. The 
hard part is enabling incremental sync-up between them and bidirectional 
replication, which are quite common DR (disaster recovery) use cases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


