JanKaul commented on issue #6420:
URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1426863641

   It would be great if we could make it a catalog-specific decision. But for 
that, the metadata has to be designed to support both strategies.
   
   I think the question is: what do we use as a unique identifier for the 
storage table in the representation of the common view?
   
   One approach is to use **table_name, namespace and catalog** as the unique 
identifier. But this only works if the storage table is registered in the 
catalog. Furthermore, if the storage table is not registered in the catalog, 
an atomic swap of its metadata file cannot be guaranteed.
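
   To make the first approach concrete, here is a minimal sketch. It is not 
Iceberg code: `CatalogIdentifier`, `InMemoryCatalog` and the example path are 
hypothetical names I made up for illustration. It only shows that the 
identifier can be resolved when the storage table is registered in the catalog.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CatalogIdentifier:
    """Hypothetical reference to a storage table by catalog, namespace and name."""
    catalog: str
    namespace: Tuple[str, ...]
    table_name: str


class InMemoryCatalog:
    """Toy catalog (an assumption, not a real Iceberg catalog).

    Maps (namespace, table_name) to the current metadata file location.
    """

    def __init__(self, name: str) -> None:
        self.name = name
        self._tables: Dict[Tuple[Tuple[str, ...], str], str] = {}

    def register(self, namespace: Tuple[str, ...], table_name: str,
                 metadata_location: str) -> None:
        self._tables[(tuple(namespace), table_name)] = metadata_location

    def resolve(self, ident: CatalogIdentifier) -> Optional[str]:
        # Returns None when the table is not registered in this catalog.
        if ident.catalog != self.name:
            return None
        return self._tables.get((ident.namespace, ident.table_name))
```

   A view holding `CatalogIdentifier("prod", ("db",), "mv_storage")` can only 
find its storage table if `register` was called for that name; for an 
unregistered table `resolve` returns `None`.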
   
   Another approach would be to use the **storage table metadata file 
location** as the unique identifier. This makes using the catalog a bit more 
awkward because finding the storage table in the catalog means filtering its 
tables by metadata file location. But this approach wouldn't require the 
storage table to be registered in the catalog. And it would enable atomic 
transactions on the storage table by atomically changing the storage table 
metadata file location. This is what I meant by commit procedure.
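
   A rough sketch of such a commit procedure, under my own assumptions: the 
view representation stores the storage table metadata file location, and a 
commit is a compare-and-swap on that one field. `ViewRepresentation` and 
`ViewStore` are hypothetical names, and the lock only stands in for whatever 
atomic primitive the backing store would provide.

```python
import threading
from dataclasses import dataclass


@dataclass
class ViewRepresentation:
    """Hypothetical view metadata pointing at the storage table metadata file."""
    storage_metadata_location: str


class ViewStore:
    """Toy holder for one view; the lock models an atomic compare-and-swap."""

    def __init__(self, view: ViewRepresentation) -> None:
        self._view = view
        self._lock = threading.Lock()

    def current_location(self) -> str:
        with self._lock:
            return self._view.storage_metadata_location

    def commit(self, expected: str, new: str) -> bool:
        # Swap the pointer only if nobody else committed since `expected` was read.
        with self._lock:
            if self._view.storage_metadata_location != expected:
                return False
            self._view.storage_metadata_location = new
            return True
```

   A writer reads `current_location()`, writes a new metadata file, and then 
calls `commit(old, new)`. If another writer got there first, `commit` returns 
`False` and the writer has to retry against the new location.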


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

