JanKaul commented on issue #6420: URL: https://github.com/apache/iceberg/issues/6420#issuecomment-1426863641
It would be great if we could make it a catalog-specific decision, but for that the metadata has to be designed to support both strategies. I think the question is: what do we use as a unique identifier for the storage table in the representation of the common view?

One approach is to use **table_name, namespace and catalog** as the unique identifier. But this only works if the storage table is registered in the catalog. Furthermore, if the storage table is not registered in the catalog, an atomic swap of the storage table metadata file cannot be guaranteed.

Another approach is to use the **storage table metadata file location** as the unique identifier. This makes using the catalog a bit more awkward, because the catalog's tables have to be filtered by metadata file location. But this approach wouldn't require the storage table to be registered in the catalog, and it would enable atomic transactions on the storage table by making atomic changes to the storage table metadata file location. This is what I meant by commit procedure.
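
To make the two identifier strategies concrete, here is a rough sketch in Python. It is not Iceberg code: `ToyCatalog`, `ToyView`, `CatalogIdentifier`, `MetadataLocation` and the `v1.json`-style locations are all hypothetical names invented for illustration, and the lock-guarded compare-and-swap only stands in for whatever atomic commit a real catalog or view implementation would provide.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class CatalogIdentifier:
    """Strategy 1 (hypothetical type): catalog, namespace and table name."""
    catalog: str
    namespace: str
    table_name: str


@dataclass(frozen=True)
class MetadataLocation:
    """Strategy 2 (hypothetical type): the storage table metadata file location."""
    location: str


class ToyCatalog:
    """Toy catalog mapping (namespace, table_name) to the current metadata file."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._lock = threading.Lock()
        self._tables: Dict[Tuple[str, str], str] = {}

    def register(self, namespace: str, table_name: str, location: str) -> None:
        with self._lock:
            self._tables[(namespace, table_name)] = location

    def current_metadata(self, ident: CatalogIdentifier) -> Optional[str]:
        # Strategy 1 lookup: only resolves tables registered in this catalog.
        if ident.catalog != self.name:
            return None
        with self._lock:
            return self._tables.get((ident.namespace, ident.table_name))

    def swap(self, ident: CatalogIdentifier, expected: str, new: str) -> bool:
        # Atomic swap of the metadata file; needs the table to be registered.
        if ident.catalog != self.name:
            return False
        key = (ident.namespace, ident.table_name)
        with self._lock:
            if self._tables.get(key) != expected:
                return False
            self._tables[key] = new
            return True

    def find_by_location(self, ref: MetadataLocation) -> Optional[Tuple[str, str]]:
        # Strategy 2 lookup: filter all registered tables by metadata location.
        with self._lock:
            for key, location in self._tables.items():
                if location == ref.location:
                    return key
        return None


class ToyView:
    """Toy view that references its storage table by metadata file location."""

    def __init__(self, storage: MetadataLocation) -> None:
        self._lock = threading.Lock()
        self.storage = storage

    def swap_storage(self, expected: str, new: str) -> bool:
        # Atomic change of the location held by the view; no catalog entry needed.
        with self._lock:
            if self.storage.location != expected:
                return False
            self.storage = MetadataLocation(new)
            return True


if __name__ == "__main__":
    catalog = ToyCatalog("prod")
    ident = CatalogIdentifier("prod", "db", "mv_storage")
    catalog.register("db", "mv_storage", "v1.json")
    print(catalog.swap(ident, "v1.json", "v2.json"))

    view = ToyView(MetadataLocation("v1.json"))
    print(view.swap_storage("v1.json", "v2.json"))
```

In this sketch the first strategy gets its atomicity from the catalog entry, so an unregistered storage table has nothing to swap against. The second strategy moves the swap to the location held by the view, at the cost of a linear scan in `find_by_location` when going through the catalog.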
