Just to put another alternative solution on the table: in S3FileIO, we implemented support for S3 access points and bucket aliases, which accidentally enabled a form of "relative path" if you are only switching the bucket name.
At read time, you can supply a catalog property "s3.access-points.<bucket-name>=<bucket-alias-name>" indicating that data in <bucket-name> should be read using <bucket-alias-name>, which comes from an access point. However, a bucket alias name is basically the same as a bucket name, so there is nothing preventing me from setting something like "s3.access-points.my-bucket-us-east-1=my-bucket-us-west-2". If I configure that, then any file path like "s3://my-bucket-us-east-1/some/path" will be converted to "s3://my-bucket-us-west-2/some/path" during read, achieving technically the same effect without the need to change the Iceberg spec.

Is it possible to extend this feature so that, instead of supporting relative paths, we support some form of absolute path replacement? That way the Iceberg metadata tree remains self-contained, without the need to reference external information like a prefix in a catalog. For example, a user can provide a map saying that any path with prefix "my-bucket-us-east-1/table1" should now be read through "my-bucket-us-west-2/table1-backup".

We also already have built-in integration for catalogs to set customized catalog properties per table. For example, this is achieved in REST through the config field in LoadTableResponse, which is used to vend S3 access credentials today. There were also thoughts about allowing similar features in Glue to provide these configs through Glue table parameters, as an implementation for non-REST catalogs. We just did not add that feature because Glue already supports S3 access credentials vending through LakeFormation.

Has this option been considered? I quickly scanned through the linked doc and it does not seem to be discussed there, but I might have missed it.

Best,
Jack Ye

On Tue, Feb 20, 2024 at 9:21 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

> Hi Ryan
>
> Ah ok, I thought that an Iceberg release is "based"/implement a spec
> (I assumed the opposite is wrong).
>
> Thanks for the explanation!
>
> Regards
> JB
>
> On Tue, Feb 20, 2024 at 6:04 PM Ryan Blue <b...@tabular.io> wrote:
> >
> > JB,
> >
> > The spec and the reference implementation are released separately so v3
> > and 2.0 are independent. There's no requirement that v3 is completed for
> > Iceberg Java 2.0 and the goal of a 2.0 is to have an opportunity to
> > deprecate and remove things so that we don't continue to carry forward and
> > maintain older interfaces.
> >
> > Ryan
> >
> > On Tue, Feb 20, 2024 at 1:58 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> >>
> >> Hi Manu
> >>
> >> Thanks for the reminder. It sounds like a good feature and worth
> >> discussing it :).
> >>
> >> It was my intention to define what we plan to include (or not) in Spec
> >> v3 / Iceberg 2.0.0 (I sent a message about that last week).
> >>
> >> Regards
> >> JB
> >>
> >> On Tue, Feb 20, 2024 at 10:36 AM Manu Zhang <owenzhang1...@gmail.com> wrote:
> >> >
> >> > Do we still want to move forward with this feature? It's on the
> >> > roadmap for Spec V3 but it hasn't appeared in our discussion for a while.
> >> >
> >> > Manu
> >> >
> >> > On Sat, Aug 26, 2023 at 2:43 AM Mohit Garg <mohitga...@gmail.com> wrote:
> >> >>
> >> >> hi
> >> >>
> >> >> Please review the approach captured here Iceberg Table Portability
> >> >> This is a continuation from the previous effort here - Support relative
> >> >> paths and multiple root locations.
> >> >>
> >> >> --
> >> >>
> >> >> kind regards
> >> >> Mohit
> >
> >
> >
> > --
> > Ryan Blue
> > Tabular
>
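
P.S. To make the prefix-replacement idea from the top of this message a bit more concrete, here is a rough sketch in Python. It is illustrative only and is not how S3FileIO is implemented: the function name rewrite_location and the shape of the prefix map are hypothetical, and I am assuming that longer prefixes should win and that matches should only happen on path-segment boundaries.

```python
from typing import Mapping

SCHEME = "s3://"


def rewrite_location(path: str, prefix_map: Mapping[str, str]) -> str:
    """Hypothetical helper: replace the longest matching "bucket/key-prefix".

    Paths that are not s3:// URIs, or that match no entry, are returned as-is.
    """
    if not path.startswith(SCHEME):
        return path
    rest = path[len(SCHEME):]
    # Try longer prefixes first so a table-level entry beats a bucket-level one.
    for old in sorted(prefix_map, key=len, reverse=True):
        old_norm = old.rstrip("/")
        new_norm = prefix_map[old].rstrip("/")
        # Match only on a segment boundary, so "table1" does not match "table10".
        if rest == old_norm or rest.startswith(old_norm + "/"):
            return SCHEME + new_norm + rest[len(old_norm):]
    return path


# Example mapping, using the bucket and table names from this message.
mapping = {
    "my-bucket-us-east-1": "my-bucket-us-west-2",
    "my-bucket-us-east-1/table1": "my-bucket-us-west-2/table1-backup",
}
print(rewrite_location("s3://my-bucket-us-east-1/table1/data/f.parquet", mapping))
print(rewrite_location("s3://my-bucket-us-east-1/some/path", mapping))
```

The first entry in the map behaves like the existing bucket-level switch, while the second is the table-level replacement I am proposing. The metadata files keep their original absolute paths either way; only the read-time lookup changes.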