sreejasahithi commented on PR #10053:
URL: https://github.com/apache/ozone/pull/10053#issuecomment-4215779613
>
> If I understand correctly, RewriteTablePath is needed because when we use
> the ofs:// protocol as the warehouse URL, the manifest files store data file
> paths with that ofs:// prefix. An Iceberg reader configured with an S3
> endpoint then can’t resolve those paths.

Yes, that’s correct.
When an Iceberg table is created in an Ozone volume/bucket using an ofs://
warehouse path, all file references stored across the table’s metadata
hierarchy are written as absolute ofs:// paths. This includes:
- Table metadata files (table location, manifest-list location, previous
metadata file locations)
- Manifest list files (pointing to manifest files)
- Manifest files (pointing to data files)
- Position delete files (referencing affected data files)
Because these paths are fully qualified with the ofs:// scheme, any engine
or catalog (such as Polaris) that only understands s3:// or s3a:// cannot
resolve them. As a result, it cannot access the table data.
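
The failure mode described above can be sketched as a scheme check. This is only an illustration, not how any particular engine or catalog is implemented: `SUPPORTED_SCHEMES`, `can_resolve`, and the example paths (including the `om-service` host and `vol1` volume) are hypothetical names made up for this sketch.

```python
from urllib.parse import urlparse

# Assumption: a reader that only has an S3-style file IO configured.
SUPPORTED_SCHEMES = {"s3", "s3a"}


def can_resolve(path: str) -> bool:
    """Return True if the path's URI scheme is one the reader understands."""
    return urlparse(path).scheme in SUPPORTED_SCHEMES


# Hypothetical data file reference as it would appear in a manifest
# written with an ofs:// warehouse.
ofs_path = "ofs://om-service/vol1/warehouse/db/tbl/data/file-0.parquet"
s3_path = "s3://warehouse/db/tbl/data/file-0.parquet"

ofs_ok = can_resolve(ofs_path)  # scheme is "ofs", not supported
s3_ok = can_resolve(s3_path)    # scheme is "s3", supported
```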
To make the table accessible via S3-compatible systems without copying data,
we need a mechanism (like RewriteTablePath) to rewrite these embedded paths
from ofs:// to s3://.
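
Conceptually, the rewrite is a prefix substitution applied to every path stored in the metadata hierarchy. The snippet below is a minimal sketch of that idea only; it does not read or write real Iceberg metadata, and `rewrite_prefix`, the source and target prefixes, and the file names are all hypothetical values chosen for illustration.

```python
def rewrite_prefix(path: str, source_prefix: str, target_prefix: str) -> str:
    """Swap source_prefix for target_prefix; leave non-matching paths unchanged."""
    if path.startswith(source_prefix):
        return target_prefix + path[len(source_prefix):]
    return path


# Assumed prefixes: the same bucket seen through ofs:// and through s3://.
SOURCE_PREFIX = "ofs://om-service/vol1/warehouse/"
TARGET_PREFIX = "s3://warehouse/"

# Hypothetical paths as they might be recorded in table metadata and manifests.
recorded_paths = [
    "ofs://om-service/vol1/warehouse/db/tbl/metadata/snap-1.avro",
    "ofs://om-service/vol1/warehouse/db/tbl/data/file-0.parquet",
]

rewritten_paths = [
    rewrite_prefix(p, SOURCE_PREFIX, TARGET_PREFIX) for p in recorded_paths
]
```

Only the recorded references change in this sketch; the data files themselves stay where they are, which is why no copy is needed.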

> Could you add an example to the Jira epic description, maybe using Apache
> Polaris, to walk through an end-to-end scenario? For example: create a catalog
> with an ofs:// warehouse, insert some data, inspect the manifest and data
> file locations, show how an Iceberg S3 reader fails to read the table, and then
> demonstrate how the table path rewrite fixes it.

Sure, I will update the epic description for better clarity.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]