sreejasahithi commented on PR #10053:
URL: https://github.com/apache/ozone/pull/10053#issuecomment-4215779613

   > 
   > If I understand correctly, RewriteTablePath is needed because when we use 
the ofs:// protocol as the warehouse URL, the manifest files store data file 
paths with that ofs:// prefix. An Iceberg reader configured with an S3 
endpoint then can’t resolve those paths.
   
   Yes, that’s correct.
   When an Iceberg table is created in an Ozone volume/bucket using an ofs:// 
warehouse path, all file references stored across the table’s metadata 
hierarchy are written as absolute ofs:// paths. This includes:
   
     - Table metadata files (table location, manifest-list location, previous 
metadata file locations)
     - Manifest list files (pointing to manifest files)
     - Manifest files (pointing to data files)
     - Position delete files (referencing affected data files)
   
   Because these paths are fully qualified with the ofs:// scheme, any engine 
or catalog (such as Polaris) that only understands s3:// or s3a:// cannot 
resolve them. As a result, it cannot access the table data.
   To make the table accessible via S3-compatible systems without copying data, 
we need a mechanism (like RewriteTablePath) to rewrite these embedded paths 
from ofs:// to s3://.
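   
   As a rough illustration of the idea only (not the actual RewriteTablePath 
implementation), the rewrite amounts to a prefix substitution applied to every 
path stored in the metadata files, manifest lists, manifests, and position 
delete files. The sketch below assumes a hypothetical Ozone Manager host 
`om-host`, volume `vol1`, bucket `bucket1`, and file name, and assumes the 
bucket is reachable through the S3 endpoint as `s3://bucket1`; none of these 
names come from the PR.

```python
def rewrite_prefix(path: str, source_prefix: str, target_prefix: str) -> str:
    """Swap source_prefix for target_prefix; return other paths unchanged."""
    if path.startswith(source_prefix):
        return target_prefix + path[len(source_prefix):]
    return path


# Hypothetical prefixes, chosen only for this example.
SOURCE_PREFIX = "ofs://om-host/vol1/bucket1/warehouse"
TARGET_PREFIX = "s3://bucket1/warehouse"

# A data file path as it might be recorded in a manifest (hypothetical name).
data_file = SOURCE_PREFIX + "/db/tbl/data/part-0.parquet"

print(rewrite_prefix(data_file, SOURCE_PREFIX, TARGET_PREFIX))
```

   Only the path strings change in this sketch; the data files themselves stay 
where they are, which is why no copy is needed.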
   
   > Could you add an example to the Jira epic description, maybe using Apache 
Polaris, to walk through an end-to-end scenario? For example: create a catalog 
with an ofs:// warehouse, insert some data, inspect the manifest and data 
file locations, show how an Iceberg S3 reader fails to read the table, and then 
demonstrate how the table path rewrite fixes it.
   
   Sure, I will update the epic description for better clarity.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


