[
https://issues.apache.org/jira/browse/HDDS-14937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18083273#comment-18083273
]
Chu Cheng Li commented on HDDS-14937:
-------------------------------------
Found some recent improvements related to table paths in Iceberg, FYI:
* [https://github.com/apache/iceberg/pull/15630]
* [https://github.com/ClickHouse/ClickHouse/issues/102321]
* [https://github.com/apache/iceberg/issues/13141]
* [https://docs.google.com/document/d/1a6tXvbWVbvOxiRexCiaIsIA6NOOqcbxPKjeoJGqgYtg/edit?tab=t.p92mo7cvg08q]
> Ozone native implementation of Iceberg RewriteTablePath
> -------------------------------------------------------
>
> Key: HDDS-14937
> URL: https://issues.apache.org/jira/browse/HDDS-14937
> Project: Apache Ozone
> Issue Type: Epic
> Reporter: Sreeja
> Assignee: Sreeja
> Priority: Major
>
> Iceberg tables stored in Apache Ozone and created via ofs traditionally use
> absolute paths with the "ofs://" protocol prefix. These absolute paths
> prevent the table from being accessed via S3, even when a bucket link
> exists.
> This Epic introduces a native Ozone implementation of Iceberg's
> [RewriteTablePath
> |https://github.com/apache/iceberg/blob/1.10.x/api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java]
> action to enable seamless protocol migration with zero data file copy.
> Iceberg also provides core utility methods in
> [RewriteTablePathUtil|https://github.com/apache/iceberg/blob/1.10.x/core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java]
> that can be used by Ozone for the same purpose.
> This approach is particularly useful when integrating with REST-based
> catalogs, such as Apache Polaris, which expect S3-compatible locations.
> We will implement Iceberg's action and use RewriteTablePathUtil to
> perform a "metadata-only" migration:
> * *Traverse* the table’s metadata history.
> * *Rewrite* all internal absolute paths from a sourcePrefix (e.g., ofs://)
> to a targetPrefix (e.g., s3a:// or s3://).
> * *Stage* the updated metadata files in a temporary location.
> * *Perform Zero Data Copy:* The actual data files remain untouched; only the
> "pointers" in the metadata (metadata version file, manifest list, manifest
> file, position delete file) are updated.
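The rewrite step above comes down to swapping one path prefix for another wherever an absolute path is stored. A minimal sketch of that idea follows, assuming plain string paths. The class `PathPrefixRewriter` and the file names are hypothetical and used only for illustration; this is not the code in Iceberg's RewriteTablePathUtil or in Ozone.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Ozone or Iceberg.
final class PathPrefixRewriter {

    // Swaps sourcePrefix for targetPrefix when the path starts with it.
    // Paths outside the source prefix are returned unchanged.
    static String rewrite(String path, String sourcePrefix, String targetPrefix) {
        if (!path.startsWith(sourcePrefix)) {
            return path;
        }
        return targetPrefix + path.substring(sourcePrefix.length());
    }

    // Applies the same prefix swap to every path in the list.
    static List<String> rewriteAll(List<String> paths, String sourcePrefix, String targetPrefix) {
        return paths.stream()
                .map(p -> rewrite(p, sourcePrefix, targetPrefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String source = "ofs://om:9862/vol1/buck1/my_db/test_table";
        String target = "s3://buck1link/my_db/test_table";
        // Made-up file names, only to show the effect of the swap.
        List<String> paths = List.of(
                source + "/metadata/v1.metadata.json",
                source + "/data/part-0.parquet");
        rewriteAll(paths, source, target).forEach(System.out::println);
    }
}
```

Only the stored string changes; the file it points to is never read or copied. A plain `startsWith` check would also match a sibling table whose name merely begins with the same characters, so a real implementation would likely need a stricter boundary check.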
> For example:
> Suppose an Iceberg table lives in an Ozone volume/bucket under an ofs://
> path, say "{*}ofs://om:9862/vol1/buck1/my_db/test_table{*}". All file
> references stored across the table’s metadata hierarchy are then recorded as
> absolute ofs:// paths. This includes:
> * Table metadata files (table location, manifest-list location, previous
> metadata file locations)
> * Manifest list files (pointing to manifest files)
> * Manifest files (pointing to data files)
> * Position delete files (referencing affected data files)
> Sample metadata file (before rewrite):
> {code:java}
> { "format-version": 2,
> "table-uuid": "9b791462-d257-45e5-92f8-435302d2c335",
> "location": "ofs://ozone-om:9862/vol1/buck1/my_db/test_table",
> .
> .
> .
> },
> "snapshots": [{...},
> "manifest-list":
> "ofs://ozone-om:9862/vol1/buck1/my_db/test_table/metadata/snap-1753351619419365870-1-5ac51133-8cbf-4327-bbf8-0559b463e1f9.avro",
> "schema-id": 0 },
> {...},
> "manifest-list":
> "ofs://ozone-om:9862/vol1/buck1/my_db/test_table/metadata/snap-176890185746044789-1-5061c816-61b1-43e4-84e8-0ad689c2ea86.avro",
> "schema-id": 0 } ],
> .
> .
> .
> "metadata-log": [
> {
> "timestamp-ms": 1774448474465,
> "metadata-file":
> "ofs://ozone-om:9862/vol1/buck1/my_db/test_table/metadata/00000-d480d223-a92f-4255-be8c-fef1714bb423.metadata.json"
>
> },
> {
> "timestamp-ms": 1774448493051,
> "metadata-file":
> "ofs://ozone-om:9862/vol1/buck1/my_db/test_table/metadata/00001-3d20e8d6-e151-4442-a0d7-55533f27cf09.metadata.json"
> } ]} {code}
> Now, if we try to access this table via a REST-based catalog like Apache
> Polaris, the request fails because Polaris expects s3:// or s3a:// locations:
> {code:java}
> org.apache.iceberg.exceptions.ForbiddenException: Forbidden: Invalid
> locations '[ofs://om:9862/vol1/buck1/my_db/test_table]' for identifier
> 'my_db.test_table': ofs://om:9862/vol1/buck1/my_db/test_table is not in the
> list of allowed locations{code}
> We cannot even register the table with the Polaris catalog, because it sees
> ofs:// paths in the files. Likewise, any engine that tries to access the
> table via S3 fails because it cannot resolve ofs:// paths.
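The error message above suggests that the catalog compares the table location against a list of allowed locations. The sketch below is a simplified model of such a check, assuming prefix matching against an allow-list. It is not Polaris code, and the class `AllowedLocationCheck` and the allow-list value are hypothetical.

```java
import java.util.List;

// Simplified, hypothetical model of an allowed-location check.
final class AllowedLocationCheck {

    // Returns true when the location starts with any allowed prefix.
    static boolean isAllowed(String location, List<String> allowedPrefixes) {
        for (String prefix : allowedPrefixes) {
            if (location.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed allow-list for illustration only.
        List<String> allowed = List.of("s3://buck1link/");
        System.out.println(isAllowed("ofs://om:9862/vol1/buck1/my_db/test_table", allowed));
        System.out.println(isAllowed("s3://buck1link/my_db/test_table", allowed));
    }
}
```

Under this model, an ofs:// location can never match an S3-only allow-list, which is consistent with the failure shown above.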
> h3. Rewriting paths to S3
> To make the table accessible via S3-compatible systems {*}without copying
> data files{*}, we use Ozone's native implementation of Iceberg's
> {{RewriteTablePath}} action.
> Steps:
> # *Create a bucket link* in the Ozone {{/s3v}} volume pointing to the bucket
> where the table exists.
> # Provide the *sourcePrefix*
> ({{{}ofs://om:9862/vol1/buck1/my_db/test_table{}}}) and *targetPrefix*
> ({{{}s3://buck1link/my_db/test_table{}}}).
> # Optionally, provide *start/end metadata versions* or a *staging location*
> for the rewritten metadata files.
> # Run the rewrite action — this updates all embedded paths in metadata
> version files, manifest lists, manifests, and position delete files
> {*}without touching the actual data{*}.
> # Copy the rewritten metadata from the staging location back to the table’s
> location (not handled automatically by Ozone's implementation).
> Sample metadata file (after rewrite):
> {code:java}
> { "format-version": 2,
> "table-uuid": "9b791462-d257-45e5-92f8-435302d2c335",
> "location": "s3://buck1link/my_db/test_table",
> .
> .
> .
> },
> "snapshots": [{...},
> "manifest-list":
> "s3://buck1link/my_db/test_table/metadata/snap-1753351619419365870-1-5ac51133-8cbf-4327-bbf8-0559b463e1f9.avro",
> "schema-id": 0 },
> {...},
> "manifest-list":
> "s3://buck1link/my_db/test_table/metadata/snap-176890185746044789-1-5061c816-61b1-43e4-84e8-0ad689c2ea86.avro",
> "schema-id": 0 } ],
> . . .
> "metadata-log": [
> {
> "timestamp-ms": 1774448474465,
> "metadata-file":
> "s3://buck1link/my_db/test_table/metadata/00000-d480d223-a92f-4255-be8c-fef1714bb423.metadata.json"
>
> },
> {
> "timestamp-ms": 1774448493051,
> "metadata-file":
> "s3://buck1link/my_db/test_table/metadata/00001-3d20e8d6-e151-4442-a0d7-55533f27cf09.metadata.json"
> } ]} {code}
> After this rewrite:
> * The table is {*}accessible via S3{*}.
> * It can now be *registered with Polaris* without any path-related errors.
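One way to sanity-check a rewritten metadata JSON file is to scan it for string values that still start with the old prefix. The sketch below does this with a regular expression over the raw JSON text. The class `LeftoverPrefixScan` is hypothetical, and the check covers only JSON metadata files, not the Avro manifest lists and manifests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical verification helper; not part of Ozone or Iceberg.
final class LeftoverPrefixScan {

    // Collects every double-quoted string in the text that begins with the prefix.
    static List<String> findPathsWithPrefix(String metadataJson, String prefix) {
        Pattern pattern = Pattern.compile("\"(" + Pattern.quote(prefix) + "[^\"]*)\"");
        Matcher matcher = pattern.matcher(metadataJson);
        List<String> hits = new ArrayList<>();
        while (matcher.find()) {
            hits.add(matcher.group(1));
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up fragment with one path deliberately left unrewritten.
        String json = "{ \"location\": \"s3://buck1link/my_db/test_table\", "
                + "\"manifest-list\": \"ofs://om:9862/vol1/buck1/my_db/test_table/metadata/snap-1.avro\" }";
        findPathsWithPrefix(json, "ofs://").forEach(System.out::println);
    }
}
```

An empty result for the source prefix is a necessary condition for a complete rewrite of that file, though not proof that the manifests were rewritten too.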
> NOTE: The hadoop-ozone/iceberg module should be enabled only when building
> with JDK ≥ 11