rambleraptor commented on code in PR #15630:
URL: https://github.com/apache/iceberg/pull/15630#discussion_r3003118543

##########
format/spec.md:
##########
@@ -134,6 +178,8 @@ Tables do not require rename, except for tables that use atomic rename to implem
 * **Manifest** -- A file that lists data or delete files; a subset of a snapshot.
 * **Data file** -- A file that contains rows of a table.
 * **Delete file** -- A file that encodes rows of a table that are deleted by position or data values.
+* **Absolute path** -- A path string that includes a URI scheme and can be used directly.

Review Comment:
   Including the RFC link here will help people understand URI schemes. They might not have read the section on Paths in Metadata to find that link.

##########
format/spec.md:
##########
@@ -1767,6 +1838,24 @@ Note that these requirements apply when writing data to a v2 table. Tables that
 This section covers topics not required by the specification but recommendations for systems implementing the Iceberg specification to help maintain a uniform experience.
+### Path Construction
+
+Path construction is the process by which new file locations are created for output files referenced by metadata. While the specific construction logic is not strictly required by the spec, the following guidance is provided for reference implementations to encourage consistency.
+
+The table properties `write.metadata.path` and `write.data.path` control where metadata and data files are written relative to the table location. When not specified, these default to `metadata` and `data` respectively.

Review Comment:
   ```suggestion
   The table properties `write.metadata.path` and `write.data.path` control where metadata and data files are written relative to the table location. When not specified, these default to the values `metadata` and `data` respectively.
   ```
   For casual readers, we don't want them looking for a `metadata` table property when this is just the word "metadata".

##########
format/spec.md:
##########
@@ -134,6 +178,8 @@ Tables do not require rename, except for tables that use atomic rename to implem
 * **Manifest** -- A file that lists data or delete files; a subset of a snapshot.
 * **Data file** -- A file that contains rows of a table.
 * **Delete file** -- A file that encodes rows of a table that are deleted by position or data values.
+* **Absolute path** -- A path string that includes a URI scheme and can be used directly.
+* **Relative path** -- A path string without a URI scheme that must be resolved against the table location.

Review Comment:
   nit: maybe even put the link here too? People usually only look at the part of the glossary that they explicitly care about.
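
To illustrate the distinction the quoted glossary entries draw, here is a minimal Python sketch that is not taken from the PR. It assumes a path counts as absolute when it begins with an RFC 3986 style scheme, and that a relative path resolves by joining it to the table location with a single `/`. The helper names (`is_absolute`, `resolve`, `output_dir`) are hypothetical, and the actual resolution rules are whatever the spec text in the PR defines.

```python
import re

# RFC 3986 scheme syntax: a letter, then letters, digits, "+", "-" or ".", then ":".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def is_absolute(path: str) -> bool:
    """Return True when the path string starts with a URI scheme."""
    return _SCHEME.match(path) is not None


def resolve(table_location: str, path: str) -> str:
    """Hypothetical helper: resolve a path against the table location."""
    if is_absolute(path):
        return path  # absolute paths are used directly
    # Assumption: plain "/" joining; the PR may specify different rules.
    return table_location.rstrip("/") + "/" + path.lstrip("/")


def output_dir(table_location: str, properties: dict, key: str, default: str) -> str:
    """Hypothetical helper: pick the property value (or its default) and resolve it."""
    return resolve(table_location, properties.get(key, default))


if __name__ == "__main__":
    location = "s3://bucket/db/tbl"
    # With no properties set, the defaults `metadata` and `data` apply.
    print(output_dir(location, {}, "write.metadata.path", "metadata"))
    print(output_dir(location, {}, "write.data.path", "data"))
```

Under these assumptions, an unset `write.data.path` falls back to the value `data` and lands under the table location, which is the behaviour the suggested wording is trying to make clear.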
