danielcweeks commented on code in PR #15630: URL: https://github.com/apache/iceberg/pull/15630#discussion_r3174493626
##########
format/spec.md:
##########
@@ -168,6 +188,35 @@ All columns must be written to data files even if they introduce redundancy with
 Writers are not allowed to commit files with a partition spec that contains a field with an unknown transform.
 
+### Paths in Metadata
+
+Path strings stored in Iceberg metadata files are classified as one of two types:
+
+* **Absolute path** -- A path string that includes a [URI scheme](https://datatracker.ietf.org/doc/html/rfc3986#section-3.1) (e.g., `s3:`, `gs:`, `hdfs:`, `file:`). Absolute paths are used as-is without modification.

Review Comment:
   I agree that we should just refer to the RFC. Implementations can make reasonable optimizations. The intent here is to be clear about what the expected content is, not to force full, rigorous validation everywhere a path is seen. In practice, we haven't really had any problems, but if there is a dispute about whether a representation is valid, we can always refer back to the RFC for clarification.
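
   For illustration only, a minimal sketch of the kind of lightweight check an implementation might use to decide whether a path string carries a URI scheme. It follows the scheme grammar in RFC 3986, section 3.1 (a letter followed by letters, digits, `+`, `-`, or `.`, then `:`). The function name `has_uri_scheme` is hypothetical and is not part of Iceberg or of the proposed spec text.

```python
import re

# RFC 3986, section 3.1:
#   scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
# A path is treated as carrying a scheme if it starts with that
# production followed by ":".
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def has_uri_scheme(path: str) -> bool:
    """Return True if `path` begins with an RFC 3986 scheme and a colon.

    Hypothetical helper for illustration; it checks only the scheme
    prefix and does not validate the rest of the URI.
    """
    return _SCHEME_RE.match(path) is not None


if __name__ == "__main__":
    for candidate in (
        "s3://bucket/db/table/metadata/v1.metadata.json",
        "file:/tmp/warehouse/data.parquet",
        "metadata/v1.metadata.json",
        "/warehouse/db/table",
    ):
        print(candidate, has_uri_scheme(candidate))
```

   One caveat of a prefix-only check like this: a Windows drive path such as `C:\data` also matches the scheme grammar, which is the sort of ambiguity where referring back to the RFC settles what counts as valid.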
