This is an automated email from the ASF dual-hosted git repository.
zeroshade pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/iceberg-go.git
The following commit(s) were added to refs/heads/main by this push:
new d188a8e fix(manifest): Interpret manifest files written without
`content` metadata as `data` files (#545)
d188a8e is described below
commit d188a8ed935100f231840a755344a726638c9aec
Author: Joel Lubinitsky <[email protected]>
AuthorDate: Fri Aug 22 11:10:21 2025 -0400
fix(manifest): Interpret manifest files written without `content` metadata
as `data` files (#545)
fixes: #544
V1 writers may omit `content` entirely when writing manifest files.
Because V1 predates `delete` files, such manifests should be interpreted
as `data` files
([https://iceberg.apache.org/spec/#manifests](https://iceberg.apache.org/spec/#manifests)).
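
For illustration, the fallback rule described above can be sketched as a standalone function. This is a minimal sketch, not the code in `manifest.go`: `resolveManifestContent` is a hypothetical helper, it returns plain strings rather than the library's `ManifestContent` type, and the handling of unrecognized values in the `default` branch is an assumption.

```go
package main

import "fmt"

// resolveManifestContent is a hypothetical helper that mirrors the rule in
// this commit: a missing 'content' entry is accepted only for format
// version 1, where it is treated as "data".
func resolveManifestContent(metadata map[string][]byte, formatVersion int) (string, error) {
	// A missing key yields a nil slice, which converts to the empty string.
	switch contentStr := string(metadata["content"]); contentStr {
	case "":
		if formatVersion != 1 {
			return "", fmt.Errorf(
				"manifest 'content' metadata is missing and 'format-version' is %d, but the field is required for 'format-version' 2 and beyond",
				formatVersion)
		}
		// V1 predates delete files, so an absent field means "data".
		fallthrough
	case "data":
		return "data", nil
	case "deletes":
		return "deletes", nil
	default:
		// Assumption: unknown values are rejected.
		return "", fmt.Errorf("unknown manifest content type: %q", contentStr)
	}
}

func main() {
	// An empty metadata map stands in for a manifest written without 'content'.
	for _, version := range []int{1, 2} {
		content, err := resolveManifestContent(map[string][]byte{}, version)
		fmt.Printf("format-version=%d content=%q err=%v\n", version, content, err)
	}
}
```

Using `fallthrough` keeps the V1 default and the explicit `data` case on a single code path, so the two cannot drift apart.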
_Side Note: It could be helpful to have an `iceberg-testing` repo similar
to `parquet-testing` or `arrow-testing` that contains a collection of files
serialized to different versions of the spec. I don't know if this
exists already, but it was a bit tricky to produce a manifest file to use
for a unit test, since the included V1 writer writes `content=data` for
compatibility. I started creating a new writer that omits that one field,
but it seemed like more to maintain than it was worth for that one
test._
---------
Co-authored-by: Joel Lubinitsky <[email protected]>
---
manifest.go | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/manifest.go b/manifest.go
index 0b2932a..4669726 100644
--- a/manifest.go
+++ b/manifest.go
@@ -601,6 +601,13 @@ func NewManifestReader(file ManifestFile, in io.Reader) (*ManifestReader, error)
var content ManifestContent
switch contentStr := string(metadata["content"]); contentStr {
+ case "":
+ if formatVersion != 1 {
+ return nil, fmt.Errorf("manifest file's 'content' metadata is missing and 'format-version' is %d, but the field is required for 'format-version' 2 and beyond",
+ formatVersion)
+ }
+ // V1 manifests do not contain the 'content' field, but should be interpretted as 'data' files
+ fallthrough
case "data":
content = ManifestContentData
case "deletes":