The GitHub Actions job "Go Integration" on 
iceberg-go.git/missing-is-not-invalid has succeeded.
Run started by GitHub user shtripat (triggered by shtripat).

Head commit for run:
5eaac389f0671938ebfb02086ddd2d777ba25a0e / Shubhendu Ram Tripathi 
<[email protected]>
fix: field id:0 incorrectly rejected as missing by NestedField.UnmarshalJSON

Problem

Commit 95df02c introduced validation in NestedField.UnmarshalJSON to catch 
payloads
that omit the required "id" key (e.g. using "field_id" by mistake). The check 
used was:
```
if n.ID == 0 {
    return fmt.Errorf("%w: field is missing required 'id' key in JSON", 
ErrInvalidSchema)
}
```
This conflates two distinct situations:

1. The "id" key is absent from JSON → n.ID stays at its zero value 0 (invalid, 
should error).
2. The "id" key is present with value 0 → n.ID is also 0 (valid, should not 
error).

Go's encoding/json leaves fields at their zero value when a key is absent, so 
both cases are
indistinguishable using n.ID == 0. As a result, any schema payload with a field 
using
id:0 — which Spark sends for freshly created tables — is incorrectly rejected 
with ErrInvalidSchema.

Example failing payload:
```
{
  "name": "my_table1",
  "schema": {
    "type": "struct",
    "schema-id": 0,
    "fields": [
      {"id": 0, "name": "id",      "required": false, "type": "long"},
      {"id": 1, "name": "strings", "required": false, "type": "string"},
      {"id": 2, "name": "floats",  "required": false, "type": "double"}
    ]
  },
  "stage-create": true,
  ...
}
```

Fix

Use *int for the id field in the unmarshaling aux struct. A nil pointer 
unambiguously means
the key was absent from JSON; a non-nil pointer (including &0) means the key 
was present.
The outer ID *int field shadows the embedded Alias.ID int during JSON decoding, 
and n.ID
is then set explicitly from the dereferenced pointer.

```
aux := struct {
    ID   *int      `json:"id"`   // nil = key absent, &0 = key present with 
value 0
    Type typeIFace `json:"type"`
    *Alias
}{Alias: (*Alias)(n)}
```

Changes

types.go
- NestedField.UnmarshalJSON: introduce ID *int in the aux struct; check aux.ID 
== nil
instead of n.ID == 0; set n.ID = *aux.ID after the nil guard.

types_test.go
- TestNestedFieldUnmarshalZeroIDIsValid: new test asserting that {"id":0,...} 
unmarshals
without error and produces ID == 0.

Behaviour after this change

|          Scenario          |            Before             |      After       
|
|----------------------------|-------------------------------|------------------|
| "id" key absent from JSON  | ErrInvalidSchema              | ErrInvalidSchema 
|
| "field_id" instead of "id" | ErrInvalidSchema              | ErrInvalidSchema 
|
| "id":0 present in JSON     | ErrInvalidSchema (regression) | OK, ID == 0      
|
| "id":N (N > 0) present     | OK                            | OK               
|

Signed-off-by: Shubhendu Ram Tripathi <[email protected]>

Report URL: https://github.com/apache/iceberg-go/actions/runs/22483981773

With regards,
GitHub Actions via GitBox

Reply via email to