Adding Matt Topol to the discussion. Thanks, Krutika
On Fri, Feb 6, 2026 at 3:55 PM Krutika Dhananjay <[email protected]> wrote: > Hi everyone, > > I'd like to bring up a race condition between snapshot expiration and > concurrent client commits that add refs to snapshots. > > *The Problem* > Consider a table with snap-1 (unreferenced) and snap-2 (main). Two > operations happen concurrently: > > 1. A maintenance job identifies snap-1 as eligible for expiration. > 2. A client commits SetSnapshotRef("feature-branch", snap-1). > > If the client commits first, the maintenance job loads the fresh metadata > (which now contains a ref for feature-branch -> snap-1), builds its > RemoveSnapshots([snap-1]) update, and commits. The CAS succeeds because the > maintenance job has the latest metadata pointer. But the resulting metadata > is corrupt: feature-branch points to a deleted snapshot. > > CAS-based optimistic concurrency protects against concurrent pointer > updates, but it does not validate the semantic correctness of the operation > against the current state. The maintenance job's commit is structurally > valid (latest pointer) but semantically invalid (removes a referenced > snapshot). > > *Proposed Fix in iceberg-go* > We addressed <https://github.com/apache/iceberg-go/pull/716> this in > iceberg-go by modifying MetadataBuilder.RemoveSnapshots() to skip removing > any snapshot that is currently referenced by a branch or tag. This works > correctly -- the validation happens against the same metadata state that > the CAS will verify hasn't changed. > However, this changes the existing behavior of RemoveSnapshots, which > currently removes the snapshot unconditionally and cleans up any refs that > pointed to it. This would be a deviation from the behaviour in iceberg-java > and iceberg-rs. > > *Proposed Alternative: AssertSnapshotNotReferenced* > A less invasive approach that preserves the existing RemoveSnapshots > behavior would be to introduce a new assertion update -- > AssertSnapshotNotReferenced. This would be analogous to existing > assertions, but for validating that a snapshot has no refs pointing to it. > > During commit, a catalog-side maintenance job would include this assertion > alongside its RemoveSnapshots update: > > updates: [ > AssertSnapshotNotReferenced(snap-1), > RemoveSnapshots([snap-1]) > ] > > The assertion would be validated against the freshly loaded metadata > during commit. If a concurrent client had linked a ref to snap-1 in the > meantime, the assertion fails, the CAS-based commit is never attempted, and > the maintenance job can retry with updated knowledge of the table state. > > This approach: > - Does not change existing RemoveSnapshots semantics > - Can be implemented uniformly across Java, Rust, and Go > > Would love to hear thoughts on the preferred way to handle this. > > Thanks, > Krutika >
