Andres Freund <and...@anarazel.de> writes: > On 2022-04-14 12:16:45 -0400, Robert Haas wrote: >> I got curious and looked at the underlying problem here and I am >> wondering whether HaveRegisteredOrActiveSnapshot() is just buggy. It >> seems to me that the code is always going to return true if there are >> any active snapshots, and the rest of the function is intended to test >> whether there is a registered snapshot other than the catalog >> snapshot. But I don't think that's what this code does: >> >> if (pairingheap_is_empty(&RegisteredSnapshots) || >> !pairingheap_is_singular(&RegisteredSnapshots)) >> return false; >> >> return CatalogSnapshot == NULL;
> Certainly looks off... Yeah, this is broken. Whilst waiting around for a build on wrasse's host, I reproduced the problem shown in this thread, and here's what I see at the point of the exception: (gdb) p RegisteredSnapshots $5 = {ph_compare = 0x9a6000 <xmin_cmp>, ph_arg = 0x0, ph_root = 0xec3168 <CatalogSnapshotData+72>} (gdb) p *RegisteredSnapshots.ph_root $6 = {first_child = 0x2d85d70, next_sibling = 0x0, prev_or_parent = 0x0} (gdb) p CatalogSnapshotData $7 = {snapshot_type = SNAPSHOT_MVCC, xmin = 52155, xmax = 52155, xip = 0x2d855b0, xcnt = 0, subxip = 0x2de9130, subxcnt = 0, suboverflowed = false, takenDuringRecovery = false, copied = false, curcid = 0, speculativeToken = 0, vistest = 0x0, active_count = 0, regd_count = 0, ph_node = {first_child = 0x2d85d70, next_sibling = 0x0, prev_or_parent = 0x0}, whenTaken = 0, lsn = 0, snapXactCompletionCount = 1} (gdb) p CatalogSnapshot $8 = (Snapshot) 0xec3120 <CatalogSnapshotData> (gdb) p *(Snapshot) (0x2d85d70-72) $9 = {snapshot_type = SNAPSHOT_MVCC, xmin = 52155, xmax = 52155, xip = 0x0, xcnt = 0, subxip = 0x0, subxcnt = 0, suboverflowed = false, takenDuringRecovery = false, copied = true, curcid = 0, speculativeToken = 0, vistest = 0x0, active_count = 0, regd_count = 2, ph_node = {first_child = 0x0, next_sibling = 0x0, prev_or_parent = 0xec3168 <CatalogSnapshotData+72>}, whenTaken = 0, lsn = 0, snapXactCompletionCount = 0} (gdb) p ActiveSnapshot $10 = (ActiveSnapshotElt *) 0x0 So in fact there IS another registered snapshot, and HaveRegisteredOrActiveSnapshot is just lying. I think the correct test is more nearly what we have in InvalidateCatalogSnapshotConditionally: if (CatalogSnapshot && ActiveSnapshot == NULL && pairingheap_is_singular(&RegisteredSnapshots)) // then the CatalogSnapshot is the only one. Ergo, this actually is a bug in 277692220. regards, tom lane