Andres Freund <and...@anarazel.de> writes:
> On 2022-04-14 12:16:45 -0400, Robert Haas wrote:
>> I got curious and looked at the underlying problem here and I am
>> wondering whether HaveRegisteredOrActiveSnapshot() is just buggy. It
>> seems to me that the code is always going to return true if there are
>> any active snapshots, and the rest of the function is intended to test
>> whether there is a registered snapshot other than the catalog
>> snapshot. But I don't think that's what this code does:
>> 
>> if (pairingheap_is_empty(&RegisteredSnapshots) ||
>>     !pairingheap_is_singular(&RegisteredSnapshots))
>>      return false;
>> 
>> return CatalogSnapshot == NULL;

> Certainly looks off...

Yeah, this is broken.  Whilst waiting around for a build on wrasse's
host, I reproduced the problem shown in this thread, and here's what
I see at the point of the exception:

(gdb) p RegisteredSnapshots
$5 = {ph_compare = 0x9a6000 <xmin_cmp>, ph_arg = 0x0, 
  ph_root = 0xec3168 <CatalogSnapshotData+72>}
(gdb) p *RegisteredSnapshots.ph_root
$6 = {first_child = 0x2d85d70, next_sibling = 0x0, prev_or_parent = 0x0}
(gdb) p CatalogSnapshotData
$7 = {snapshot_type = SNAPSHOT_MVCC, xmin = 52155, xmax = 52155, 
  xip = 0x2d855b0, xcnt = 0, subxip = 0x2de9130, subxcnt = 0, 
  suboverflowed = false, takenDuringRecovery = false, copied = false, 
  curcid = 0, speculativeToken = 0, vistest = 0x0, active_count = 0, 
  regd_count = 0, ph_node = {first_child = 0x2d85d70, next_sibling = 0x0, 
    prev_or_parent = 0x0}, whenTaken = 0, lsn = 0, snapXactCompletionCount = 1}
(gdb) p CatalogSnapshot    
$8 = (Snapshot) 0xec3120 <CatalogSnapshotData>
(gdb) p *(Snapshot) (0x2d85d70-72)
$9 = {snapshot_type = SNAPSHOT_MVCC, xmin = 52155, xmax = 52155, xip = 0x0, 
  xcnt = 0, subxip = 0x0, subxcnt = 0, suboverflowed = false, 
  takenDuringRecovery = false, copied = true, curcid = 0, 
  speculativeToken = 0, vistest = 0x0, active_count = 0, regd_count = 2, 
  ph_node = {first_child = 0x0, next_sibling = 0x0, 
    prev_or_parent = 0xec3168 <CatalogSnapshotData+72>}, whenTaken = 0, 
  lsn = 0, snapXactCompletionCount = 0}
(gdb) p ActiveSnapshot
$10 = (ActiveSnapshotElt *) 0x0

So in fact there IS another registered snapshot, and
HaveRegisteredOrActiveSnapshot is just lying.  I think the
correct test is more nearly what we have in
InvalidateCatalogSnapshotConditionally:

    if (CatalogSnapshot &&
        ActiveSnapshot == NULL &&
        pairingheap_is_singular(&RegisteredSnapshots))
            // then the CatalogSnapshot is the only one.

Ergo, this actually is a bug in 277692220.

                        regards, tom lane


Reply via email to