Hi,

While investigating other bugs I noticed that
ResolveRecoveryConflictWithLock() wasn't really working. Turns out
GetLockConflicts() violates it's API contract which says:

 * The result array is palloc'd and is terminated with an invalid VXID.

Problem 1:
We don't actually put the terminator there. It happens to more or less
accidentally work on a master because the array is palloc0()ed there and
while a 0 is valid backend id it happens to not be a valid local
transaction id.  In HS we don't actually allocate the array every time,
but it's instead statically allocated. Without zeroing.

I have no idea why this doesn't crash
ResolveRecoveryConflictWithInterrupt() which does:
        while (!lock_acquired)
        {
                if (++num_attempts < 3)
                        backends = GetLockConflicts(locktag, 
AccessExclusiveLock);
...
                ResolveRecoveryConflictWithVirtualXIDs(backends,
                                                                                
         PROCSIG_RECOVERY_CONFLICT_LOCK);
and ResolveRecoveryConflictWithVirtualXIDs does:
...
        while (VirtualTransactionIdIsValid(*waitlist))
        {
        /* kill kill kill */

                /* The virtual transaction is gone now, wait for the next one */
                waitlist++;
        }

I guess we just accidentally happen to come across appropriately
set memory at some point.

I'm really baffled that this hasn't caused significant problems so far.


Problem 2:
Since bcd8528f001 and 29eedd312274 the "the result array is palloc'd" is
wrong because we're now doing:

    static VirtualTransactionId *vxids;
        /*
         * Allocate memory to store results, and fill with InvalidVXID.  We only
         * need enough space for MaxBackends + a terminator, since prepared 
xacts
         * don't count. InHotStandby allocate once in TopMemoryContext.
         */
        if (InHotStandby)
        {
                if (vxids == NULL)
                        vxids = (VirtualTransactionId *)
                                MemoryContextAlloc(TopMemoryContext,
                                                   sizeof(VirtualTransactionId) 
* (MaxBackends + 1));
        }
        else
                vxids = (VirtualTransactionId *)
                        palloc0(sizeof(VirtualTransactionId) * (MaxBackends + 
1));

Obviously that violates the API contract. I'm inclined to rip the HS
special case out and add a pfree() to the single HS caller. The commit
message introducing it says:

    Use malloc() in GetLockConflicts() when called InHotStandby to avoid
    repeated palloc calls. Current code assumed this was already true, so
    this is a bug fix.
But I can't really believe this is relevant.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to