On 2017-04-24 13:29:11 +0100, Simon Riggs wrote:
> On 24 April 2017 at 00:25, Andres Freund <and...@anarazel.de> wrote:
> > if the subxid->xid mapping doesn't actually exist - as it's the case
> > with this bug afaics - we'll not get the correct toplevel
> > transaction.
> 
> The nature of the corruption is that in some cases
> * a subxid will point to nothing (even though in most cases it was
> already set correctly)
> * the parent will point to a subxid

Right. Those cases aren't that different from the point of trying to
find the parent of an subxid.


> > Which'll mean the following block:
> >                 /*
> >                  * We now have either a top-level xid higher than xmin or an
> >                  * indeterminate xid. We don't know whether it's top level 
> > or subxact
> >                  * but it doesn't matter. If it's present, the xid is 
> > visible.
> >                  */
> >                 for (j = 0; j < snapshot->subxcnt; j++)
> >                 {
> >                         if (TransactionIdEquals(xid, snapshot->subxip[j]))
> >                                 return true;
> >                 }
> > won't work correctly if suboverflowed.
> 
> Your example of snapshots taken during recovery is not correct.

Oh?


> Note that SubTransGetTopmostTransaction() returns a valid, running
> xid, even though it is the wrong one.

Sure.


> Snapshots work differently on standbys - we store all known running
> xids, so the test still passes correctly, even when overflowed.

I don't think that's generally true.  Isn't that precisely what
ProcArrayStruct->lastOverflowedXid is about?  If we have a snapshot
that's suboverflowed due to the lastOverflowedXid cutoff, then we the
subxip array does *not* contain all known running xids anymore, we rely
on pg_subtrans to only guarantee that toplevel xids are stored in the
KnownAssignedXids machinery.

See:
 * When we throw away subXIDs from KnownAssignedXids, we need to keep track of
 * that, similarly to tracking overflow of a PGPROC's subxids array.  We do
 * that by remembering the lastOverflowedXID, ie the last thrown-away subXID.
 * As long as that is within the range of interesting XIDs, we have to assume
 * that subXIDs are missing from snapshots.  (Note that subXID overflow occurs
 * on primary when 65th subXID arrives, whereas on standby it occurs when 64th
 * subXID arrives - that is not an error.)

        /*
         * Highest subxid that has been removed from KnownAssignedXids array to
         * prevent overflow; or InvalidTransactionId if none.  We track this for
         * similar reasons to tracking overflowing cached subxids in PGXACT
         * entries.  Must hold exclusive ProcArrayLock to change this, and 
shared
         * lock to read it.
         */
        TransactionId lastOverflowedXid;


Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to