Hi,

On 2022-11-09 17:32:46 -0800, Peter Geoghegan wrote:
> > The xmin horizon is very coarse grained. Just because it is 7 doesn't mean
> > that xid 10 is still running. All it means that one backend or slot has an
> > xmin or xid of 7.
> 
> Of course that's true. But I wasn't talking about the general case --
> I was talking about your "xmin 10, xmax 5 -> xmin 5, xmax invalid"
> update chain case specifically, with its "skewered" OldestXmin of 7.

The sequence below produces such an OldestXmin:

> > s1: acquire xid 5
> > s2: acquire xid 7
> > s3: acquire xid 10
> >
> > s3: insert
> > s3: commit
> > s1: update
> > s1: commit
> >
> > s2: get a new snapshot, xmin 7 (or just hold no snapshot)
> >
> > At this point the xmin horizon is 7. The first tuple's xmin can't be
> > frozen. The second tuple's xmin can be.
> 
> Basically what I'm saying about OldestXmin is that it ought to "work
> transitively", from the updater to the inserter that inserted the
> now-updated tuple. That is, the OldestXmin should either count both
> XIDs that appear in the update chain, or neither XID.

It doesn't work that way. The above sequence shows one case where it doesn't.


> > > I believe you're right that an update chain that looks like this one
> > > is possible. However, I don't think it's possible for
> > > OldestXmin/FreezeLimit to take on a value like that (i.e. a value that
> > > "skewers" the update chain like this, the value 7 from your example).
> > > We ought to be able to rely on an OldestXmin value that can never let
> > > such a situation emerge. Right?
> >
> > I don't see anything that'd guarantee that currently, nor do immediately 
> > see a
> > possible way to get there.
> >
> > What do you think prevents such an OldestXmin?
> 
> ComputeXidHorizons() computes VACUUM's OldestXmin (actually it
> computes h->data_oldest_nonremovable values) by scanning the proc
> array. And counts PGPROC.xmin from each running xact. So ultimately
> the inserter and updater are tied together by that. It's either an
> OldestXmin that includes both, or one that includes neither.

> Here are some facts that I think we both agree on already:
> 
> 1. It is definitely possible to have an update chain like your "xmin
> 10, xmax 5 -> xmin 5, xmax invalid" example.
> 
> 2. It is definitely not possible to "freeze xmax" by setting its value
> to FrozenTransactionId or something similar -- there is simply no code
> path that can do that, and never has been. (The term "freeze xmax" is
> a bit ambiguous, though it usually means set xmax to
> InvalidTransactionId.)
> 
> 3. There is no specific reason to believe that there is a live bug here.

I don't think there's a live bug here. I think the patch isn't dealing
correctly with that issue though.


> Putting all 3 together: doesn't it seem quite likely that the way that
> we compute OldestXmin is the factor that prevents "skewering" of an
> update chain? What else could possibly be preventing corruption here?
> (Theoretically it might never have been discovered, but that seems
> pretty hard to believe.)

I don't see how that follows. The existing code is just ok with that. In fact
we have explicit code trying to exploit this:

                /*
                 * If the DEAD tuple is at the end of the chain, the entire 
chain is
                 * dead and the root line pointer can be marked dead.  
Otherwise just
                 * redirect the root to the correct chain member.
                 */
                if (i >= nchain)
                        heap_prune_record_dead(prstate, rootoffnum);
                else
                        heap_prune_record_redirect(prstate, rootoffnum, 
chainitems[i]);


Greetings,

Andres Freund


Reply via email to