Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

Matthias van de Meent Tue, 10 Jan 2023 12:33:18 -0800

On Tue, 10 Jan 2023 at 20:14, Andres Freund <[email protected]> wrote:
>
> Hi,
>
> On 2023-01-10 15:03:42 +0100, Matthias van de Meent wrote:
> > On Mon, 9 Jan 2023 at 20:34, Andres Freund <[email protected]> wrote:
> > > It's not too hard to fix in individual places, but I suspect that we'll
> > > introduce the bug in future places without some more fundamental 
> > > protection.
> > >
> > > Locally I fixed it by clamping vacuum_defer_cleanup_age to a reasonable 
> > > value
> > > in ComputeXidHorizons() and GetSnapshotData().
> >
> > I don't think that clamping the value with oldestXid (as seen in patch
> > 0001, in GetSnapshotData) is right.
>
> I agree that using oldestXid to clamp is problematic.
>
>
> > It would clamp the value relative to the oldest frozen xid of all
> > databases, which can be millions of transactions behind oldestXmin,
> > and thus severely skew the amount of transaction's changes you keep on
> > disk (that is, until oldestXid moves past 1000_000).
>
> What precisely do you mean with "skew" here? Do you just mean that it'd take a
> long time until vacuum_defer_cleanup_age takes effect? Somehow it sounds like
> you might mean more than that?


h->oldest_considered_running can be extremely old due to the global
nature of the value and the potential existence of a snapshot in
another database that started in parallel to a very old running
transaction.

Example: With vacuum_defer_cleanup_age set to 1000000, it is possible
that a snapshot in another database (thus another backend) would
result in a local intermediate status result of h->o_c_r = 20,
h->s_o_n = 20, h->d_o_n = 10030. The clamped offset would then be 20
(clamped using h->o_c_r), which updates h->data_oldest_nonremovable to
10010. The obvious result is that all but the last 20 transactions
from this database's data files are available for cleanup, which
contradicts with the intention of the vacuum_defer_cleanup_age GUC.

> I'm tempted to go with reinterpreting 64bit xids as signed. Except that it
> seems like a mighty invasive change to backpatch.

I'm not sure either. Protecting against underflow by halving the
effective valid value space is quite the intervention, but if it is
necessary to make this work in a performant manner, it would be worth
it. Maybe someone else with more experience can provide their opinion
here.

Kind regards,

Matthias van de Meent

Re: BUG: Postgres 14 + vacuum_defer_cleanup_age + FOR UPDATE + UPDATE

Reply via email to