On Tue, 10 Jan 2023 at 20:14, Andres Freund <and...@anarazel.de> wrote: > > Hi, > > On 2023-01-10 15:03:42 +0100, Matthias van de Meent wrote: > > On Mon, 9 Jan 2023 at 20:34, Andres Freund <and...@anarazel.de> wrote: > > > It's not too hard to fix in individual places, but I suspect that we'll > > > introduce the bug in future places without some more fundamental > > > protection. > > > > > > Locally I fixed it by clamping vacuum_defer_cleanup_age to a reasonable > > > value > > > in ComputeXidHorizons() and GetSnapshotData(). > > > > I don't think that clamping the value with oldestXid (as seen in patch > > 0001, in GetSnapshotData) is right. > > I agree that using oldestXid to clamp is problematic. > > > > It would clamp the value relative to the oldest frozen xid of all > > databases, which can be millions of transactions behind oldestXmin, > > and thus severely skew the amount of transaction's changes you keep on > > disk (that is, until oldestXid moves past 1000_000). > > What precisely do you mean with "skew" here? Do you just mean that it'd take a > long time until vacuum_defer_cleanup_age takes effect? Somehow it sounds like > you might mean more than that?
h->oldest_considered_running can be extremely old due to the global nature of the value and the potential existence of a snapshot in another database that started in parallel to a very old running transaction. Example: With vacuum_defer_cleanup_age set to 1000000, it is possible that a snapshot in another database (thus another backend) would result in a local intermediate status result of h->o_c_r = 20, h->s_o_n = 20, h->d_o_n = 10030. The clamped offset would then be 20 (clamped using h->o_c_r), which updates h->data_oldest_nonremovable to 10010. The obvious result is that all but the last 20 transactions from this database's data files are available for cleanup, which contradicts with the intention of the vacuum_defer_cleanup_age GUC. > I'm tempted to go with reinterpreting 64bit xids as signed. Except that it > seems like a mighty invasive change to backpatch. I'm not sure either. Protecting against underflow by halving the effective valid value space is quite the intervention, but if it is necessary to make this work in a performant manner, it would be worth it. Maybe someone else with more experience can provide their opinion here. Kind regards, Matthias van de Meent