(CC list trimmed, gmail wouldn't let me send otherwise)

On 22/02/2023 16:29, Maxim Orlov wrote:
On Tue, 21 Feb 2023 at 16:59, Aleksander Alekseev <aleksan...@timescale.com> wrote:
    One thing that still bothers me is that during the upgrade we only
    migrate the CLOG segments (i.e. pg_xact / pg_clog) and completely
    ignore all the rest of SLRUs:

    * pg_commit_ts
    * pg_multixact/offsets
    * pg_multixact/members
    * pg_subtrans
    * pg_notify
    * pg_serial

Hi! We do ignore these values, since in order to pg_upgrade the server it must be properly stopped and no transactions can outlast this moment.

That sounds right for pg_serial, pg_notify, and pg_subtrans. But not for pg_commit_ts and the pg_multixacts.

This needs tests for pg_upgrading those SLRUs, after 0, 1 and N wraparounds.

I'm surprised that these patches extend the page numbering to 64 bits but never actually use the high bits. The XID "epoch" is not used, pg_xact still wraps around, and the segment names are still reused. I thought we could stop doing that. Certainly if we start supporting 64-bit XIDs properly, that will need to change, and pg_upgrade will need to rename the segments again.
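
To make the point about the unused high bits concrete, here is a minimal sketch, not code from the patch set. It assumes a default build (BLCKSZ = 8192, so 32768 transaction statuses per pg_xact page) and 32 pages per SLRU segment. The function names and the segment file name format are hypothetical, chosen only for illustration.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed values for a default build: BLCKSZ = 8192, 2 status bits per xact. */
#define CLOG_XACTS_PER_PAGE     (8192 * 4)
#define SLRU_PAGES_PER_SEGMENT  32

/* Page number derived from the 32-bit XID only: wraps once per epoch. */
uint32_t
page_from_xid32(uint32_t xid)
{
    return xid / CLOG_XACTS_PER_PAGE;
}

/* Hypothetical: page number derived from epoch + XID, which never repeats. */
uint64_t
page_from_full_xid(uint32_t epoch, uint32_t xid)
{
    uint64_t    full_xid = ((uint64_t) epoch << 32) | xid;

    return full_xid / CLOG_XACTS_PER_PAGE;
}

/* Segment number that holds a given page. */
uint64_t
segment_from_page(uint64_t pageno)
{
    return pageno / SLRU_PAGES_PER_SEGMENT;
}

/* Hypothetical file name format; the real format is up to the patch. */
void
segment_name(char *buf, size_t len, uint64_t segno)
{
    snprintf(buf, len, "%04" PRIX64, segno);
}
```

With these assumptions, XID 1000 maps to page 0 in every epoch under the 32-bit scheme, so segment 0000 gets reused. Under the epoch-aware scheme the same XID in epoch 1 maps to page 131072, which lives in segment 4096 (hex 1000), a name the 32-bit scheme never reaches.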

The previous versions of these patches did that, but I think you changed tack in response to Robert's suggestion at [1]:

Lest we miss the forest for the trees, there is an aspect of this
patch that I find to be an extremely good idea and think we should try
to get committed even if the rest of the patch set ends up in the
rubbish bin. Specifically, there are a couple of patches in here that
have to do with making SLRUs indexed by 64-bit integers rather than by
32-bit integers. We've had repeated bugs in the area of handling SLRU
wraparound in the past, some of which have caused data loss. Just by
chance, I ran across a situation just yesterday where an SLRU wrapped
around on disk for reasons that I don't really understand yet and
chaos ensued. Switching to an indexing system for SLRUs that does not
ever wrap around would probably enable us to get rid of a whole bunch
of crufty code, and would also likely improve the general reliability
of the system in situations where wraparound is threatened. It seems
like a really, really good idea.

These new versions of the patch don't achieve the goal of avoiding wraparound. I think the previous versions, which did, were the right approach.

[1] https://www.postgresql.org/message-id/CA%2BTgmoZFmTGjgkmjgkcm2-vQq3_TzcoMKmVimvQLx9oJLbye0Q%40mail.gmail.com

- Heikki


