On 04/12/2025 16:36, Álvaro Herrera wrote:
On 2025-Dec-04, Heikki Linnakangas wrote:
While working on the 64-bit multixid offsets patch, I noticed one more bug
with this. At offset wraparound, when we set the next multixid's offset in
RecordNewMultiXact, we incorrectly set it to 0 instead of 1. We're supposed
to skip over offset 1, because 0 is reserved to mean invalid. We do that
correctly when setting the "current" multixid's offset, because the caller
of RecordNewMultiXact has already skipped over offset 0, but I missed it for
the next offset.
Ouch.
Committed the fix.
I tried to modify the new wraparound TAP test to reproduce that, but it
turned out to be difficult because you need to have multiple backends
assigning multixids concurrently to hit that.
Hmm, would it make sense to add a pgbench-based test on
src/test/modules/xid_wraparound? That module is already known to be
expensive, and it doesn't run unless explicitly enabled, so I think it's
not a bad fit.
Doesn't seem worth it. The repro with pgbench is a bit unreliable too.
It actually only readily reproduces on 'master'. In backbranches, I also
had to add a random sleep before RecordNewMultiXact() to trigger it. I
think that's because on 'master', I removed the special case in
GetMultiXactIdMembers() to use the in-memory nextOffset value instead of
reading the next offset from the SLRU page, when we're reading the last
assigned multixid.
Furthermore, we will hopefully get rid of offset wraparounds soon with
the 64-bit offsets.
- Heikki