Re: IPC/MultixactCreation on the Standby server

Ivan Bykov Mon, 24 Nov 2025 06:33:21 -0800

Hi, Andrey!

> Thanks for your review!


Review still in progress, sorry for the delay. I didn't have enough time to 
fully understand the changes you suggest, but it seems there is 
only a small gap in my understanding of what the patch does. Here is my 
explanation of the problem.

The main problem
=============

The main problem is that we may lose session context before writing the offset 
to SLRU (but we may write 
a WAL record). It seems that the writer process got stuck in the XLogInsert 
procedure or even failed between GetNewMultiXactId 
and RecordNewMultiXact call. In this case, readers will wait to receive a 
conditional variable signal (from new multixacts) 
but could not find a valid offset for the "failed" (it may be in WAL) multixid.

I illustrate this using the next diagram.

Writer                                        Reader             
--------------------------------------------------------------------------------
 MultiXactIdCreateFromMembers
 -> GetNewMultiXactId (101)
                                              GetMultiXactIdMembers(100)
                                              -> LWLockAcquire(MultiXactOffset)
                                              -> read offset 100
                                              -> read offset 101
                                              -> LWLockRelease(MultiXactOffset)
                                                 offset 101 == 0
                                              -> ConditionVariableSleep()
+--------------------------------------------------------------------------------------+
|-> XLogInsert                                                                  
       |
+--------------------------------------------------------------------------------------+
 -> RecordNewMultiXact
  -> LWLockAcquire(MultiXactOffset)
  -> write offset 101
  -> LWLockRelease(MultiXactOffset)
  -> ConditionVariableBroadcast(nextoff_cv);  
                                              -> retry:
                                              -> LWLockAcquire(MultiXactOffset)
                                              -> read offset 100
                                              -> read offset 101
                                              -> LWLockRelease(MultiXactOffset)
                                                 offset 101 != 0
                                              -> length = offset 101 - read 
offset 100
  -> LWLockAcquire(MultiXactMember)
  -> write members 101
  -> LWLockRelease(MultiXactOffset)

--------------------------------------------------------------------------------------

As far as I can see, your proposal seems to address exactly that problem.
The main difference from the former solution is writing to MultiXactOffset SLRU 
all required 
information for the reader atomically on multiact insertion. 
Before this change, we actually extended the multixact insertion time window to 
the next multixact 
insertion time, and it seems a risky design.

I illustrate the new solution using the next diagram.

Writer                                        Reader             
--------------------------------------------------------------------------------
 MultiXactIdCreateFromMembers
 -> GetNewMultiXactId (100)
 -> XLogInsert 
 -> RecordNewMultiXact
  -> LWLockAcquire(MultiXactOffset)
  -> write offset 100 
  -> write offset 101 
*****************************************************************
  -> LWLockRelease(MultiXactOffset)
                                              GetMultiXactIdMembers(100)
                                              -> LWLockAcquire(MultiXactOffset)
                                              -> read offset 100
                                              -> read offset 101
                                              -> LWLockRelease(MultiXactOffset)
                                              Assert(offset 101 == 0)
                                              -> ConditionVariableSleep()
                                              -> length = offset 101 - read 
offset 100
--------------------------------------------------------------------------------------

So if I understand the core idea of your solution right, I think that the code 
in the last patch 
(v10-0001-Avoid-multixact-edge-case-2-by-writing-the-next-.patch) is correct 
and does what it should.

Re: IPC/MultixactCreation on the Standby server

Reply via email to