Hi! On Sun, Sep 29, 2024 at 1:40 PM Alexander Korotkov <aekorot...@gmail.com> wrote: > > Thank you for your review. > > On Fri, Sep 27, 2024 at 7:35 AM Michael Paquier <mich...@paquier.xyz> wrote: > > > > On Fri, Sep 20, 2024 at 03:00:20PM +0300, Alexander Korotkov wrote: > > > Please, check the attached patchset for implementation of proposed > > > approach. > > > > 0001 looks like it requires an indentation in its .h diffs. > > > > +typedef enum > > +{ > > + WaitLSNResultSuccess, /* Target LSN is reached */ > > + WaitLSNResultTimeout, /* Timeout occured */ > > > > Perhaps use WAIT_LSN_RESULT_SUCCESS, etc, rather than camel casing. > > > > + * Results statuses for WaitForLSNReplay(). > > > > Too much plural here. > > > > What you are doing with WaitForLSNReplay() sounds kind of right. > > > > rc = WaitLatch(MyLatch, wake_events, delay_ms, > > WAIT_EVENT_WAIT_FOR_WAL_REPLAY); > > Thank you, fixed in the attached patchset. > > > Question about this existing bit in waitlsn.c. Shouldn't this issue a > > FATAL if rc reports a WL_POSTMASTER_DEATH? Or am I missing an > > intention here. That was already the case before this patch set. > > Fixed in the separate patch. > > > pg_wal_replay_wait() is new to v18, meaning that we still have some > > time to agree on its final shape for this release cycle. This > > discussion shares similarities with the recent exercise done in > > f593c5517d14, and I think that we should be more consistent between > > both to not repeat the same mistakes as in the past, even if this case > > if more complex because we have more reasons behind why a wait could > > not have happened. > > > > I would suggest to keep things simple and have one single function > > rather than introduce two more pg_proc entries with slight differences > > in their error reporting, making the original function return a text > > about the reason of the failure when waiting (be it a timeout, > > success, promotion or outside recovery). > > I also like to keep things simple. Keeping this as a single function > is not possible due to the reasons I described in [1]. However, it's > possible to fit into one stored procedure. I made 'no_error' as an > argument for the pg_wal_replay_wait() procedure. Done so in the > attached patchset. > > > FWIW, I am confused regarding the need for WaitLSNResultNotInRecovery > > and WaitLSNResultPromotedConcurrently in the error states. On a code > > basis, they check the same thing: RecoveryInProgress(). I'd suggest > > to cut one of them. > > Agreed. I initially intended to distinguish situations when the user > mistakenly calls pg_wal_replay_wait() on replication leader and when > concurrent promotion happens. However, given that the promotion could > happen after the user issued the query and before waiting starts, it > doesn't make much sense. > > > This also points to the fact that > > WaitForLSNReplay() has some duplication that's not required. We could > > have less GetXLogReplayRecPtr() calls and do all the checks in the for > > loop. The current structure of the code in waitlsn.c is more complex > > than it could be. > > Not sure about this. I'd like to keep the fast-path check before we > call addLSNWaiter().
Please, check the revised patchset. It contains more tests for new function pg_wal_replay_wait_status() and changes for documentation. ------ Regards, Alexander Korotkov Supabase
v4-0003-Add-no_error-argument-to-pg_wal_replay_wait.patch
Description: Binary data
v4-0002-Refactor-WaitForLSNReplay-to-return-the-result-of.patch
Description: Binary data
v4-0001-Make-WaitForLSNReplay-issue-FATAL-on-postmaster-d.patch
Description: Binary data