On Thu, Dec 25, 2025 at 2:52 PM Xuneng Zhou <[email protected]> wrote: > On Thu, Dec 25, 2025 at 7:13 PM Alexander Korotkov <[email protected]> > wrote: > > > > Hi, Xuneng! > > > > On Mon, Dec 22, 2025 at 9:57 AM Xuneng Zhou <[email protected]> wrote: > > > > > > On Sun, Dec 21, 2025 at 12:37 PM Xuneng Zhou <[email protected]> wrote: > > > > > > > > Hi Alexander, > > > > > > > > Thanks for your feedback! > > > > > > > > > I see that we can't specify WAIT_LSN_TYPE_PRIMARY_FLUSH by setting > > > > > mode parameter. Should we allow this? > > > > > > > > I think this constraint could be relaxed if needed. I was previously > > > > unsure about the use cases. > > > > > > Flush mode on the primary seems useful when synchronous_commit is set > > > to off [1]. In that mode, a transaction in primary may return success > > > before its WAL is durably flushed to disk, trading durability for > > > lower latency. A “wait for primary flush” operation provides an > > > explicit durability barrier for cases where applications or tools > > > occasionally need stronger guarantees. > > > > > > [1] https://postgresqlco.nf/doc/en/param/synchronous_commit/ > > > > > > > > If we allow specifying WAIT_LSN_TYPE_PRIMARY_FLUSH, should it be > > > > > separate mode value or the same with WAIT_LSN_TYPE_STANDBY_FLUSH? In > > > > > principle, we could encode both as just 'flush' mode, and detect which > > > > > WaitLSNType to pick by checking if recovery is in progress. However, > > > > > how should we then react to unreached flush location after standby > > > > > promotion (technically it could be still reached but on the different > > > > > timeline)? > > > > > > > > > > > > > Technically, we can use 'flush' mode to specify WAIT FOR behavior in > > > > both primary and replica. Currently, wait for commands error out if > > > > promotion occurs since: either the requested LSN type does not exist > > > > on the primary, or we do not yet have the infrastructure to support > > > > continuing the wait. If we allow waiting for flush on the primary as a > > > > user-visible command and the wake-up calls for flush in primary are > > > > introduced, the question becomes whether we should still abort the > > > > wait on promotion, or continue waiting—as you noted—given that the > > > > target LSN might still be reached, albeit on a different timeline. The > > > > question behind this might be: do users care and should be aware of > > > > the state change of the server while waiting? If they do, then we > > > > better stop the waiting and report the error. In this case, I am > > > > inclined to to break the unified flush mode to something like > > > > primary_flush/standby_flush mode and > > > > WAIT_LSN_TYPE_PRIMARY_FLUSH/WAIT_LSN_TYPE_STANDBY_FLUSH respectively. > > > > > > > > > > After further consideration, it also seems reasonable to use a single, > > > unified flush mode that works on both primary and standby servers, > > > provided its semantics are clearly documented to avoid the potential > > > confusion on failure. I don’t have a strong preference between these > > > two and would be interested in your thoughts. > > > > > > If a standby is promoted while a session is waiting, the command > > > better abort and return an error (or report “not in recovery” when > > > using NO_THROW). At that point, the meaning of the LSN being waited > > > for may have changed due to the timeline switch and the transition > > > from standby to primary. An LSN such as 0/5000000 on TLI 2 can > > > represent entirely different WAL content from 0/5000000 on TLI 1. > > > Allowing the wait to silently continue across promotion risks giving > > > users a false sense of safety—for example, interpreting “wait > > > completed” as “the original data is now durable,” which would no > > > longer be true. > > > > Agree, but there is still risk that promotion happens after user send > > the query but before we started to wait. In this case we will still > > silently start to wait on primary, while user probably meant to wait > > on replica. Probably it would be safer to have separate user-visible > > modes for waiting on primary and on replica? > > > > Thanks for your thoughts. You're right about the race condition. If > promotion happens between query submission and execution, a unified > 'flush' mode could silently switch semantics without the user knowing. > Separate modes like 'standby_flush' and 'primary_flush' would make > user intent explicit and catch this case with an error, which is > safer. Do these two terms look reasonable to you, or would you suggest > better names? If they look ok, I plan to update the implementation to > use these two modes.
Thank you, Xuneng. 'standby_flush' and 'primary_flush' look good for me. Please, go ahead. I think we should name other modes 'standby_write' and 'standby_replay' for the sake of unity. ------ Regards, Alexander Korotkov Supabase
