On Fri, Jan 19, 2024 at 3:55 PM shveta malik <shveta.ma...@gmail.com> wrote:
>
> On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
> >
> > Thank you for updating the patch. I have some comments:
> >
> > ---
> > +        latestWalEnd = GetWalRcvLatestWalEnd();
> > +        if (remote_slot->confirmed_lsn > latestWalEnd)
> > +        {
> > +                elog(ERROR, "exiting from slot synchronization as the received slot sync"
> > +                         " LSN %X/%X for slot \"%s\" is ahead of the standby position %X/%X",
> > +                         LSN_FORMAT_ARGS(remote_slot->confirmed_lsn),
> > +                         remote_slot->name,
> > +                         LSN_FORMAT_ARGS(latestWalEnd));
> > +        }
> >
> > IIUC GetWalRcvLatestWalEnd() returns walrcv->latestWalEnd, which is
> > typically the primary server's flush position and doesn't mean the LSN
> > where the walreceiver received/flushed up to.
>
> yes. I think it makes more sense to use something which actually tells
> flushed-position. I gave it a try by replacing GetWalRcvLatestWalEnd()
> with GetWalRcvFlushRecPtr() but I see a problem here. Let's say I have
> enabled the slot-sync feature in a running standby, in that case we
> are all good (flushedUpto is the same as actual flush-position
> indicated by LogstreamResult.Flush). But if I restart standby, then I
> observed that the startup process sets flushedUpto to some value 'x'
> (see [1]) while when the wal-receiver starts, it sets
> 'LogstreamResult.Flush' to another value (see [2]) which is always
> greater than 'x'. And we do not update flushedUpto with the
> 'LogstreamResult.Flush' value in walreceiver until we actually do an
> operation on primary. Performing a data change on primary sends WALs
> to standby which then hits XLogWalRcvFlush() and updates flushedUpto
> same as LogstreamResult.Flush. Until then we have a situation where
> slots received on standby are ahead of flushedUpto and thus slotsync
> worker keeps on erroring out. I am yet to find out why flushedUpto is
> set to a lower value than 'LogstreamResult.Flush' at the start of
> standby. Or maybe am I using the wrong function
> GetWalRcvFlushRecPtr() and should be using something else instead?
>
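
To make the restart scenario described above concrete, here is a toy
model in plain C. It is only a sketch and not PostgreSQL code: ToyStandby
and the toy_* functions are hypothetical names made up for illustration,
the two fields merely stand in for flushedUpto and LogstreamResult.Flush,
and the real update logic in the walreceiver is more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr (assumption: a plain 64-bit LSN). */
typedef uint64_t XLogRecPtr;

/* Hypothetical toy state; models the two positions discussed above. */
typedef struct ToyStandby
{
    XLogRecPtr  flushed_upto;    /* models the shared flushedUpto */
    XLogRecPtr  logstream_flush; /* models LogstreamResult.Flush */
} ToyStandby;

/*
 * Models a standby restart as reported: the startup process sets
 * flushedUpto to some value x, and the walreceiver starts with a
 * larger local flush position.
 */
static void
toy_restart(ToyStandby *s, XLogRecPtr x, XLogRecPtr receiver_flush)
{
    assert(receiver_flush >= x);
    s->flushed_upto = x;
    s->logstream_flush = receiver_flush;
}

/*
 * Models new WAL arriving from the primary and being flushed; only at
 * this point does the shared position catch up with the local one.
 */
static void
toy_receive_and_flush(ToyStandby *s, XLogRecPtr new_flush)
{
    assert(new_flush >= s->logstream_flush);
    s->logstream_flush = new_flush;
    s->flushed_upto = s->logstream_flush;
}

/* The comparison from the quoted patch, with the standby position passed in. */
static bool
toy_slot_is_ahead(XLogRecPtr confirmed_lsn, XLogRecPtr standby_pos)
{
    return confirmed_lsn > standby_pos;
}
```

For example, after toy_restart(&s, 0x3000000, 0x3000100), a slot with
confirmed_lsn 0x3000060 is reported as ahead when compared against
s.flushed_upto but not when compared against s.logstream_flush. Once
toy_receive_and_flush() runs, both comparisons agree again, which matches
the observation that the error goes away after a data change on the
primary.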

Can we think of using GetStandbyFlushRecPtr()? We probably need to
expose this function, if this works for the required purpose.

--
With Regards,
Amit Kapila.