On Wed, Oct 25, 2023 at 8:49 PM Drouvot, Bertrand
<bertranddrouvot...@gmail.com> wrote:
>
> On 10/9/23 12:30 PM, shveta malik wrote:
> > PFA v22 patch-set. It has below changes:
> >
> > patch 001:
> > 1) Now physical walsender wakes up logical walsender(s) by using a new
> > CV as suggested in [1]
>
> Thanks!
>
> I think that works fine as long as the standby is up and running and catching 
> up.
>
> The problem I see with the current WalSndWaitForStandbyConfirmation() 
> implementation
> is that if the standby is not running then:
>
> +   for (;;)
> +   {
> +       ListCell   *l;
> +       long        sleeptime = -1;
>
> will loop until we reach the "terminating walsender process due to 
> replication timeout" if we
> explicitly want to end with SIGINT or friends.
>
> For example a scenario like:
>
> - standby down
> - pg_recvlogical running
>
> then CTRL-C on pg_recvlogical would not "respond" immediately but when we 
> reach the replication timeout.
>
> So it seems that we should use something like WalSndWait() instead of 
> ConditionVariableTimedSleep() here:
>
> +               /*
> +                * Sleep until other physical walsenders awaken us or until a 
> timeout
> +                * occurs.
> +                */
> +               sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
> +
> +               ConditionVariableTimedSleep(&WalSndCtl->wal_confirm_rcv_cv, 
> sleeptime,
> +                                                                       
> WAIT_EVENT_WAL_SENDER_WAIT_FOR_STANDBY_CONFIRMATION);
>
> In that case I think that WalSndWait() should take care of the new CV 
> WalSndCtl->wal_confirm_rcv_cv too.
> The wait on the socket should allow us to stop waiting when, for example, 
> CTRL-C on pg_recvlogical is triggered.
>
> Then we would need to deal with this scenario: Standby down or not catching 
> up and exited WalSndWait() due to the socket
> to break the loop or shutdown the walsender.
>
> Thoughts?
>

Good point, I think we should enhance the WalSndWait() logic to
address this case. Additionally, I think we should ensure that
WalSndWaitForWal() shouldn't wait twice once for wal_flush and a
second time for wal to be replayed by physical standby. It should be
okay to just wait for Wal to be replayed by physical standby when
applicable, otherwise, just wait for Wal to flush as we are doing now.
Does that make sense?

-- 
With Regards,
Amit Kapila.


Reply via email to