On Fri, Oct 24, 2025 at 3:23 AM Masahiko Sawada <[email protected]> wrote: > > On Thu, Oct 23, 2025 at 11:52 AM Masahiko Sawada <[email protected]> > wrote: > > > > On Thu, Oct 23, 2025 at 3:39 AM shveta malik <[email protected]> wrote: > > > > > > Please find a few comments: > > > > > > 1) > > > RequestDisableLogicalDecoding: > > > Shall we have 'Assert(!MyReplicationSlot)' here to ensure that this > > > function is invoked after we have released and dropped the slot. This > > > is similar to 'Assert(MyReplicationSlot)' in > > > EnsureLogicalDecodingEnabled(). > > > > I think the reason why we need 'Assert(MyReplicationSlot)' in > > EnsureLogicalDecodingEnabled() is that otherwise the caller cannot get > > the expected result. IOW, suppose w can call > > EnsureLogicalDecodingEnabled() without holding a logical slot, logical > > decoding is not enabled if all logical slots are dropped concurrently > > in spite of the function name having the term "Ensure". As for > > RequestDisableLogicalDecoding(), it can request to disable logical > > decoding even while holding a logical slot. It might practically make > > less sense but it's no problem in principle. I guess it makes sense to > > put the assertion into DisableLogicalDecodingIfNecessary() instead. > > > > > > > > 2) > > > EnsureLogicalDecodingEnabled sets status_change_inprogress to false > > > before writing the WAL record. Initially, I thought this could lead to > > > a situation where another session might drop the same slot, since > > > there’s nothing preventing it (as status_change_inprogress is false > > > and LogicalDecodingControlLock has been released). This could, in > > > theory, result in the checkpointer writing the WAL record that > > > disables logical decoding before EnsureLogicalDecodingEnabled() writes > > > its WAL record that enables it — potentially causing an issue. But > > > this problem could not be reproduced in practice, since the slot was > > > acquired by session1, and therefore another session attempting to drop > > > it couldn’t acquire it. That said, I still lean towards setting > > > status_change_inprogress = false after the WAL record has been written > > > in EnsureLogicalDecodingEnabled(). Thoughts? > > > If not, we could add a comment explaining why this scenario is not a > > > problem. > > > > Good point. I'll incorporate your suggestion. > > > > > > > > 3) > > > + # Drop the logical slot, requesting to disable logical decoding to > > > the checkpointer. > > > + # It has to wait for the recovery to complete before disabling > > > logical decoding. > > > + $standby5->safe_psql('postgres', > > > + qq[select pg_drop_replication_slot('standby5_slot');]); > > > + > > > + # Resume the startup process to complete the recovery. > > > + $standby5->safe_psql('postgres', > > > + qq[select > > > injection_points_wakeup('startup-logical-decoding-status-change-end-of-recovery')] > > > + ); > > > + > > > + $standby5->wait_for_log( > > > + "waiting for recovery completion to change logical decoding status"); > > > > > > Shouldn’t we check the log for "waiting for recovery completion..." > > > before triggering injection_points_wakeup? > > > > > > IIUC, the current order may cause intermittent failures. Imagine that > > > drop-slot has not yet reached the LogicalDecodingStatusChangeAllowed > > > and RecoveryInProgress checks, and we release the injection point in > > > the meantime. In that case, drop-slot may never end up waiting, and we > > > might not see the expected log message. > > > > Agreed. > > > > I'll self-review the patch again and share the updated version patch. > > > > I've addressed the above comments and made some cosmetic changes. I > think this patch is in good shape, so I am planning to push it next > week or so unless there are major review comments. >
I also think the patch is in good shape now. I could not find any bugs in my testing of the previous version. I will review and validate the new version once early next week. I had a quick look at v21, I see that the CreatePublication() check is not changed. It still checks 'IsLogicalDecodingEnabled'. As per our discussion earlier, I was under the impression that 'if wal_level == minimal' check and related error-message will make more sense there instead of current-one. Isn't that the case? thanks Shveta
