On Wed, Jun 11, 2025 at 2:31 AM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> I think it's the user's responsibility to keep at least one logical
> slot. It seems that setting wal_level to 'logical' would be the most
> reliable solution for this case. We might want to provide a way to
> keep 'logical' WAL level somehow but I don't have a good idea for now.
>

Okay, let me think more on this.

>
> Considering cascading replication cases too, 2) could be tricky as
> cascaded standbys need to propagate the information of logical slot
> creation up to the most upstream server.
>

Yes, I understand the challenges here.

Thanks for the v2 patches. A few concerns:

1)
When a slot on the standby is invalidated because effective_wal_level
switched back to replica, and we then restart the standby, it fails to
restart even if wal_level is explicitly changed to logical in the conf
file:

FATAL:  logical replication slot "slot_st" exists, but logical decoding is not enabled
HINT:  Change "wal_level" to be "replica" or higher.

2)
When the primary switches its effective wal_level back to replica while
the standby has wal_level=logical in its conf file, the standby shows
this status:

postgres=# show wal_level;
 wal_level
-----------
 logical

postgres=# show effective_wal_level;
 effective_wal_level
---------------------
 replica

Is this correct? Can effective_wal_level ever be lower than wal_level?
I feel it can be greater but never lesser.

3)
When the standby invalidates obsolete slots because effective_wal_level
on the primary changed to replica, it logs the following:

LOG:  invalidating obsolete replication slot "slot_st2"
DETAIL:  Logical decoding on standby requires "wal_level" >= "logical" on the primary server

Shall we update this message as well, to mention the presence of a
logical slot on the primary? For example:

DETAIL:  Logical decoding on standby requires "wal_level" >= "logical" or presence of logical slot on the primary server.

4)
The slotsync worker now runs all the time, as opposed to the previous
behaviour where it would not start if wal_level was less than logical,
and would stop if wal_level was switched to '< logical' at any time.
Even with wal_level and effective_wal_level both set to replica,
slot-sync keeps attempting synchronization. This does not look correct.
I think we need to find a way to stop the slot-sync worker when
effective_wal_level is switched from logical to replica.

5)
Can you please help me understand the changes at [1]?

a) Why is it needed when we already have the code logic at [2]?
b) In [1], why do we check n_inuse_logical_slots on the standby and then
make a decision? Why not disable logical decoding directly, just like
[2]?

[1]:
+    if (xlrec.wal_level == WAL_LEVEL_LOGICAL)
+    {
+        /*
+         * If the primary increase WAL level to 'logical', we can
+         * unconditionally enable the logical decoding on the standby.
+         */
+        UpdateLogicalDecodingStatus(true);
+    }
+    else if (xlrec.wal_level == WAL_LEVEL_REPLICA &&
+             pg_atomic_read_u32(&ReplicationSlotCtl->n_inuse_logical_slots) == 0)
+    {
+        /*
+         * Disable the logical decoding if there is no in-use logical slot
+         * on the standby.
+         */
+        UpdateLogicalDecodingStatus(false);
+    }

[2]:
+    else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
+    {
+        bool        logical_decoding;
+
+        memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
+        UpdateLogicalDecodingStatus(logical_decoding);
+
+        /*
+         * Invalidate logical slots if we are in hot standby and the primary
+         * disabled the logical decoding.
+         */
+        if (!logical_decoding && InRecovery && InHotStandby)
+            InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
+                                               0, InvalidOid,
+                                               InvalidTransactionId);
+
+        LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+        ControlFile->logicalDecodingEnabled = logical_decoding;
+        UpdateControlFile();
+        LWLockRelease(ControlFileLock);
+    }

thanks
Shveta