Chao Li <[email protected]> wrote:

> While reading logical replication code, I found an issue in 
> LogicalConfirmReceivedLocation().
> 
> In LogicalConfirmReceivedLocation(), updated_restart is tracked independently 
> from updated_xmin, and the slot is marked dirty and saved when either one 
> changed. But after that, ReplicationSlotsComputeRequiredLSN() is still only 
> called inside "if (updated_xmin)”. 
> 
> So for the restart-only case:
> 
> * updated_restart = true
> * updated_xmin = false
> * ReplicationSlotSave() runs
> * ReplicationSlotsComputeRequiredLSN() does not run because updated_xmin is 
> false
> 
> That means the global retention point managed by 
> XLogSetReplicationSlotMinimumLSN() can stay stale until some later unrelated 
> event recomputes it. Since ReplicationSlotsComputeRequiredLSN() derives the 
> global minimum from slot restat_lsn, skipping it after a restart-only advance 
> can retain excess WAL and may lead to WAL bloat.
> 
> This patch fixes the problem by moving ReplicationSlotsComputeRequiredLSN() 
> under “if (updated_restart)”.

FYI, this overlaps with another post in the REPACK thread [1].

> Looks like this issue has been there for a long time, so if this analysis is 
> correct, it may also be worth back-patching.

As REPACK in PG 19 does not let xmin advance (that should be fixed in the
future), I think makes sense to apply [1] to v19. However, during logical
replication, xmin (IMO) gets updated rather often, so the problem should not
be that severe in earlier versions.

[1] 
https://www.postgresql.org/message-id/TYRPR01MB14195633567DA00ABD42570B794592%40TYRPR01MB14195.jpnprd01.prod.outlook.com

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


Reply via email to