Hi folks,

I'd like to restart the discussion about providing an XID-based slot invalidation mechanism. The previous effort [1] proposed both XID-based and time-based invalidation, and the inactive-time-based approach was implemented first. The latest XID-based patch from Bharath Rupireddy can be found here [2].

When thinking about availability of the database, inactive replication slots cause two main pain points:

1) WAL accumulation
2) Replication slots with xmin/catalog_xmin can hold back vacuuming, leading to wraparound

The first issue can be mitigated by 'max_slot_wal_keep_size'. In the second case, however, there is no good mechanism to prioritize write availability of the database and avoid wraparound.

The new GUC 'idle_replication_slot_timeout' partially addresses the concern if your workloads are similar. However, it's hard to apply the same setting across a fleet of different applications. It's easy to imagine a workload that churns through XIDs quickly in one cluster while another runs large batch jobs where changes get synced out periodically. There isn't a one-size-fits-all setting for 'idle_replication_slot_timeout' in these two cases.

The attached patch addresses this by introducing 'max_slot_xid_age' in a similar fashion. Replication slots whose xmin or catalog_xmin is older than the configured age will get invalidated, allowing vacuum to proceed and biasing towards database availability.

Invalidation happens in CHECKPOINT, similar to 'idle_replication_slot_timeout', and when VACUUM occurs. The patch currently attempts invalidation once per autovacuum worker. We're wondering if it should attempt invalidation on a per-relation basis within the vacuum call itself. That would account for scenarios where the cost_delay or naptime is high between autovac executions.

Thanks,
John H

[1] https://www.postgresql.org/message-id/flat/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe%2Baw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/CALj2ACXe8%2BxSNdMXTMaSRWUwX7v61Ad4iddUwnn%3DdjSwx3GLLg%40mail.gmail.com

--
John Hsu - Amazon Web Services
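
P.S. For readers less familiar with XID arithmetic, here is a rough sketch of the kind of age check 'max_slot_xid_age' implies. It is an illustration only and not code from the patch: the function names are hypothetical, an unset xmin is represented as 0 (InvalidTransactionId), and treating a setting of 0 as "disabled" is an assumption. XIDs are 32-bit counters that wrap around, so the age is computed modulo 2^32.

```python
# Illustrative sketch only; these helpers are hypothetical, not PostgreSQL APIs.

XID_MODULUS = 2**32          # XIDs are 32-bit and wrap around
INVALID_XID = 0              # InvalidTransactionId: the slot holds no such xmin


def xid_age(next_xid: int, xid: int) -> int:
    """Number of XIDs consumed since `xid`, using wraparound-aware arithmetic.

    Only meaningful while the true age stays below 2^31, which is the
    range PostgreSQL keeps live XIDs within.
    """
    return (next_xid - xid) % XID_MODULUS


def slot_exceeds_xid_age(next_xid: int, xmin: int, catalog_xmin: int,
                         max_slot_xid_age: int) -> bool:
    """Return True if either horizon held by the slot is older than the limit."""
    if max_slot_xid_age <= 0:
        # Assumption: a non-positive setting disables the check.
        return False
    for xid in (xmin, catalog_xmin):
        if xid == INVALID_XID:
            continue  # the slot does not hold this horizon
        if xid_age(next_xid, xid) > max_slot_xid_age:
            return True
    return False
```

With a limit of 1,000,000, a slot whose xmin is 1,500,000 XIDs behind the next XID would be a candidate for invalidation under this sketch, while one that is 500,000 behind would not.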
0044-Add-XID-age-based-replication-slot-invalidation.patch
