On Sun, Oct 12, 2025 at 11:27 PM Bertrand Drouvot
<[email protected]> wrote:
>
> Hi,
>
> On Thu, Oct 09, 2025 at 10:49:39AM +0800, suyu.cmj wrote:
> > Hi,
> > Thank you for the reference to commit 818fefd8fd4 and the related 
> > discussion thread. I understand the intent of introducing 
> > initial_restart_lsn was to preserve a consistent invalidation cause 
> > throughout the invalidation loop.
> > However, I still have a few concerns about this design change:
> > 1. I understand the intention to keep the invalidation cause consistent, 
> > but If a slot's restart_lsn advances significantly during the invalidation 
> > check—indicating it is actively in use—shouldn't we reconsider invalidating 
> > it?
> > 2. What potential issues arise if we refrain from invalidating slots whose 
> > restart_lsn advances during the invalidation process? Intuitively, an 
> > actively used slot that has moved it's restart_lsn beyond the problematic 
> > point should not be marked invalid.
> > 3. If the current approach is indeed correct, should we consider making 
> > PG15 and earlier consistent with this behavior? The behavioral difference 
> > across versions may lead to different operational outcomes in otherwise 
> > similar situations.
> > I would appreciate your insights on these points.
>
> I agree that before 818fefd8fd4 the invalidation cause could move from
> RS_INVAL_WAL_REMOVED to RS_INVAL_NONE if the slot restart lsn has been able to
> advance enough between the time we release the mutex and do the next check.
>
> With 818fefd8fd4 that's not the case anymore and we keep WAL_REMOVED as the
> invalidation cause (even if  the slot restart lsn has been able to advance
> enough).
>
> That looks safe to use the pre 818fefd8fd4 behavior for the slot restart lsn
> case because the WAL files have not yet been removed by the 
> checkpointer/startup
> process when it's busy in InvalidatePossiblyObsoleteSlot().
>
> I think that we could get rid of the initial_restart_lsn and just use
> s->data.restart_lsn here (while keeping initial xmin ones to preserve the
> intent of 818fefd8fd4 for those).

IIUC with the proposed patch, it's possible that we report the slot
invalidation once but don't actually invalidate the slot if slot's
restart_lsn gets advanced and becomes greater than the oldestLSN after
the report, is that right?

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


Reply via email to