On Thu, Mar 21, 2024 at 11:23 AM Bertrand Drouvot
<bertranddrouvot...@gmail.com> wrote:
>
> On Thu, Mar 21, 2024 at 08:47:18AM +0530, Amit Kapila wrote:
> > On Wed, Mar 20, 2024 at 1:51 PM Bertrand Drouvot
> > <bertranddrouvot...@gmail.com> wrote:
> > >
> > > On Wed, Mar 20, 2024 at 12:48:55AM +0530, Bharath Rupireddy wrote:
> > > >
> > > > 2. last_inactive_at and inactive_timeout are now tracked in on-disk
> > > > replication slot data structure.
> > >
> > > Should last_inactive_at be tracked on disk? Say the engine is down for a
> > > period of time > inactive_timeout then the slot will be invalidated after
> > > the engine re-start (if no activity before we invalidate the slot). Should
> > > the time the engine is down be counted as "inactive" time? I've the feeling
> > > it should not, and that we should only take into account inactive time
> > > while the engine is up.
> > >
> >
> > Good point. The question is how do we achieve this without persisting
> > the 'last_inactive_at'? Say, 'last_inactive_at' for a particular slot
> > had some valid value before we shut down but it still didn't cross the
> > configured 'inactive_timeout' value, so, we won't be able to
> > invalidate it. Now, after the restart, as we don't know the
> > last_inactive_at's value before the shutdown, we will initialize it
> > with 0 (this is what Bharath seems to have done in the latest
> > v13-0002* patch). After this, even if walsender or backend never
> > acquires the slot, we won't invalidate it. OTOH, if we track
> > 'last_inactive_at' on the disk, after restart, we could initialize it
> > to the current time if the value is non-zero. Do you have any better
> > ideas?
> >
>
> I think that setting last_inactive_at when we restart makes sense if the slot
> has been active previously. I think the idea is because it's holding
> xmin/catalog_xmin and that we don't want to prevent row removal for longer
> than the timeout.
>
> So what about relying on xmin/catalog_xmin instead that way?
>

That doesn't sound like a great idea because xmin/catalog_xmin values
won't tell us whether the slot was active before the restart. It could
have been inactive for a long time before the restart while its xmin
values are still valid. How about we always set 'last_inactive_at' at
restart (if the slot's inactive_timeout has a non-zero value) and reset
it as soon as someone acquires that slot? Then, if the slot doesn't get
acquired within 'inactive_timeout', the checkpointer will invalidate the
slot.
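
To make the proposal above concrete, here is a minimal, standalone sketch
in C. It is not code from any version of the patch: the SketchSlot struct,
the sketch_* function names, the use of plain seconds for timestamps, and
the ">=" comparison at the timeout boundary are all assumptions made for
illustration. It only models the three steps: set 'last_inactive_at' at
restart when inactive_timeout is non-zero, reset it on acquire, and let a
checkpoint-time check invalidate the slot once the timeout has elapsed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a replication slot. */
typedef struct SketchSlot
{
    int64_t last_inactive_at;   /* seconds; 0 means "not set" (assumption) */
    int64_t inactive_timeout;   /* seconds; 0 disables the timeout */
    bool    invalidated;
} SketchSlot;

/*
 * At restart: start the inactivity clock at the current time, so the time
 * the server was down is not counted. Done only if a timeout is configured.
 */
static void
sketch_slot_restore(SketchSlot *slot, int64_t now)
{
    assert(slot != NULL);
    slot->last_inactive_at = (slot->inactive_timeout > 0) ? now : 0;
}

/* On acquire: the slot is in use, so clear the inactivity timestamp. */
static void
sketch_slot_acquire(SketchSlot *slot)
{
    assert(slot != NULL);
    slot->last_inactive_at = 0;
}

/*
 * Checkpoint-time check: invalidate the slot if it has stayed unacquired
 * for at least inactive_timeout seconds. Returns the invalidated state.
 */
static bool
sketch_checkpoint_check(SketchSlot *slot, int64_t now)
{
    assert(slot != NULL);
    if (!slot->invalidated &&
        slot->inactive_timeout > 0 &&
        slot->last_inactive_at > 0 &&
        now - slot->last_inactive_at >= slot->inactive_timeout)
        slot->invalidated = true;
    return slot->invalidated;
}
```

With this shape, a slot that nobody acquires after the restart is
invalidated once inactive_timeout has passed since the restart, while a
slot that gets acquired in time is left alone.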

-- 
With Regards,
Amit Kapila.

