At Tue, 16 Jun 2020 14:31:43 -0400, Alvaro Herrera <alvhe...@2ndquadrant.com> 
wrote in 
> On 2020-Jun-16, Kyotaro Horiguchi wrote:
> 
> > I noticed the another issue. If some required WALs are removed, the
> > slot will be "invalidated", that is, restart_lsn is set to invalid
> > value. As the result we hardly see the "lost" state.
> > 
> > It can be "fixed" by remembering the validity of a slot separately
> > from restart_lsn. Is that worth doing?
> 
> We discussed this before.  I agree it would be better to do this
> in some way, but I fear that if we do it naively, some code might exist
> that reads the LSN without realizing that it needs to check the validity
> flag first.

Yes, that was my main concern about it. That's error-prone. How about
remembering the LSN at which the invalidation happened?  That is safe,
since nothing other than the slot-monitoring functions would look at
last_invalidated_lsn. It can be reset if active_pid is a valid pid.

InvalidateObsoleteReplicationSlots:
 ...
                SpinLockAcquire(&s->mutex);
+               s->data.last_invalidated_lsn = s->data.restart_lsn;
                s->data.restart_lsn = InvalidXLogRecPtr;
                SpinLockRelease(&s->mutex);
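
To make the idea concrete, here is a minimal standalone sketch, not
PostgreSQL code. SlotModel, SlotWalState, slot_invalidate and
slot_wal_state are hypothetical names used only for illustration, the
typedefs are stand-ins for the real XLogRecPtr definitions, and locking
is left out. It only shows how a monitoring function could tell a "lost"
slot from one that never reserved WAL, without any other reader of
restart_lsn having to check a separate validity flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's XLogRecPtr / InvalidXLogRecPtr (assumed here). */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, heavily simplified slot: the real slot has many more
 * fields and is protected by a spinlock. */
typedef struct SlotModel
{
    XLogRecPtr restart_lsn;          /* oldest WAL position still needed */
    XLogRecPtr last_invalidated_lsn; /* proposed: restart_lsn at invalidation */
} SlotModel;

/* Hypothetical states a monitoring function could report. */
typedef enum SlotWalState
{
    SLOT_WAL_UNRESERVED, /* slot never held a valid restart_lsn */
    SLOT_WAL_AVAILABLE,  /* restart_lsn is valid */
    SLOT_WAL_LOST        /* slot was invalidated; required WAL is gone */
} SlotWalState;

/* Mirrors the fragment above: remember the old position, then invalidate. */
void
slot_invalidate(SlotModel *slot)
{
    assert(slot != NULL);
    slot->last_invalidated_lsn = slot->restart_lsn;
    slot->restart_lsn = InvalidXLogRecPtr;
}

/* Only monitoring code consults last_invalidated_lsn; every other reader
 * keeps seeing an invalid restart_lsn, as it does today. */
SlotWalState
slot_wal_state(const SlotModel *slot)
{
    assert(slot != NULL);
    if (slot->restart_lsn != InvalidXLogRecPtr)
        return SLOT_WAL_AVAILABLE;
    if (slot->last_invalidated_lsn != InvalidXLogRecPtr)
        return SLOT_WAL_LOST;
    return SLOT_WAL_UNRESERVED;
}
```

With this model, a slot whose restart_lsn was 0x1000 reports
SLOT_WAL_AVAILABLE before slot_invalidate() and SLOT_WAL_LOST after it,
while a slot that never reserved WAL reports SLOT_WAL_UNRESERVED.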

> On the other hand, maybe this is not a problem in practice, because if
> such a bug occurs, what will happen is that trying to read WAL from such
> a slot will return the error message that the WAL file cannot be found.
> Maybe this is acceptable?

I'm not sure.  For my part, the problem with that is that we would need
to look into the server logs to know what is actually going on.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

