On Fri, 12 Sep 2025 at 3:37, Michael Paquier <[email protected]> wrote:

> Okay, the bit about the cascading standby is a useful piece of
> information.  Do you have some data about the relation reported in the
> error message this is choking on based on its OID?  Is this actively
> used in read-only workloads, with the relation looked at in the
> cascading standby?


The object with objoid=767325170 does not exist, and neither does the one
reported at the previous shutdown (objoid=4169049057). So I guess it is
something quasi-temporary that was dropped afterwards.


>   Is hot_standby_feedback enabled in the cascading
> standby?


Yes, hot_standby_feedback = on.


> With which process has this cascading standby been created?
> Does the workload of the primary involve a high consumption of OIDs
> for relations, say many temporary tables?
>

Yes, we have around 150 entries added and deleted per second in pg_class,
and around 800 in pg_attribute. So something is actively creating and
dropping tables all the time.


>
> Another thing that may help is the WAL record history.  Are you for
> example seeing attempts to drop twice the same pgstats entry in WAL
> records?  Perhaps the origin of the problem is in this area.  A
> refcount of 2 is relevant, of course.
>

How could we dig into this, i.e. how do we inspect the WAL records for
such attempts?


>
> I have looked a bit around but nothing has popped up here, so as far
> as I know you seem to be the only one impacted by that.
>
> 1d6a03ea4146 and dc5f9054186a are in 17.3, so perhaps something is
> still off with the drop when applied to cascading standbys.  A vital
> piece of information may also be with "generation", which would show
> up in the error report thanks to bdda6ba30cbe, and that's included in
> 17.6.  A first thing would be to update to 17.6 and see how things
> go for these cascading setups.  If it takes a couple of weeks to have
> one report, we have a hunt that may take a few months at least, except
> if somebody is able to find out the race condition here, me or someone
> else.
>
>
Is it enough to upgrade the replicas, or do we need to upgrade the primary
as well?

--
Kouber
