On Fri, Mar 6, 2026 at 4:13 PM Shinya Kato <[email protected]> wrote:
>
> On Mon, Mar 2, 2026 at 11:44 PM Fujii Masao <[email protected]> wrote:
> > With the patch applied, I set up a logical replication and inserted a row 
> > every
> > second. Even with continuous inserts, NULL was shown in the lag columns of
> > pg_stat_replication. That makes me wonder whether the patch's approach is
> > sufficient to address the issue.
>
> Thank you for the review and testing! I had only considered the issue
> in the context of physical replication, but as you pointed out, my
> approach is insufficient for logical replication.
>
> > Relying solely on replies from the standby or subscriber seems a bit 
> > fragile to
> > me. If the goal is to keep showing the last measured lag for some time,
> > perhaps we should introduce a rate limit on when NULL is displayed in the 
> > lag
> > columns?
>
> My primary goal was to ensure that the source code comments match the
> actual behavior, as the comment stating "the second such message must
> result from wal_receiver_status_interval expiring on the standby" is
> inaccurate. However, as you noted, the patch alone is not sufficient
> to fully address the issue.
>
> > For example, if there has been no activity (i.e., sentPtr == applyPtr and
> > applyPtr has not changed since the previous cycle) for, say, 10 seconds,
> > then we could allow NULL to be shown. Thought?
>
> I considered a time-based rate limit, but it is difficult to choose an
> appropriate threshold. Furthermore, the walsender has no way of
> knowing the standby's or subscriber's wal_receiver_status_interval
> setting.
>
> The attached v2 patch takes a different approach: it additionally
> requires that all reported positions (write/flush/apply) remain
> unchanged from the previous reply. This directly detects a truly idle
> system without relying on timeouts—if any position has advanced, new
> WAL activity must have occurred, so we should not clear the lag values
> even if the lag tracker is empty.

This approach looks good to me.

One comment: currently, the lag becomes NULL basically after about one
wal_receiver_status_interval during periods of no activity. OTOH, with this
approach, it seems it would take about twice wal_receiver_status_interval.
Is this understanding correct?

Regards,

-- 
Fujii Masao


Reply via email to