Re: pg_stat_replication.*_lag sometimes shows NULL during active replication

Shinya Kato Sun, 15 Mar 2026 17:27:11 -0700

On Fri, Mar 13, 2026 at 12:27 AM Fujii Masao <[email protected]> wrote:


> Thanks for testing and for the clarification! You're right.
>
> However, if we apply this change, the time required for the lag information to
> be reset would effectively double. I start wondering if that's really
> acceptable, especially for back branches. Although the docs doesn't clearly
> specify this timing, doubling it could affect systems that monitor
> replication lag, for example. It might still be reasonable to apply
> such a change in master, though.

Yes, I agree. Doubling the lag reset time should be avoided in back
branches if possible.

> On further thought, the root cause seems to be that walreceiver can send
> two consecutive status reply messages with identical WAL locations even
> when wal_receiver_status_interval has not yet elapsed. Addressing that
> behavior directly might resolve the issue you reported. I've attached a PoC
> patch that does this. Thought?

Thank you for the v4 patch. I think this approach is better than mine.
I tested the patch and confirmed that the issue no longer reproduces
with physical replication. However, with logical replication, the lag
columns in pg_stat_replication still show NULL periodically at
wal_receiver_status_interval, since send_feedback() in worker.c can
still send duplicate positions.

+ * previsou update, i.e., when 'replyApply' is true.

One minor thing: there is a typo "previsou". It should be "previous".


-- 
Best regards,
Shinya Kato
NTT OSS Center

Re: pg_stat_replication.*_lag sometimes shows NULL during active replication

Reply via email to