On Fri, Jul 25, 2025 at 7:13 PM Greg Sabino Mullane <htamf...@gmail.com> wrote:
>
> On Fri, Jul 25, 2025 at 9:57 AM Jon Zeppieri <zeppi...@gmail.com> wrote:
>>
>> Thanks for the response, Nick. I'm curious why the situation you describe
>> wouldn't also lead to the write_lag and flush_lag also being
>> high. If the problem is simply keeping up with the primary, wouldn't you
>> expect all three lag times to be elevated?
>
> No - write and flush are pretty quick and simple, it's just putting the WAL
> onto the local disk. Replay involves a lot more work as we have to parse the
> WAL and apply the changes, which means doing a lot of I/O across many files.
> Still, *hours* to me indicates more than just a lot of extra traffic. Check
> that recovery_min_apply_delay is still 0, then log onto the replica and see
> what's going on with regards to open transactions and locks.

Thanks, Greg. I just checked, and `recovery_min_apply_delay` is 0.

I didn't mention in my initial post that the cause of the delay seemed to be long-running queries on the replica rather than on the primary. I could be wrong, of course, but I was able to get the replica moving again by killing off old queries on the replica. If those queries were the problem, though, I don't understand why `max_standby_streaming_delay` didn't prevent that situation.

- Jon
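
For anyone following along, Greg's distinction between the three lag columns can be written down as a small check. This is only a sketch: `classify_lag` and the one-minute threshold are hypothetical names and values chosen for illustration, not anything PostgreSQL provides. The three arguments are assumed to be the `write_lag`, `flush_lag`, and `replay_lag` values read from `pg_stat_replication` on the primary, passed in as `timedelta` objects (or `None` when the column is NULL).

```python
from datetime import timedelta
from typing import Optional

# Assumption: what counts as "elevated" lag is site-specific; one minute is arbitrary.
ELEVATED = timedelta(minutes=1)


def classify_lag(
    write_lag: Optional[timedelta],
    flush_lag: Optional[timedelta],
    replay_lag: Optional[timedelta],
    threshold: timedelta = ELEVATED,
) -> str:
    """Roughly classify where a standby is falling behind.

    Hypothetical helper. Returns "receive" if WAL is slow to be written or
    flushed on the standby, "replay" if only applying the WAL is slow, and
    "ok" otherwise.
    """

    def high(lag: Optional[timedelta]) -> bool:
        # A NULL (None) lag value is treated as "not elevated".
        return lag is not None and lag > threshold

    if high(write_lag) or high(flush_lag):
        return "receive"
    if high(replay_lag):
        return "replay"
    return "ok"


if __name__ == "__main__":
    # Sub-second write/flush lag with hours of replay lag, as in this thread.
    print(classify_lag(timedelta(milliseconds=200),
                       timedelta(milliseconds=300),
                       timedelta(hours=3)))
```

A "replay" result matches the pattern discussed here: the WAL reaches the replica's disk promptly, so the place to look is whatever is holding up apply on the replica, such as open transactions and locks.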