Hi Marco,

On Thu, Jan 29, 2026 at 1:03 AM Marco Nenciarini <[email protected]> wrote:
>
> Hi hackers,
>
> I've encountered a bug in PostgreSQL's streaming replication where cascading
> standbys fail to reconnect after falling back to archive recovery. The issue
> occurs when the upstream standby uses archive-only recovery.
>
> The standby requests streaming from the wrong WAL position (next segment
> boundary instead of the current position), causing connection failures with
> this error:
>
> ERROR: requested starting point 0/A000000 is ahead of the WAL flush
> position of this server 0/9000000
>
> Attached are two shell scripts that reliably reproduce the issue on PostgreSQL
> 17.x and 18.x:
>
> 1. reproducer_restart_upstream_portable.sh - triggers by restarting upstream
> 2. reproducer_cascade_restart_portable.sh - triggers by restarting the cascade
>
> The scripts set up this topology:
> - Primary with archiving enabled
> - Standby using only archive recovery (no streaming from primary)
> - Cascading standby streaming from the archive-only standby
>
> When the cascade loses its streaming connection and falls back to archive
> recovery, it cannot reconnect. The issue appears to be in xlogrecovery.c
> around line 3880, where the position passed to RequestXLogStreaming()
> determines which segment boundary is requested.
>
> The cascade restart reproducer shows that even restarting the cascade itself
> triggers the bug, which affects routine maintenance operations.
>
> Scripts require PostgreSQL binaries in PATH and use ports 15432-15434.
>
> Best regards,
> Marco
>
Thanks for your report. I can reliably reproduce the issue on HEAD using your
scripts. I’ve analyzed the problem and am proposing a patch to fix it.

---

Analysis

When a cascading standby streams from an archive-only upstream:

1. The upstream's GetStandbyFlushRecPtr() returns only the replay position
   (there is no received-but-not-replayed WAL, since there is no walreceiver).
2. When streaming ends and the cascade falls back to archive recovery, it can
   restore WAL segments through its own archive access.
3. The cascade's read position (RecPtr) advances beyond what the upstream has
   replayed.
4. On reconnect, the cascade requests streaming from RecPtr, which the
   upstream rejects as "ahead of the WAL flush position".

---

Proposed Fix

Track the last confirmed flush position from streaming (lastStreamedFlush) and
clamp the streaming start request when it exceeds that position:

- Same timeline: clamp to lastStreamedFlush if RecPtr > lastStreamedFlush
- Timeline switch: fall back to the timeline switchpoint as a safe boundary

This ensures the cascade requests streaming from a position the upstream
definitely has, rather than assuming the upstream can serve whatever the
cascade restored locally from the archive.

I’m not a fan of using sleep in TAP tests, but I haven’t found a better way to
reproduce this behavior yet.

--
Best,
Xuneng
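
For readers following along without the patch, the clamping rule described under "Proposed Fix" can be sketched as a small standalone C function. This is only an illustration of the rule as stated in this mail, not the code in the attached patch. The function name choose_stream_start, its parameters, and the local typedefs are hypothetical stand-ins. Two behaviours are assumptions that the mail does not specify: what happens before any flush position has been recorded, and leaving a request that is already at or before the switchpoint unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the PostgreSQL types, for illustration only. */
typedef uint64_t XLogRecPtr;
typedef uint32_t TimeLineID;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * Hypothetical helper: pick the position to request when (re)starting
 * streaming.
 *
 * recptr              - position the standby would otherwise request
 * recptr_tli          - timeline of that position
 * last_streamed_flush - last flush position confirmed while streaming
 * streamed_tli        - timeline that flush position belongs to
 * tli_switchpoint     - switchpoint to recptr_tli (used only on a
 *                       timeline switch)
 */
XLogRecPtr
choose_stream_start(XLogRecPtr recptr, TimeLineID recptr_tli,
                    XLogRecPtr last_streamed_flush, TimeLineID streamed_tli,
                    XLogRecPtr tli_switchpoint)
{
    /* ASSUMPTION: with no recorded flush position, leave the request as is. */
    if (last_streamed_flush == InvalidXLogRecPtr)
        return recptr;

    /* Same timeline: never ask for more than the upstream confirmed. */
    if (recptr_tli == streamed_tli)
        return (recptr > last_streamed_flush) ? last_streamed_flush : recptr;

    /*
     * Timeline switch: fall back to the switchpoint as the safe boundary.
     * ASSUMPTION: a request at or before the switchpoint is left unchanged.
     */
    assert(tli_switchpoint != InvalidXLogRecPtr);
    return (recptr > tli_switchpoint) ? tli_switchpoint : recptr;
}
```

With the values from the reported error, a request at 0/A000000 against a last confirmed flush of 0/9000000 on the same timeline would be clamped to 0/9000000, which the upstream can serve.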
v1-0001-Fix-cascading-standby-reconnect-failure-after-arc.patch
