Hi Bertrand,

On Sat, Jun 6, 2026 at 7:07 PM Bertrand Drouvot
<[email protected]> wrote:
>
> Hi Alexander,
>
> On Sat, Jun 06, 2026 at 12:00:00PM +0300, Alexander Lakhin wrote:
> > Hello hackers,
> >
> > That is, walsender requested WAL segment for timeline 1, while in a
> > successful run, it reads WAL for timeline 2.
> >
> > I've managed to reproduce this failure with:
>
> Thanks for the report and the repro!
>
> > As far as I can see, the timeline is chosen in logical_read_xlog_page()
> > depending on the recovery state:
> >         am_cascading_walsender = RecoveryInProgress();
> >
> >         if (am_cascading_walsender)
> >                 GetXLogReplayRecPtr(&currTLI);
> >         else
> >                 currTLI = GetWALInsertionTimeLine();
>
> Yeah, it looks like there is a race condition here. I think we should check if
> the insertion timeline has already been set (like the walsummarizer is doing).
>
> I'll work on a fix early next week.

This looks like the right direction for a fix. We may want to apply
similar logic to read_local_xlog_page_guts() as well. Although the
failure was reported in walsender, SQL-level logical decoding goes
through the local WAL reader and uses the same recovery/TLI selection
pattern.
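
To make the suspected window concrete, here is a toy model in plain C.
It is not PostgreSQL code: ToyXLogState and the toy_* functions are
hypothetical stand-ins, and I am assuming that insert_tli == 0 means
"insertion timeline not set yet" and that the race is the window where
recovery is still reported as in progress although the insertion
timeline has already been set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TimeLineID;

/* Toy stand-in for the shared state; not PostgreSQL's real structures. */
typedef struct ToyXLogState
{
    bool        recovery_in_progress;  /* what a RecoveryInProgress()-like check reports */
    TimeLineID  replay_tli;            /* timeline of the last replayed record */
    TimeLineID  insert_tli;            /* assumed to be 0 until the insertion timeline is set */
} ToyXLogState;

/* Pattern quoted above: trust the recovery flag alone. */
static TimeLineID
toy_choose_tli_current(const ToyXLogState *st)
{
    if (st->recovery_in_progress)
        return st->replay_tli;
    return st->insert_tli;
}

/* Suggested direction: prefer the insertion timeline once it has been set. */
static TimeLineID
toy_choose_tli_checked(const ToyXLogState *st)
{
    if (st->insert_tli != 0)
        return st->insert_tli;
    return st->replay_tli;
}

/* Walks through the assumed promotion window and the two steady states. */
static void
toy_selfcheck(void)
{
    /* Promotion window: flag still says "in recovery", insertion TLI is already 2. */
    ToyXLogState promoting = { true, 1, 2 };
    /* Plain standby: insertion timeline not set. */
    ToyXLogState standby = { true, 1, 0 };
    /* Promoted primary. */
    ToyXLogState primary = { false, 2, 2 };

    assert(toy_choose_tli_current(&promoting) == 1);   /* stale timeline */
    assert(toy_choose_tli_checked(&promoting) == 2);   /* new timeline */
    assert(toy_choose_tli_checked(&standby) == 1);
    assert(toy_choose_tli_current(&primary) == 2);
    assert(toy_choose_tli_checked(&primary) == 2);
}
```

If that assumption about the window holds, the same check would apply
to both callers, since both pick the TLI from the recovery state.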

--
Regards,
Xuneng Zhou
HighGo Software Co., Ltd.

