Hi,

On 2022-04-18 22:45:07 +1200, Thomas Munro wrote:
> On Mon, Apr 18, 2022 at 7:19 PM Michael Paquier <mich...@paquier.xyz> wrote:
> > On Sat, Apr 16, 2022 at 02:36:33PM -0700, Andres Freund wrote:
> > > which I haven't seen locally. Looks like we have some race between
> > > startup process and walreceiver? That seems not great. I'm a bit
> > > confused that walreceiver and archiving are both active at the same time
> > > in the first place - that doesn't seem right as things are set up
> > > currently.
> >
> > Yeah, that should be exclusively one or the other, never both.
> > WaitForWALToBecomeAvailable() would be a hot spot when it comes to
> > decide when a WAL receiver should be spawned by the startup process.
> > Except from the recent refactoring of xlog.c or the WAL prefetch work,
> > there has not been many changes in this area lately.
>
> Hmm, well I'm not sure what is happening here and will try to dig
> tomorrow, but one observation from some log scraping is that kestrel
> logged similar output with "could not link file" several times before
> the main prefetching commit (5dc0418). I looked back 3 months on
> kestrel/HEAD and found these:

Kestrel won't go that far back even - I set it up 23 days ago...

I'm formally on vacation till Thursday, I'll try to look at earlier instances
then. Unless it's already figured out :). I failed at reproducing it locally,
despite a fair bit of effort.

The BF really should break out individual tests into their own stage logs. The
recovery-check stage is 13MB and 150k lines by now.

Greetings,

Andres Freund