On 10.08.2020 23:20, Robert Haas wrote:
> On Sun, Aug 9, 2020 at 1:21 AM Michael Paquier <mich...@paquier.xyz> wrote:
>> Sorry for the late reply.  I have been looking at that stuff again,
>> and restore_command can be called in the context of a WAL sender
>> process within the page_read callback of logical decoding via
>> XLogReadDetermineTimeline(), as readTimeLineHistory() could look for a
>> timeline history file.  So restore_command is not used only in the
>> startup process.
>
> Hmm, interesting. But, does that make this change wrong, apart from
> the comments? Like, in the case of primary_conninfo, maybe some
> confusion could result if the startup process decided whether to ask
> for a WAL receiver based on thinking primary_conninfo being set, while
> that process thought that it wasn't actually set after all, as
> previously discussed in
> http://postgr.es/m/ca+tgmozvmjx1+qtww2tsnphrnkwkzxc3zsrynfb-fpzm1ox...@mail.gmail.com
> ... but what's the corresponding hazard here, exactly? It doesn't seem
> that there's any way in which the decision one process makes affects
> the decision the other process makes. There's still a race condition:
> it's possible for a walsender

Did you mean walreceiver here?

> to use the old restore_command after the
> startup process had already used the new one, or the other way around.
> However, it doesn't seem like that should confuse anything inside the
> server, and therefore I'm not sure we need to code around it.

I came up with the following scenario. Let's say we have xlog files 1, 2, 3 in dir1 and files 4, 5 in dir2. If the startup process has only handled files 1 and 2 before we switch restore_command from reading dir1 to reading dir2, it will fail to find the next file. IIUC, it will assume that recovery is done and start the server and the walreceiver. The walreceiver will fail as well. I don't know how realistic this case is, though.
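
To make the scenario concrete, here is a toy model in Python. It is only a sketch and not PostgreSQL code: simulate(), dir1, dir2 and switch_after are hypothetical names, and the archives are modelled as plain sets of segment numbers. It assumes that the startup process asks for segments strictly in order and stops at the first one the current restore_command cannot find.

```python
def simulate(dir1, dir2, switch_after):
    """Toy model of archive recovery with a restore_command switch.

    dir1, dir2   -- sets of segment numbers available in each archive
    switch_after -- number of segments restored before restore_command
                    is changed from reading dir1 to reading dir2
    Returns (restored segments, first segment that could not be found).
    """
    restored = []
    next_seg = 1
    while True:
        # Pick the archive that the current restore_command points to.
        source = dir1 if len(restored) < switch_after else dir2
        if next_seg not in source:
            # Assumption of this model: a missing segment ends archive recovery.
            return restored, next_seg
        restored.append(next_seg)
        next_seg += 1


# Switch after files 1 and 2: file 3 exists only in dir1, so it is never found.
print(simulate({1, 2, 3}, {4, 5}, switch_after=2))

# Switch after file 3: all five files are restored.
print(simulate({1, 2, 3}, {4, 5}, switch_after=3))
```

In this model the early switch stops after file 2, even though files 3, 4 and 5 all exist somewhere, which is the gap described above.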

In general, this feature looks useful and consistent with previous changes, so I am interested in pushing it forward. Sergey, could you please attach this thread to the upcoming CF if you're going to continue working on it?

A few more questions:
- RestoreArchivedFile() is also used by pg_rewind. I don't see any particular problem with it; I just want to remind us that we should test it too.
- How will it interact with possible future optimizations of archive restore? For example, WAL prefetch [1].

 [1] https://www.postgresql.org/message-id/flat/601ee1f5-0b78-47e1-9aae-c15f74a1c...@postgrespro.ru

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


