Hello

Sorry for the late response.

>>  > ... but what's the corresponding hazard here, exactly? It doesn't seem
>>  > that there's any way in which the decision one process makes affects
>>  > the decision the other process makes. There's still a race condition:
>>  > it's possible for a walsender
>>  Did you mean walreceiver here?
>
> It's logical walsender. restore_command is used within
> logical_read_xlog_page() via XLogReadDetermineTimeline().

I still have no idea what the corresponding hazard is here.

>>  > to use the old restore_command after the
>>  > startup process had already used the new one, or the other way around.
>>  > However, it doesn't seem like that should confuse anything inside the
>>  > server, and therefore I'm not sure we need to code around it.
>>  I came up with following scenario. Let's say we have xlog files 1,2,3
>>  in dir1 and files 4,5 in dir2. If startup process had only handled
>>  files 1 and 2, before we switched restore_command from reading dir1 to
>>  reading dir2, it will fail to find next file. IIUC, it will assume
>>  that recovery is done, start server and walreceiver. The walreceiver
>>  will fail as well. I don't know, how realistic is this case, though.
>
> That operation is somewhat bogus, if the server is not in standby
> mode. In standby mode, startup waits for the next segment safely.

I think it's pilot error. It is already possible to change what
restore_command does, without touching the setting itself, by wrapping the
real command in a script:

> restore_command = '/bin/restore_wal.sh "%f" "%p"'

And one can simply replace this file with something else that has different
logic. Or even use a command that has its own separate settings. A real-world
example (https://github.com/wal-g/wal-g):

> restore_command = '. /etc/wal-g/WALG_AWS_ENV; wal-g wal-fetch "%f" "%p"'

And it is possible to change the real WAL source in the ENV script without
changing the restore_command. We can't track this, so I don't see any new
issues here.
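
To illustrate the wrapper case, here is a minimal sketch of what such a
script might look like. It is not taken from any real setup: WAL_ENV_FILE,
WAL_SOURCE_DIR and the default env file path are names I made up for the
example.

```shell
#!/bin/sh
# Hypothetical wrapper in the spirit of /bin/restore_wal.sh "%f" "%p".
# WAL_ENV_FILE, WAL_SOURCE_DIR and /etc/restore_wal.env are assumptions
# made for this sketch only.

restore_wal() {
    wal_file=$1      # %f: name of the WAL file to fetch
    target_path=$2   # %p: path where the server wants the file placed

    # Re-read the settings on every call, so editing the env file changes
    # the WAL source while restore_command itself stays the same.
    . "${WAL_ENV_FILE:-/etc/restore_wal.env}"

    # A non-zero exit status reports that the file is not available.
    [ -f "$WAL_SOURCE_DIR/$wal_file" ] || return 1
    cp "$WAL_SOURCE_DIR/$wal_file" "$target_path"
}

# Act as the restore command only when called with both arguments.
if [ "$#" -eq 2 ]; then
    restore_wal "$1" "$2"
fi
```

Pointing WAL_SOURCE_DIR at another directory between two calls changes where
the files come from, and the server has no way to notice that.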

>>  Sergey, could you please attach this thread to the upcoming CF, if
>>  you're going to continue working on it.

Sure, I created one: https://commitfest.postgresql.org/30/2802/

>>  - How will it interact with possible future optimizations of archive
>>  - restore? For example, WAL prefetch [1].

Shouldn't we ask the author of that patch rather than me? In particular, does
that patch rely on restore_command not being changed? Probably some form of
synchronisation would be necessary in the infrastructure for executing
restore commands in parallel. Handling of restore_command being turned
on / off will most likely be required. I did not review that patch.

regards, Sergei

