On Wed, 2010-02-10 at 09:32 +0200, Heikki Linnakangas wrote:
> Fujii Masao wrote:
> > As I pointed out previously, the standby might restore a partially-filled
> > WAL file that is being archived by the primary, and cause a FATAL error.
> > And this happened in my box when I was testing the SR.
> >
> > sby [20088] FATAL: archive file "000000010000000000000087" has
> > wrong size: 14139392 instead of 16777216
> > sby [20076] LOG: startup process (PID 20088) exited with exit code 1
> > sby [20076] LOG: terminating any other active server processes
> > act [18164] LOG: received immediate shutdown request
> >
> > If the startup process is in standby mode, I think that it should retry
> > starting replication instead of emitting an error when it finds a
> > partially-filled file in the archive. Then if the replication has been
> > terminated, it has only to restore the archived file again. Thought?
>
> Hmm, so after running restore_command, check the file size and if it's
> too short, treat it the same as if restore_command returned non-zero?
> And it will be retried on the next iteration. Works for me, though OTOH
> it will then fail to complain about a genuinely WAL file that's
> truncated for some reason. I guess there's no way around that, even if
> you have a script as restore_command that does the file size check, it
> will have the same problem.

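For reference, the check Heikki describes above could be sketched roughly as follows. This is only an illustration in Python, not the server's C code: the names restore_with_size_check and restored_segment_ok are hypothetical, the restore argument stands in for whatever restore_command does, and the 16777216-byte segment size is taken from the FATAL message quoted above.

```python
import os

# Expected segment size, as reported in the quoted FATAL message.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # 16777216 bytes


def restored_segment_ok(path, expected_size=WAL_SEGMENT_SIZE):
    """Return True only if the file exists and has the full segment size."""
    try:
        return os.stat(path).st_size == expected_size
    except OSError:
        # A missing or unreadable file counts as a failed restore.
        return False


def restore_with_size_check(restore, path, expected_size=WAL_SEGMENT_SIZE):
    """Run a restore step and treat a short file like a non-zero exit status.

    `restore` is a hypothetical callable that copies the segment to `path`
    and returns an exit status (0 on success), standing in for
    restore_command. A False result means "retry on the next iteration".
    """
    if restore(path) != 0:
        return False
    return restored_segment_ok(path, expected_size)
```

As noted in the quoted text, a check like this cannot tell a file that is still being archived from one that is genuinely truncated; both simply lead to another retry.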
Are we trying to re-invent pg_standby here?

-- 
Simon Riggs www.2ndQuadrant.com