Hi,

On Tue, Apr 14, 2009 at 6:35 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> On Mon, 2009-04-13 at 14:52 +0900, Fujii Masao wrote:
>
>> A lookahead (the +1) may have pg_standby get stuck as follows.
>> Am I missing something?
>>
>> 1. the trigger file containing "smart" is created.
>> 2. pg_standby is executed.
>>   2-1. nextWALfile is restored.
>>   2-2. the trigger file is deleted because nextWALfile+1 doesn't exist.
>> 3. the restored nextWALfile is applied.
>> 4. pg_standby is executed again to restore nextWALfile+1.
>
> This can't happen. (4) will never occur when (2-2) has occurred. A
> non-zero error code means file not available which will cause recovery
> to end and hence no requests for further WAL files are made.

When pg_standby exits with a non-zero code, (3) and (4) will never occur,
and the transactions in nextWALfile will be lost. So, in (2-2), pg_standby
has to call exit(0), I think. On the other hand, if exit(0) is called in
(2-2), the above scenario happens.

> It does *seem* as if there is a race condition there in that another WAL
> file may arrive after we have taken the decision there are no more WAL
> files, but it's not a problem. That could happen if we issue the trigger
> while the master is still up, which is a mistake - why would we do that?
> If we only issue the trigger once we are happy the master is down then
> we don't get a problem.

Yeah, I agree that such a race condition is not a problem. The trigger
file has to be created after all the WAL files have arrived at the
standby server.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
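
To make the two cases in (2-2) concrete, here is a toy Python model of the scenario discussed above. It is only a sketch under my own assumptions, not pg_standby's actual code: the names `restore`, `recover`, and `exit_zero_on_last` are hypothetical, WAL files are modelled as plain integers, and "waits forever" is represented by returning None.

```python
# Toy model of the smart-trigger dilemma; NOT pg_standby's actual code.
# All names here are hypothetical and exist only for illustration.

def restore(requested, archive, trigger, exit_zero_on_last):
    """Stand-in for one pg_standby invocation.

    Returns (exit_code, trigger_still_present), or None to mean
    "waits forever" (file absent and no trigger left to end the wait).
    """
    if requested not in archive:
        if not trigger:
            return None          # stuck: nothing will ever end the wait
        return (1, False)        # trigger seen, file missing: end recovery
    if trigger and (requested + 1) not in archive:
        # Step 2-2: the lookahead finds no next file, so the trigger
        # file is deleted. The open question is the exit code.
        return (0 if exit_zero_on_last else 1, False)
    return (0, trigger)


def recover(archive, exit_zero_on_last):
    """Replay loop: request files in order until restore reports failure."""
    applied, trigger, seg = [], True, 1
    while True:
        result = restore(seg, archive, trigger, exit_zero_on_last)
        if result is None:
            return applied, "stuck"
        code, trigger = result
        if code != 0:
            return applied, "recovery ended"
        applied.append(seg)      # step 3: the restored file is applied
        seg += 1                 # step 4: ask for the next file
```

With an archive holding files 1 and 2, a non-zero exit in (2-2) ends recovery without applying file 2, while exit(0) applies file 2 but leaves the next invocation waiting with no trigger file to stop it.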