Simon Riggs wrote:
> On Thu, 2009-01-29 at 12:22 +0200, Heikki Linnakangas wrote:
>> It comes from the fact that we set minSafeStartPoint beyond the actual end of WAL, if the last WAL segment is only partially filled (= fails CRC check at some point). If we crash after setting minSafeStartPoint like that, and then restart recovery, we'll get the error.
>
> Look again please. My proposal would avoid the error when it is not
> relevant, yet keep it when it is (while recovering base backups).

I fail to see what base backups have to do with this. The problem arises in this scenario:

0. A base backup is unzipped. recovery.conf is copied in place, and the remaining unarchived WAL segments are copied from the primary server to pg_xlog. The last WAL segment is only partially filled. Let's say that the redo point is in WAL segment 1. The last, partial, WAL segment is 3, and WAL ends at 0/3500000.
1. postmaster is started, recovery starts.
2. WAL segment 1 is restored from archive.
3. We reach the consistent recovery point.
4. We restore WAL segment 2 from archive. minSafeStartPoint is advanced to 0/3000000.
5. WAL segment 2 is completely replayed, we move on to WAL segment 3. It is not in archive, but it's found in pg_xlog. minSafeStartPoint is advanced to 0/4000000. Note that that's beyond end of WAL.
6. At replay of WAL record 0/3200000, the recovery is interrupted. For example, by a fast shutdown request, or crash.

Now when we restart the recovery, we will never reach minSafeStartPoint, which is now 0/4000000, and we'll fail with the error that Fujii-san pointed out. We're already way past the min recovery point of base backup by then.
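To make the sequence above concrete, here is a toy model of it in Python. It is only a sketch, not PostgreSQL code: the function names (recover, fmt_lsn), the 16 MB segment size, and the rule that the marker is bumped to the segment boundary whenever a segment is opened are simplifying assumptions made for illustration.

```python
SEG_SIZE = 0x1000000  # assumed 16 MB WAL segment size


def fmt_lsn(pos):
    """Format a byte position as an X/X style WAL location."""
    return f"{pos >> 32:X}/{pos & 0xFFFFFFFF:X}"


def recover(first_seg, end_of_wal, min_safe_start_point=0, interrupt_at=None):
    """Toy replay loop (hypothetical, not PostgreSQL code).

    Returns (status, min_safe_start_point).
    """
    segno = first_seg
    while segno * SEG_SIZE < end_of_wal:
        seg_start = segno * SEG_SIZE
        seg_end = seg_start + SEG_SIZE
        # Assumption: opening a segment advances the marker to the segment
        # boundary, even if the segment is only partially filled.
        min_safe_start_point = max(min_safe_start_point, seg_end)
        replay_end = min(seg_end, end_of_wal)
        if interrupt_at is not None and seg_start <= interrupt_at < replay_end:
            # Crash or fast shutdown in the middle of this segment.
            return "interrupted", min_safe_start_point
        segno += 1
    if end_of_wal < min_safe_start_point:
        # Replay ran out of WAL before reaching the recorded marker.
        return "error: end of WAL before minSafeStartPoint", min_safe_start_point
    return "ok", min_safe_start_point


# First run: WAL ends at 0/3500000, recovery is interrupted at 0/3200000.
status, marker = recover(1, 0x3500000, interrupt_at=0x3200000)
print(status, fmt_lsn(marker))

# Restart: the marker survives the crash, but WAL still ends at 0/3500000.
status, marker = recover(1, 0x3500000, min_safe_start_point=marker)
print(status, fmt_lsn(marker))
```

In this model the first run leaves the marker at 0/4000000, and the restarted run can never reach it because the WAL ends at 0/3500000. If segment 3 were completely filled (WAL ending at 0/4000000), the same restart would succeed.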

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
