Simon Riggs wrote:
On Thu, 2009-01-29 at 12:22 +0200, Heikki Linnakangas wrote:
It
comes from the fact that we set minSafeStartPoint beyond the actual end
of WAL, if the last WAL segment is only partially filled (= fails CRC
check at some point). If we crash after setting minSafeStartPoint like
that, and then restart recovery, we'll get the error.
Look again please. My proposal would avoid the error when it is not
relevant, yet keep it when it is (while recovering base backups).
I fail to see what base backups have to do with this. The problem arises
in this scenario:
0. A base backup is unzipped. recovery.conf is copied in place, and the
remaining unarchived WAL segments are copied from the primary server to
pg_xlog. The last WAL segment is only partially filled. Let's say that
redo point is in WAL segment 1. The last, partial, WAL segment is 3, and
WAL ends at 0/3500000
1. postmaster is started, recovery starts.
2. WAL segment 1 is restored from archive.
3. We reach consistent recovery point
4. We restore WAL segment 2 from archive. minSafeStartPoint is advanced
to 0/3000000
5. WAL segment 2 is completely replayed, we move on to WAL segment 3. It
is not in archive, but it's found in pg_xlog. minSafeStartPoint is
advanced to 0/4000000. Note that that's beyond end of WAL.
6. At replay of WAL record 0/3200000, the recovery is interrupted. For
example, by a fast shutdown request, or crash.
Now when we restart the recovery, we will never reach minSafeStartPoint,
which is now 0/4000000, and we'll fail with the error that Fujii-san
pointed out. We're already way past the min recovery point of base
backup by then.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers