Simon Riggs wrote:
Falling back to the secondary checkpoint implies we have a corrupted or
absent WAL file, so making recovery startup work correctly won't avoid
the need to re-run the base backup. We'll end with an unrecoverable
error in either case, so it doesn't seem worth attempting to improve
this in the way you suggest.

That's true whenever you have to fall back to a secondary checkpoint, but we still try to get the database up. One could argue that we shouldn't, of course.

Anyway, the point is that the patch relies on a non-obvious assumption. Even if the secondary checkpoint issue is a non-issue, it's not obvious (to me at least) that there isn't other similar scenarios. And someone might inadvertently break the assumption in a future patch, because it's not an obvious one; calling ReadRecord looks very innocent. We shouldn't introduce an assumption like that when we don't have to.

I think we should completely prevent access to secondary checkpoints
during archive recovery, because if the primary checkpoint isn't present
or is corrupt we aren't ever going to get passed it to get to the
pg_stop_backup() point. Hence an archive recovery can never be valid in
that case. I'll do a separate patch for that because they are unrelated
issues.

Well, we already don't use the secondary checkpoint if a backup label file is present. And you can take a base backup without pg_start_backup()/pg_stop_backup() if you shut down the system first (a cold backup).

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches

Reply via email to