On 18.10.2010 01:48, Jeff Davis wrote:
> On Fri, 2010-10-15 at 15:58 -0700, Jeff Davis wrote:
>> I don't have a fix yet, because I think it requires a little discussion.
>> For instance, it seems to be dangerous to assume that we're starting up
>> from a backup with access to the archive when it might have been a crash
>> of the primary system. This is obviously wrong in the case of an
>> automatic restart, or one with no restore_command. Fixing this issue
>> might also remove the annoying "If you are not restoring from a backup,
>> try removing..." PANIC error message.
>>
>> Also, in general we should do more logging during recovery, at least the
>> first stages, indicating what WAL segments it's looking for to get
>> started, why it thinks it needs that segment (from backup or control
>> data), etc. Ideally we would verify that the necessary files exist (at
>> least the initial ones) before making permanent changes. It was pretty
>> painful trying to work backwards on this problem from the final
>> controldata (where checkpoint and prior checkpoint are the same, and
>> redo is before both), a crash, a PANIC, a backup_label.old, and not much
>> else.
>
> Here's a proposed fix. I didn't solve the problem of determining whether
> we really are restoring a backup, or if there's just a backup_label file
> left around.
>
> I did two things:
>    1. If reading a checkpoint from the backup_label location, verify that
> the REDO location for that checkpoint exists in addition to the
> checkpoint itself. If not, elog with a FATAL immediately.

Makes sense. I wonder if we could just move the rename() after reading the checkpoint record?

>    2. Change the error that happens when the checkpoint location
> referenced in the backup_label doesn't exist to a FATAL. If it can
> happen due to a normal crash, a FATAL seems more appropriate than a
> PANIC.

I guess, although it's really not appropriate that the database doesn't recover after a crash during a base backup.
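
For readers following along, the check in item 1 can be sketched outside the server as follows. This is only a rough illustration, not the proposed patch: it assumes the default 16 MB segment size and the usual timeline/log/segment file naming, and it only tests that the segment files are present in a directory, whereas the real check would have to read the records (possibly via restore_command). The helper names parse_lsn, segment_file_name and missing_segments are made up for the example.

```python
import os

# Assumption: default WAL segment size of 16 MB.
XLOG_SEG_SIZE = 16 * 1024 * 1024


def parse_lsn(lsn):
    """Split an 'xlogid/xrecoff' location such as '0/2000020' into two ints."""
    xlogid, xrecoff = lsn.split("/")
    return int(xlogid, 16), int(xrecoff, 16)


def segment_file_name(timeline, lsn):
    """Name of the WAL segment file that contains the given location."""
    xlogid, xrecoff = parse_lsn(lsn)
    return "%08X%08X%08X" % (timeline, xlogid, xrecoff // XLOG_SEG_SIZE)


def missing_segments(xlog_dir, timeline, checkpoint_lsn, redo_lsn):
    """Return the segment files for the REDO and checkpoint locations
    that are absent from xlog_dir (hypothetical helper, simplified)."""
    needed = {
        segment_file_name(timeline, redo_lsn),
        segment_file_name(timeline, checkpoint_lsn),
    }
    return sorted(
        name for name in needed
        if not os.path.exists(os.path.join(xlog_dir, name))
    )
```

If missing_segments() returns anything, startup would stop with a FATAL before touching backup_label, instead of failing later with a PANIC.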

> I still think it would be nice if postgres knew whether it was restoring
> a backup or recovering from a crash, otherwise it's hard to
> automatically recover from failures. I thought about using the presence
> of recoveryRestoreCommand or PrimaryConnInfo to determine that. But it
> seemed potentially dangerous if the person restoring a backup simply
> forgot to set those, and then it tries restoring from the controldata
> instead (which is unsafe to do during a backup).

Right, that's not good either.

One alternative is to not remove any WAL files during a base backup. The obvious downside is that if the backup takes a long time, you run out of disk space.

The fundamental problem is that by definition, a base backup is completely indistinguishable from the data directory in the original server. Or is it? We recommend that you exclude the files under pg_xlog from the backup. So we could create a "pg_xlog/just_kidding" file along with backup_label. When starting recovery, if just_kidding exists, we can assume that we're doing crash recovery and ignore backup_label.

Excluding pg_xlog is just a recommendation at the moment, though, so we would need a big warning in the docs. Some way to enforce that just_kidding is not included in the backup would also be nice; maybe we could remove read permission from it?
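
To make the just_kidding idea concrete, here is a minimal sketch of the decision at startup. Nothing like this exists in the server; the function name recovery_mode and the returned strings are invented for illustration, and it assumes the marker lives at pg_xlog/just_kidding exactly as described above.

```python
from pathlib import Path


def recovery_mode(data_dir):
    """Decide how to start up, based on backup_label and the proposed marker.

    Hypothetical sketch: 'just_kidding' is created next to backup_label when
    the backup starts, but lives under pg_xlog, which a base backup is
    supposed to exclude.
    """
    data = Path(data_dir)
    has_label = (data / "backup_label").exists()
    has_marker = (data / "pg_xlog" / "just_kidding").exists()

    if not has_label:
        # No backup in progress and not a restored backup.
        return "crash recovery from control file"
    if has_marker:
        # Marker survived, so this is the original server that crashed
        # during a base backup; the backup_label should be ignored.
        return "crash recovery, ignoring backup_label"
    # backup_label without the marker: the directory came from a backup.
    return "restore from backup_label"
```

The sketch also shows the weak spot: if someone does include pg_xlog in the backup, the marker comes along and the restored copy is treated as a crashed primary, which is why enforcement matters.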

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs