On 26.04.2013 12:16, Simon Riggs wrote:
On 18 April 2013 19:11, Heikki Linnakangas<hlinnakan...@vmware.com> wrote:
I just found out that if you use continuous archiving and online backups,
it's surprisingly difficult to restore a backup, without replaying any more
WAL than necessary.
I didn't add it myself because I don't see the need, if we think more carefully.
Why would you want your recovery end time to be governed solely by the
time that the *backup* ended? How can that have any bearing on what
you want at recovery time? If you have access to more WAL data, why
would you not apply them as well - unless you have some specific
reason not to - i.e. an incorrect xid or known problem time?
If you're storing only a few of the WAL files with the backup then it
will end naturally without assistance when the last file runs out.
What is the difference between stopping at an exact point in WAL half
way through a file and ending at the end of the file? If the end point
is arbitrary, why the need to specify it so closely?
I can't see a time when I have access to more WAL files *and* I want
to stop early at some imprecise point. But you could write a
restore_command script that stopped after a specific file forcing
recovery to end.
Well, I ran into this with VMware's Data Director, which manages backups
among other things. In a typical setup, you have a WAL archive, and
every now and then (daily, typically) a full backup is taken. Full
backups are retained for some time, like a few weeks or months. The user
can also manually request a full backup to be taken at any time.
There is an option to perform PITR. The system figures out the latest
full backup that precedes the chosen point-in-time, sets
recovery_target_time, and starts up Postgres. But there is also an
operation to simply "restore a backup". The idea of that is to, well,
restore to the chosen backup, and nothing more. In most cases, it
probably wouldn't hurt if a one or two extra WAL files are replayed
beyond the backup end time, but you certainly don't want to replay all
the history. Yes, you could set recovery_target_time to the point where
the backup ended, but that's complicated. You'd have to read the
end-of-backup timestamp from the backup history file. And because
timestamps are always a bit fuzzy, I think you'd have to add at least a
few seconds to that to be sure.
To illustrate why it would be bad to replay more WAL than necessary,
imagine that the user is about to perform some dangerous action he might
want to undo later. For example, he's about to purge old data that isn't
needed anymore, so with "DELETE FROM data WHERE year <= '2010'". The
first thing he does is to take a backup with label
"before-purging-2010". Immediately after the backup has finished, he
performs the deletion. Now, the application stops working because it
actually still needs the data, so he restores from the backup. If
recovery decides to replay a few more WAL files after the end-of-backup,
that could include the deletion, and that's no good.
One solution is to create restore point after the backup ends. Then you
have a clearly defined point in time you can restore to. But it would be
convenient to not have to do that. Or another way to think of this is
that it would be convenient if there was an implicit restore point at
the end of each backup.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers