[HACKERS] PITR Backup state validity checking

Simon Riggs Sun, 25 Jul 2004 13:56:27 -0700

Problem Summary (from previous posts)
The archive recovery must stop AFTER the end of the backup which the
recovery used as its "starting point". If not, incorrect database states
are likely.


In general, this is a small window for error and procedures should exist
to return to the prior backup. Nonetheless, this check should be made,
i.e. stop time/point > backup end time.

Solution Design:
Before a backup is taken, write a file to the data directory that
identifies which backup this is. When the backup is taken this file will
be copied with the backup, and later restored when the backup is restored.
When the backup completes, write a file to the xlog directory that
contains the start backup identifier and the end time. When recovery
occurs, the backup identifier can be used to find the end backup file and
read this to find the end backup time.
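
To make the lookup at recovery time concrete, here is a rough sketch in
Python. It is only an illustration, not the proposed implementation. It
borrows the file names backup_start_<backupid>.info and
backup_end_<backupid> proposed further down in this message; the helper
names, the end_file_dir parameter and the key=value file layout are
assumptions made for the example.

```python
import glob
import os


def read_info(path):
    """Parse 'key=value' lines (a hypothetical layout) into a dict."""
    with open(path) as f:
        return dict(line.strip().split("=", 1) for line in f if "=" in line)


def find_backup_end_time(data_dir, end_file_dir):
    """Return the end time recorded for the restored backup, or None."""
    # The start file was copied with the backup, so it identifies which
    # backup the restored data directory came from.
    starts = glob.glob(os.path.join(data_dir, "backup_start_*.info"))
    if len(starts) != 1:
        return None  # no identifier, or an ambiguous one
    name = os.path.basename(starts[0])
    backup_id = name[len("backup_start_"):-len(".info")]

    # The backup id is all we need to locate the matching end file.
    end_file = os.path.join(end_file_dir, f"backup_end_{backup_id}")
    if not os.path.exists(end_file):
        return None  # backup never completed, or end file not available
    return read_info(end_file).get("time")
```

A None result corresponds to the "incomplete backup" situation: the
start identifier exists but no matching end record can be found.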

Additional aspects:
- Can't assume that the archive allows direct access, so anything written
to the log must be read back in the same sequential order in which it was
written.
- Backup may be taken when postmaster is down, so solution must not rely
on postmaster being up.
- It *is* possible to do incremental backups, as long as the backup
checks each file's change data against files already archived (or on
write-once media). The previously backed-up files can thus be
considered part of *this* backup as well as of the one in which they
were originally backed up. (So we still write start now and end shortly
afterwards).
- We want to offer the user an interface now, so that when later changes
occur, we will not be requiring them to change again.

Implementation Options:
------------------------
User Interface
Two user interfaces have been suggested:
- Write a server function which can be called from anywhere...
- Write an external program

External program will still work when postmaster is down, so is the
option suggested in further detail here...

Call Sequence
It has been suggested that there should be an "API call" issued before
and after the backup. That requires the user to issue 3 calls in
sequence to get a correct backup.
- For full backups taken all at once, a single call is desirable,
ensuring that no API call was missed.

Implementation Design
----------------------
Implement an external program, called pg_backup. (I guess there's some
historical baggage there, but it may be time to leave that behind now...)

pg_backup will do:
1. If postmaster is up, issue a manual CHECKPOINT
2. Write a file called backup_start_<backupid>.info
   where <backupid> is the time when the backup starts
   contains: systemid, time(now)
3. Remove all previous backup_start*.info files
4. Issue the user's backup_command via system(3)
5. Write a second file called backup_end_<backupid> to pg_xlog AND write
   a backup_end_<backupid>.ready to archive_status.
   backup_end_<backupid> contains: systemid, time of backup end
   backup_end_<backupid>.ready is empty
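
The five steps can be sketched roughly as follows. This is only an
illustration in Python, not the proposed implementation: the function
name run_backup, the key=value file layout, the timestamp format used
for <backupid>, and the use of psql to issue the CHECKPOINT are all
assumptions, and subprocess.run stands in for system(3).

```python
import glob
import os
import subprocess
import time


def run_backup(data_dir, backup_command, system_id, postmaster_up=False):
    """Sketch of steps 1-5; returns the backup id used in the file names."""
    if postmaster_up:
        # Step 1 (assumption: psql is on PATH and connects with defaults).
        subprocess.run(["psql", "-c", "CHECKPOINT"], check=True)

    # Step 2: the backup id is the start time (format is hypothetical).
    backup_id = time.strftime("%Y%m%d%H%M%S")
    start_name = f"backup_start_{backup_id}.info"
    with open(os.path.join(data_dir, start_name), "w") as f:
        f.write(f"systemid={system_id}\ntime={backup_id}\n")

    # Step 3: remove start files left over from previous backups.
    for old in glob.glob(os.path.join(data_dir, "backup_start_*.info")):
        if os.path.basename(old) != start_name:
            os.remove(old)

    # Step 4: run the user's backup command; a failure raises and no
    # end file is written, so the backup is never marked complete.
    subprocess.run(backup_command, check=True)

    # Step 5: write the end file to pg_xlog and an empty .ready marker
    # to pg_xlog/archive_status so the archiver picks it up.
    xlog_dir = os.path.join(data_dir, "pg_xlog")
    status_dir = os.path.join(xlog_dir, "archive_status")
    os.makedirs(status_dir, exist_ok=True)
    end_name = f"backup_end_{backup_id}"
    with open(os.path.join(xlog_dir, end_name), "w") as f:
        f.write(f"systemid={system_id}\ntime={time.strftime('%Y%m%d%H%M%S')}\n")
    open(os.path.join(status_dir, end_name + ".ready"), "w").close()
    return backup_id
```

Writing the end file only after the backup command returns successfully
is what lets recovery tell a complete backup from an interrupted one.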

Other changes:
- Alter archiver to always archive backup_end* files first, so they are
written to archive in time sequence order.
- Alter recovery so that it requests backup_end_<backupid> first. We
then read the time in this file and compare it with our target end time,
if there is one. If there is and we fail the > test, we stop. If no
target time exists, we roll forward, though we can still fail the test
at our selected stopping point (an Xid).
- If recovery ends at an Xid, but when this is reached we are still
earlier than the backup end time, then we alter our target to be the
backup end time (inclusive) and continue to roll forward. A WARNING is
issued.
- If recovery ends before it has read the backup_end* file then we issue
a WARNING saying "recovery using incomplete backup", HINT: "you
will need to start recovery from the next earliest backup". (Later,
change this to an ERROR and add an option to override and ignore it, for
when you're really up to your neck in it.)
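
The recovery-side rules above boil down to a couple of comparisons. A
minimal sketch in Python, purely to illustrate the decision logic: the
Action names and both function names are invented for this example, and
treating a missing backup_end file as "backup_end is None" is one
reading of the last rule.

```python
from datetime import datetime
from enum import Enum
from typing import Optional


class Action(Enum):
    PROCEED = "proceed"
    REFUSE = "refuse"                        # time target fails the > test
    EXTEND_TO_BACKUP_END = "extend target"   # WARNING, keep rolling forward
    STOP = "stop"
    WARN_INCOMPLETE = "warn: recovery using incomplete backup"


def check_before_replay(backup_end: Optional[datetime],
                        target_time: Optional[datetime]) -> Action:
    """Decide what to do once the backup_end file has been requested."""
    if backup_end is None:
        return Action.WARN_INCOMPLETE
    # The stop time must be strictly after the backup end time.
    if target_time is not None and not target_time > backup_end:
        return Action.REFUSE
    return Action.PROCEED


def check_at_xid_target(backup_end: datetime, replay_time: datetime) -> Action:
    """Decide what to do when the target Xid has been reached."""
    if replay_time < backup_end:
        # Too early: move the target to the backup end time (inclusive).
        return Action.EXTEND_TO_BACKUP_END
    return Action.STOP
```

Here replay_time stands for the timestamp recovery has reached when the
target Xid is encountered.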

pg_backup -opts [BACKUP COMMAND]
opts:
-D      data directory (defaults to PGDATA)

usage examples:
pg_backup tar zcvhf /dev/rmt0 $PGDATA
uses PGDATA to identify data directory, then creates a tape archive on
the default tape device

pg_backup write_to_BAR_system p1 p2 p3
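
One way the command line could be split into pg_backup's own options and
the user's backup command, sketched in Python. The function name and
error messages are made up for the example; only -D and the PGDATA
default come from the description above.

```python
import getopt
import os


def parse_pg_backup_args(argv, environ=os.environ):
    """Split argv into (data_dir, backup_command)."""
    # getopt.getopt stops at the first non-option argument, so flags that
    # belong to the backup command (e.g. tar's) are left untouched.
    opts, command = getopt.getopt(argv, "D:")
    data_dir = environ.get("PGDATA")
    for flag, value in opts:
        if flag == "-D":
            data_dir = value
    if data_dir is None:
        raise SystemExit("pg_backup: no data directory and PGDATA is not set")
    if not command:
        raise SystemExit("pg_backup: no backup command given")
    return data_dir, command
```

Stopping option parsing at the first non-option word means the backup
command never needs quoting or a "--" separator.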

Not hugely happy with the above. I'm sure someone will come up with a
few streamlining comments, eh?

I'd certainly prefer a solution that involved writing WAL records to
indicate start and end, which seems cleaner and more integrated.
However, we need to be able to cater for cold/offline backups.

Best Regards, Simon Riggs


