>    Remember I've only been talking about the backed up files being in a
>    self-consistent state and not requiring roll-back or roll-forward of
>    any transaction logs after restore.

If you don't want your database to do roll-back or roll-forward, backup
snapshots will not work for you. Neither will power failures. In theory
a clean shutdown could also leave the database in a state where it
requires roll-forward on restart, but my guess is that's not the case.


> In order to have consistent (from a database and application
> perspective) files on a filesystem backup of the database files, you
> MUST -- ABSOLUTELY MUST -- find some mechanism to synchronise between
> the database and the filesystem for the duration of the time it takes to
> make the filesystem backup, whether that's the few seconds it takes to
> create a "snapshot", or the many minutes it takes to make a verbatim
> backup using 'dump' or whatever.  If all contingencies are to be covered
> the only sure way to do this is to shut down the database engine and
> ensure it has closed all of its files.  There must be no writes to the
> filesystem by the database process(es) during the time the backup or
> snapshot is being made.  None.  Not to the database files, nor to the
> transaction log.  None whatsoever.

There are no writes to the filesystem at all while the snapshot is in
progress. The LVM-driver will block all data-access until it is
completed. If not, it wouldn't be a snapshot, would it?
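
To make the sequence concrete, here is a rough sketch of the steps
around an LVM snapshot backup. It is a Python function that only builds
the command lines and does not run anything. The volume group, volume
names, mount point and archive path are made up for illustration, and
the flags are the usual lvcreate/mount/tar/umount/lvremove ones as far
as I know, so check them against your own LVM version before relying
on this.

```python
def snapshot_backup_commands(vg="vg0", lv="pgdata", snap="pgdata_snap",
                             cow_size="1G", mountpoint="/mnt/pgdata_snap",
                             archive="/backup/pgdata.tar"):
    """Return the argv lists for a snapshot-based backup, in order.

    Every name and path here is a hypothetical default, not taken
    from any real system.
    """
    origin = f"/dev/{vg}/{lv}"
    snapdev = f"/dev/{vg}/{snap}"
    return [
        # Create the snapshot. This is the only step that has to be
        # atomic with respect to writes from the database.
        ["lvcreate", "--snapshot", "--size", cow_size, "--name", snap, origin],
        # Mount the snapshot read-only for the duration of the backup.
        ["mount", "-o", "ro", snapdev, mountpoint],
        # Archive the frozen view at leisure; the origin volume stays live.
        ["tar", "-cf", archive, "-C", mountpoint, "."],
        # Clean up: unmount and drop the snapshot.
        ["umount", mountpoint],
        ["lvremove", "-f", snapdev],
    ]


if __name__ == "__main__":
    for argv in snapshot_backup_commands():
        print(" ".join(argv))
```

The database keeps running through all of this. A restore from such an
archive looks to the database like a restart after a power failure, so
it will go through its normal recovery on startup.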

> With great care and a separate backup of the database transaction log
> taken after the filesystem backup it may be possible to re-build
> database consistency after a restore, but I wouldn't ever want to risk
> having to do that in a disaster recovery scenario.  I would either want
> a guaranteed self-consistent filesystem copy of the database files, or a
> properly co-ordinated pg_dump of the database contents (and preferably
> I want both, and both representing the exact same state, though here
> there's more leeway for using a transaction log to record db state
> changes between one form of backup and the other :-)

Huh? Why would you want a separate backup of the database transaction
log? The log is stored in a file together with the rest of the database,
and will be included in the snapshot and the backup.

If you didn't back up the transaction log at the exact same time you
backed up the rest of the files it would not work - it would be
inconsistent and there could be data missing.

> In the end if your DBA and/or developers have not taken into account the
> need to shut down the database for at least a short time on a regular
> basis in order to obtain good backups then you may have more serious
> problems on your hands (and you should find a new and more experienced
> DBA too! ;-).  The literature is full of ways of making secure backups
> of transactions -- but such techniques need to be considered in the
> design of your systems.  For example it's not unheard of in financial
> applications to run the transaction log direct to a hardware-mirrored
> tape drive, pausing the database engine (but not stopping it) only long
> enough to change tapes, and immediately couriering one tape to a secure
> off-site location.  Full backups with the database shut down are then
> taken only on much longer intervals and disaster recovery involves
> replaying all the transactions since the last full backup.  There are
> also tricks you can play with RAID-1 which are much faster and
> potentially safer than OS-level filesystem snapshots (and of course such
> tricks don't require snapshot support, which is still relatively rare
> and unproven in many of the systems where it is available).  These
> tricks allow you to get away with shutting down the database only so
> long as it takes to swap the mirror disk from one set to another, at
> which point you can make a leisurely backup of the quiescent side of the
> mirror.  Then the RAID hardware can do the reconstruction of the mirror
> while the database remains live.

Splitting a RAID-1 has the exact same properties as a filesystem
snapshot: if the filesystem is online, the split must be done atomically
or with access to the device blocked (which is just another way of
saying the database is paused). The only differences are operational:
* splitting the mirror is faster (say 1ms instead of 1s)
* a mirror requires 2x the data size, while a snapshot requires less
  unless all the data is changed during the backup
* a mirror has to be resynchronized afterwards
* a split mirror has no write penalty
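
The space point in the list above can be put in numbers. This is a
back-of-the-envelope sketch only, assuming a copy-on-write snapshot
that stores one old copy of each distinct block changed while the
snapshot exists, and ignoring any metadata overhead. The function
names and the example figures are mine, not from any particular
volume manager.

```python
def mirror_extra_space(total_blocks, block_size):
    """Extra bytes a mirror half costs: a full second copy of the data."""
    return total_blocks * block_size


def snapshot_extra_space(changed_blocks, total_blocks, block_size):
    """Extra bytes a copy-on-write snapshot costs (assumed model).

    Only the old contents of distinct blocks changed during the backup
    are kept, so the cost can never exceed the size of the data itself.
    """
    if changed_blocks < 0 or total_blocks < 0 or block_size <= 0:
        raise ValueError("block counts must be >= 0 and block_size > 0")
    return min(changed_blocks, total_blocks) * block_size


if __name__ == "__main__":
    total, bs = 1_000_000, 4096           # hypothetical volume: ~4 GB
    print(mirror_extra_space(total, bs))              # full second copy
    print(snapshot_extra_space(50_000, total, bs))    # 5% of blocks changed
    print(snapshot_extra_space(total, total, bs))     # everything changed
```

With 5% of the blocks rewritten during the backup the snapshot needs a
twentieth of what the mirror does; if every block changes, the two cost
the same.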


> > Just to make sure there is no (additional) confusion here; what I'm
> > saying is:
> > 1. Meta-data must be updated properly. This is obvious and 
> >    shouldn't require further explanation...
> > 2. non-journaling filesystems (e.g. ext2 on linux) do update
> >    the inode-metadata on fsync(), but they do not update the
> >    directory. 
> 
> The Unix Fast File System is not a log/journaling filesystem.  However
> it does not suffer the problems you're worried about.

That depends on which UFS variant you're referring to, but it may very
well be true for some (e.g. UFS in Solaris 8 does optional logging).
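
For what it's worth, the defensive way to handle point 2 from an
application is to fsync() the directory as well as the file after
creating it. Below is a minimal sketch of that, assuming a POSIX-like
system where a directory can be opened read-only and passed to
fsync(). Whether the extra call is actually needed depends on the
filesystem, which is the whole argument here. The function name is
made up.

```python
import os


def durable_create(path, data):
    """Write `data` to `path`, then flush both the file and its directory.

    Flushing the directory is what makes the new directory entry
    durable on filesystems where fsync() on the file alone does not.
    """
    dirpath = os.path.dirname(os.path.abspath(path))

    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's buffer to the kernel
        os.fsync(f.fileno())   # flush file data and inode metadata

    dirfd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dirfd)        # flush the directory entry itself
    finally:
        os.close(dirfd)


if __name__ == "__main__":
    durable_create("example.dat", b"some committed data\n")
```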


> Wasn't this question originally about FreeBSD anyway?

I suppose it was.


-- 
Ragnar Kjørstad
Big Storage
