[ On Thursday, June 20, 2002 at 22:17:55 (+0200), Ragnar Kjørstad wrote: ] > Subject: Re: Backing up PostgreSQL? > > On Thu, Jun 20, 2002 at 01:11:09PM -0400, Greg A. Woods wrote: > > [ On Wednesday, June 19, 2002 at 23:53:14 (+0200), Ragnar Kjørstad wrote: ] > > > Subject: Re: Backing up PostgreSQL? > > > > > > By this definition postgresql is consistant at all times > > > > That's simply not possible to be true. PostgreSQL uses multiple files > > in the filesystem namespace to contain its data -- sometimes even > > multiple files per table. It is literally impossible, with the POSIX > > file I/O interfaces, to guarantee that concurrent writes to multiple > > files will all complete at the same time. Remember I've only been > > talking about the backed up files being in a self-consistent state and > > not requiring roll-back or roll-forward of any transaction logs after > > restore. > > Puh - we've been through this already! Postgresql doesn't need this > guarantee, because it writes to it's log to avoid this very problem!
You haven't been paying attention to what I've been saying: Remember I've only been talking about the backed up files being in a self-consistent state and not requiring roll-back or roll-forward of any transaction logs after restore. In order to have consistent (from a database and application perspective) files on a filesystem backup of the database files, you MUST -- ABSOLUTELY MUST -- find some mechanism to synchronise between the database and the filesystem for the duration of the time it takes to make the filesystem backp, whether thats the few seconds it takes to create a "snapshot", or the many minutes it takes to make a verbatim backup using 'dump' or whatever. If all contingencies are to be covered the only sure way to do this is to shut down the database engine and ensure it has closed all of its files. There must be no writes to the filesystem by the database process(es) during the time the backup or snapshot being made. None. Not to the database files, nor to the transaction log. None whatsoever. With great care and a separate backup of the database transaction log taken after the filesystem backup it may be possible to re-build database consistency after a restore, but I wouldn't ever want to risk having to do that in a disaster recovery scenario. I would either want a guaranteed self-consistent filesystem copy of the database files, or a properly co-ordinated pg_dump of the database contents (and preferrably I want both, and both representing the exact same state, though here there's more leeway for using a transaction log to record db state changes between one form of backup and the other :-) In the end if your DBA and/or developers have not taken into account the need to shut down the database for at least a short time on a regular basis in order to obtain good backups then you may have more serious problems on your hands (and you should find a new and more experienced DBA too! ;-). The literature is full of ways of making secure backups of transactions -- but such techniques need to be considered in the design of your systems. For example it's not unheard of in financial applications to run the transaction log direct to a hardware-mirrored tape drive, pausing the database engine (but not stopping it) only long enough to change tapes, and immediately couriering one tape to a secure off-site location. Full backups with the database shut down are then taken only on much longer intervals and disaster recovery involves replaying all the transactions since the last full backup. There are also tricks you can play with RAID-1 which are much faster and potentially safer than OS-level filesystem snapshots (and of course such tricks don't require snapshot support, which is still relatively rare and unproven in many of the systems where it is available). These tricks allow you to get away with shutting down the database only so long as it takes to swap the mirror disk from one set to another, at which point you can make a leisurely backup of the quiescent side of the mirror. Then the RAID hardware can do the reconstruction of the mirror while the database remains live. > Here is the close-code on linux: As far as I know the Linux kernel does not have support for anything quite exactly resembling the Unix filesystem. Your analysis of the code you quoted would seem to confirm that too. :-) > Just to make sure there is no (additional) confusion here; what I'm > saying is: > 1. Meta-data must be updated properly. This is obvious and > shouldn't require futher explanation... > 2. non-journaling filesystems (e.g. ext2 on linux) do update > the inode-metadata on fsync(), but they do not update the > directory. The Unix Fast File System is not a log/journaling filesystem. However it does not suffer the problems you're worried about. Wasn't this question originally about FreeBSD anyway? -- Greg A. Woods +1 416 218-0098; <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]> Planix, Inc. <[EMAIL PROTECTED]>; VE3TCP; Secrets of the Weird <[EMAIL PROTECTED]>