On Thu, Sep 15, 2011 at 10:29 PM, lars hofhansl <[email protected]> wrote:
> So you'd create a manifest to avoid copying newer files? That would lock the
> earliest recovery time to when we start the copy (not when we end it), which
> is nice. What about compactions? They'd remove some of the old files and the
> data is now in new files.
>

Either move them to an archive or rename them with '.del' and let the copy
process clean them up when done?

> Copying the WALs should really be a separate process, I think. The base
> backup would maybe not even copy them.
> The scenario I have in mind is where one would take a base backup whenever it
> makes sense (once a day, a week, a month)
> and always archive all WALs. The last base backup with WALs could be on disks
> and previous ones might be spooled to tape.
>

Tape? What's that?

WALs are not compressed. Archiving could compress them. They should compress
well.

> Maybe I am just dreaming, but then one could restore a base backup, and
> replay the necessary WALs to bring the state to *any* given point in time.
>

Yeah. I'd like this.

>
> Then one of the harder issues - as you say - would be to associate the
> correct WALs with the snapshot.
> I wonder how precise we really need to be, though. As every edit is
> timestamped - in theory - WAL replay would be idempotent.
>

Roughly, yes. Counters would be a pain if we double-counted.

>
> And what about new tables? .META. changes are probably logged, but replaying
> those we'd somehow need to create the tables.
> And then the .META. changes have to be in strict order w.r.t. the other
> WAL replays.
>

Hmm.

> Are the sequenceIds globally ordered? I wonder if they even need to be.
>

They are not globally ordered. They are in order within the WAL.

St.Ack
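
To make the idempotency point concrete, here is a rough sketch of a point-in-time replay. This is not HBase code: `Edit`, `replay`, `latest`, and the dict-based store are all made up for illustration, and it assumes each edit carries a timestamp and a sequence id that is only meaningful within its own WAL. Timestamped puts land in the same (row, timestamp) cell no matter how often or in what WAL order they are replayed, while increments are applied again on every pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    seq_id: int   # ordered only within a single WAL, not globally
    ts: int       # timestamp carried by the edit
    kind: str     # "put" or "incr"
    row: str
    value: int


def replay(wals, store, counters, until_ts):
    """Replay every WAL up to until_ts (hypothetical helper, not an HBase API)."""
    for wal in wals:
        # Sequence ids only give an order inside one WAL.
        for e in sorted(wal, key=lambda e: e.seq_id):
            if e.ts > until_ts:
                continue  # past the requested recovery point
            if e.kind == "put":
                # Keyed by (row, ts): replaying the same put is a no-op.
                store[(e.row, e.ts)] = e.value
            else:
                # Increments are not idempotent: each replay adds again.
                counters[e.row] = counters.get(e.row, 0) + e.value


def latest(store, row):
    """Return the value with the newest timestamp for a row."""
    versions = [(ts, v) for (r, ts), v in store.items() if r == row]
    return max(versions)[1]


wal_a = [Edit(1, 100, "put", "r1", 7), Edit(2, 120, "incr", "c1", 1)]
wal_b = [Edit(1, 110, "put", "r1", 9), Edit(2, 200, "put", "r1", 11)]

store, counters = {}, {}
replay([wal_a, wal_b], store, counters, until_ts=150)
# Replaying overlapping WALs a second time, in a different order:
replay([wal_b, wal_a], store, counters, until_ts=150)
```

After the second pass the put state is unchanged, but the counter for `c1` has been incremented twice, which is the double-counting problem mentioned above. A real replay would need some way to know which increments were already applied.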
