Alan Altmark wrote:
> On Monday, 07/24/2006 at 06:35 ZE2, Carsten Otte <[EMAIL PROTECTED]> wrote:
>>> But rather than focus on that "edge" condition, we are all, I think, in
>>> violent agreement that you cannot take a volume-by-volume physical backup
>>> from outside a running Linux system and expect to have a usable backup.
>> That is a wrong assumption; I clearly disagree with it. If planned
>> properly, and I agree that there are lots of things one can do wrong
>> when planning the setup, physical backup of mounted and actively used
>> volumes _is_ reliable.
>
> But you are making assumptions about the applications, something I am not
> willing to do quite yet.  If a database update requires a change to the
> data file, the index file, and the log file, how do you (from the outside)
> know that all changes have been made and that it is safe to copy them? And
> that another transaction has not started?
As for the first part of the question: running "sync" after the update
ensures that everything relevant has been flushed out to disk. If
another transaction has been started in the meantime, fine: I expect
the database to be capable of rolling that transaction back after the
restore. That brings things to the same state as if I had taken the
backup before the transaction started.
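
To make that concrete, a backup script might do something along these
lines (assuming LVM2 on top of dm-snapshot; the volume names are made
up):

    # flush dirty pages and file system metadata out to disk
    sync
    # the on-disk state now contains every completed transaction,
    # so a snapshot taken at this point is safe to copy
    lvcreate --snapshot --size 512M --name lv_db_snap /dev/vg0/lv_db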

> From my days as a database application developer, the transaction
> journal was meant to be replayed against a copy of the database as it
> existed when the database was started, not replayed against a more
> current snapshot.  I.e. today's log is replayed against last night's
> backup.  And the transaction log is specifically NOT placed on the same
> device as the data itself.  In Linux terms, I guess that means don't place
> it in the same filesystem since that's the smallest consistent unit of
> data, right?  If you lose the data device, you haven't lost a whole day's
> worth of transactions.  (Maybe database technology no longer requires such
> precautions?)
When using snapshots for backup purposes, you would obviously need a
snapshot of both journal and data at the same time. Therefore you
either need to use dm-snapshot if you have data and log on different
devices, or you need to put both on the same device if you want to use
flashcopy.
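
For example, the single-file-system layout for dm-snapshot could be set
up like this (the DASD, volume group and volume names are hypothetical):

    # one volume group spanning two physical disks
    pvcreate /dev/dasdb1 /dev/dasdc1
    vgcreate vg0 /dev/dasdb1 /dev/dasdc1
    # one logical volume holding data AND journal
    lvcreate --size 20G --name lv_db vg0
    # a single copy-on-write snapshot then covers both atomically
    lvcreate --snapshot --size 2G --name lv_db_snap /dev/vg0/lv_db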

> So I'll admit that I'm obviously not "getting it".  If you would summarize
> the steps needed to allow a reliable, usable, uncoordinated live backup of
> Linux volumes, I for one would sincerely appreciate it.  How do you
> integrate them into your server?  How do you automate the process?  Right
> now I'm a fan of SIGNAL SHUTDOWN, FLASHCOPY, XAUTOLOG, but that's just
> me...
Please don't get upset; I am doing my best to explain the situation.
You need:
- the capability to get a consistent snapshot of all data relevant to
a) the file system _and_ b) the application. If the file system or the
data set relevant to the application spans multiple volumes, you need
the capability to snapshot all volumes at the very same time. The easy
way to fulfill this requirement is to use just a single file system -
which can span multiple physical disks in the case of dm-snapshot.
- an application that has consistent on-disk data at all times (which
is a basic requirement for any server application)
- a file system that has consistent on-disk data at all times (such as
ext3; see the example below)
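
For illustration, creating and mounting such a journaling file system
might look like this (device and mount point names are made up):

    # mke2fs -j creates an ext3 file system, i.e. ext2 plus journal
    mke2fs -j /dev/vg0/lv_db
    # ext3's default ordered journaling mode keeps the on-disk
    # state consistent at all times
    mount -t ext3 /dev/vg0/lv_db /srv/db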
Now you can:
- take a snapshot backup at any time while the server is doing regular
disk I/O
- pull the plug (crash)
- copy the data back to the original disk
- start the snapshot copy of the server and let the file system replay
its journal, then start the application again (a sketch of the whole
cycle follows below)
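
Something along these lines, assuming LVM2/dm-snapshot and ext3 (all
volume, image and mount point names are hypothetical):

    # take an atomic copy-on-write snapshot while the server runs
    sync
    lvcreate --snapshot --size 2G --name lv_db_snap /dev/vg0/lv_db
    # copy the frozen snapshot image away at leisure, then drop it
    dd if=/dev/vg0/lv_db_snap of=/backup/lv_db.img bs=1M
    lvremove -f /dev/vg0/lv_db_snap

    # --- to restore after the crash ---
    dd if=/backup/lv_db.img of=/dev/vg0/lv_db bs=1M
    # mounting replays the ext3 journal, just as after a power failure
    mount -t ext3 /dev/vg0/lv_db /srv/db
    # finally start the application; it rolls back any transaction
    # that was in flight when the snapshot was taken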

cheers,
Carsten

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
