I've got an "alpha" version of my script written in bash (I'll be changing
the language soon) that copies any number of local source files to any
number of local destination files, and allows for exceptions. On *nix
systems this works great because it can back up open files.
Unfortunately, open files in Windows cannot be accessed without expensive
software -- I'm backing up our current (soon to be replaced with Linux)
Windows file server over SMB, and all open files are being ignored.
So... I'm going to define "portability" as "it will work on *nix".
As far as snapshots go: I'm thinking of offering limited snapshot
capability in the script so that data files that depend upon each other
can be frozen. Extending these features to the entire filesystem will not
work efficiently, or... well, at all. Full filesystem snapshots can be
taken, as James suggested, with aid from the Kernel and LVM2. So, if a
snapshot is required before backup, the system can be scripted to take the
snapshot, then execute my scripts -- but, again, custom server
configuration will be required to use this feature. I have no info on
performance of the LVM2 snapshot method... but the workstations for the
robotics lab just arrived, so I'll test it.
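For the curious, the wrapper I'm imagining is only a few lines. Here's a sketch -- volume group, volume, and mount point names are all invented, and it prints the commands instead of running them so the plan can be reviewed (or piped to sh) before touching a live system:

```shell
# Sketch of the "snapshot, then back up" wrapper. vg0/data, /mnt/snap,
# and /backup are placeholder names -- adjust for your setup. The
# function emits the command sequence rather than executing it.
snapshot_backup_plan() {
  vg=$1
  lv=$2
  mnt=$3
  snap="${lv}-snap"
  # Reserve 1G of copy-on-write space for writes that land while the
  # snapshot exists; the backup reads a frozen view of the filesystem.
  echo "lvcreate --snapshot --size 1G --name $snap /dev/$vg/$lv"
  echo "mount -o ro /dev/$vg/$snap $mnt"
  echo "rsync -a $mnt/ /backup/$lv/"
  echo "umount $mnt"
  echo "lvremove -f /dev/$vg/$snap"
}

snapshot_backup_plan vg0 data /mnt/snap
```

Run as-is it just prints the five commands; once I've tested LVM2 on the new workstations I'll know whether the 1G copy-on-write reservation is sensible.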
Any ideas on a fast method for freezing a snapshot of individual files?
- Sebastian
On Mon, 25 Jul 2005, Sebastian Smith wrote:
First off... thanks for all the input. I need to research a couple of things
before I dive too deep into this project ;)
The rest of my response is inline:
On Sun, 24 Jul 2005, Brian Chrisman wrote:
James Washer wrote:
The big issues with backups IMHO is getting a consistent snapshot. If you
start copying a large data file, and it changes after you've copied the
beginning, you're screwed.
An exact snapshot will be one of the hardest qualities to achieve. It's not
my intention to modify the current rsync algorithm to add this feature (if it
isn't already part of the system... I don't know all the details of the rsync
algorithm), so I may have to develop a method for freezing a snapshot of a
file. Anyone know of any good algorithms?
Other backup schemes I've used involved BCLs (block change logging),
which allow the admin to back up only those blocks that have been updated
since the last master. This is a GREAT space saver for large data farms
with low update rates. I tend to work on larger multi-terabyte systems,
so minimizing backup times is very important.
Rsync uses a method similar to BCLs insofar as it will only sync changed
files -- based on modification time and size, or a file checksum. Clearly,
BCLs operate at a considerably lower level of filesystem abstraction (and
are considerably faster), but I'd prefer to stick to a userland process operating
on top of the complete filesystem abstraction for portability reasons (and
ease of programming).
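To illustrate the two change tests for anyone following along, here they are in miniature -- a size-plus-mtime quick check (rsync's default) and a full-content checksum (what rsync's -c does). The helper names and the demo files are mine:

```shell
# Two ways to decide "has this file changed?", mirroring rsync's
# quick check (size + mtime) and its whole-file checksum mode.
same_quick() {
  # Same size, and neither file is newer than the other => "unchanged".
  [ "$(wc -c < "$1")" -eq "$(wc -c < "$2")" ] &&
    ! [ "$1" -nt "$2" ] && ! [ "$1" -ot "$2" ]
}

same_checksum() {
  # Compare actual contents via POSIX cksum (CRC + length).
  [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}

# Demo: two files with identical size and mtime but different bytes.
tmp=$(mktemp -d)
printf 'hello' > "$tmp/a"
printf 'jello' > "$tmp/b"
touch -r "$tmp/a" "$tmp/b"   # copy a's mtime onto b

same_quick "$tmp/a" "$tmp/b"    && echo "quick check: unchanged"
same_checksum "$tmp/a" "$tmp/b" || echo "checksum: changed"
```

The quick check is fooled here, which is exactly why rsync offers the checksum mode -- and why it's slower: it has to read every byte.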
It's not my intention to create a complete enterprise backup solution... at
least not at this stage. I just need to back up my users' files, databases,
etc when they go home at 5 every night. But, simultaneously, I'd like a
backup system that is flexible enough to create hourly incremental backups,
backup to any media I want at any time I want, and is scriptable. So...
perhaps it hits in the "middle ground" -- it's not ntbackup or Veritas, but
somewhere in between.
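The hourly incrementals I have in mind would lean on hard links -- the same trick rsync's --link-dest option uses: files unchanged since the last snapshot become hard links into it, so each hourly snapshot only costs the space of what actually changed. A toy demo of the idea (paths invented):

```shell
# Toy demo of hard-link incrementals: snap.2 is a full-looking
# snapshot, but its unchanged files share disk blocks with snap.1.
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/snap.1" "$tmp/snap.2"
echo 'unchanged' > "$tmp/src/keep"
cp "$tmp/src/keep" "$tmp/snap.1/keep"   # hour 1: real copy

# Hour 2: the file didn't change, so link it instead of copying.
if cmp -s "$tmp/src/keep" "$tmp/snap.1/keep"; then
  ln "$tmp/snap.1/keep" "$tmp/snap.2/keep"   # zero extra data blocks
else
  cp "$tmp/src/keep" "$tmp/snap.2/keep"
fi

# The real thing would be roughly:
#   rsync -a --delete --link-dest=/backup/latest /home/ /backup/$(date +%F.%H)/
# driven from cron, e.g.:  0 * * * * /usr/local/bin/hourly-backup.sh
```

Every snapshot directory looks complete, so restoring "yesterday at 3pm" is just a cp -a away -- no proprietary catalog needed.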
Are you planning on your backup scheme being able to handle "hot" backups?
Yes... but functionality will be limited. I'm probably going to make use of
the Ostrich Algorithm -- stick my head in the sand, and pretend the problem
doesn't exist -- when it comes to filesystem locks, databases, and other
complex backup issues. At least for the prototype. Again, this really hits
the portability aspect of the code IMHO. If I bloat it too much it becomes
difficult to modify, and will only work with specific systems.
I guess what I'm trying for is a "core" backup utility that can perform
advanced backup operations -- like incremental backups -- very well, but that
can be easily supplemented for your specific installation via scripting,
modification of code, or addon software.
This makes particular sense in open source software. With proprietary
backup solutions, I always leaned against block level incrementals such as
provided by Veritas, because without their software, there was pretty much
no way to recover your data... This meant that things like license
management or lame bugs could end up biting you in the butt at the very
worst possible time. There have always been file/tar-based products out
there, but like you said, not so efficient.
Great point Jim! This is by far my most hated feature of proprietary backup
software. It costs $1000 for the software, and I have to have it loaded to
get to my files?! Ridiculous! This will not be a feature of my software.
By default all files will be stored as they appear in your filesystem. Of
course, you'll have options to compress, encrypt, and do whatever else
people do to files these days. Because of this storage method, if /etc and
/home are backed up to a USB hard disk, for example, you could boot from a
live disk and configure it to serve your files until your server is back online.
Thanks again for all of the ideas! Keep them coming!
- Sebastian
_______________________________________________
RLUG mailing list
[email protected]
http://lists.rlug.org/mailman/listinfo/rlug