Re: incremental backups howto?

2004-12-25 Thread Steve Block
On Fri, Dec 24, 2004 at 11:40:10PM -0500, Adam Aube wrote:
 Joao Clemente wrote:
 
  In the latest thread about Synchronize two servers it was
  talked about incremental backups. Well, can you quick-start me
  in this topic?
 
 An incremental backup is done by backing up all files that have changed
 since the last full or incremental backup. How this file list is tracked
 depends on the backup program used (some might use a filesystem flag,
 others might use modification timestamps).
 
 The downside of incremental backups is that, to do a full restore, you need
 the last full backup and ALL the incremental backups since the last full.
 
 A better alternative is a differential backup, which is all files that have
 changed since the last full backup. This is much easier to restore, because
 all you need is the last full backup and the last differential backup.
 
  What do you say?
 
 Another interesting approach is that taken by tools such as dirvish or
 rsnapshot. Both of these tools use rsync to capture snapshots of a
 filesystem (either local or remote) to disk. Within the backup archives,
 files that have not changed between snapshots are hard linked.
 
 This gives the completeness and ease of restoration of full backups without
 requiring nearly as much space to store data. To restore, just copy back
 the desired snapshot.
 
 This is all for general filesystem backup. For databases, check the
 documentation to see what the recommended backup method is.
 
 Adam
 
I use something like this for my own backups. I have a large number of
files on a server which I keep backed up on another machine (the backups
have saved my ass more than once). Later versions of rsync can automatically
hard link unchanged files (the --link-dest option), which saves a lot of space.

What I have done is set up a special backup account on one machine and built
a passwordless ssh key that allows that machine access to the server.
Obviously there are security issues there but I'm taking a calculated risk, as
I want the backups to run from cron. There are ways to restrict the backup
user to specific commands only, and I think Google can help with that.
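
One common way to do that restriction is a forced command on the server side. The sketch below is only an illustration, not something from the original post: the wrapper path, the function name allow_backup_command and the key line in the comment are all hypothetical. It relies on sshd exporting the client's requested command in SSH_ORIGINAL_COMMAND, and it only accepts the command line that a pulling rsync client sends.

```sh
#!/bin/sh
# Hypothetical wrapper, installed on the server as e.g. /home/user/bin/backup-only.sh
# and named in the key's entry in ~/.ssh/authorized_keys (key data elided):
#   command="/home/user/bin/backup-only.sh",no-pty,no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAA... backup@client

# Return 0 only for an rsync "server, sending files" command line
# that contains no shell metacharacters.
allow_backup_command() {
    case "$1" in
        *\;*|*\&*|*\|*|*\<*|*\>*|*\`*|*\$*|*\(*|*\)*) return 1 ;;
        "rsync --server --sender "*) return 0 ;;
        *) return 1 ;;
    esac
}

# sshd sets SSH_ORIGINAL_COMMAND to whatever the client asked to run.
if [ -n "${SSH_ORIGINAL_COMMAND:-}" ]; then
    if allow_backup_command "$SSH_ORIGINAL_COMMAND"; then
        set -f                       # no glob expansion of the client's arguments
        exec $SSH_ORIGINAL_COMMAND   # unquoted on purpose: split into words
    fi
    echo "backup key: command rejected" >&2
    exit 1
fi
```

This only checks the start of the command, so the key can still read any path the account can read; tightening it to a single directory is left out of the sketch.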

I wrote a fairly simple bash script that backs up my home folder on the
server to a folder named with the server name and the backup date. The
script runs from cron every day and keeps one week's worth of daily folders.
It creates weekly backups as well and keeps four weeks of those, and it
similarly keeps three months of monthly folders. Since it uses hard links, the
whole set takes only about 10% more space than any single snapshot, but it
lets me step back a number of days to fix something that got broken (last time
it was a heavily customized php file that I foolishly overwrote).

The script follows. I hope someone finds it useful.

#!/bin/sh

# Incremental backup script for bash, based on rsync, syncs files on server 
# to this machine. One sync is made every night, incrementals are handled
# with hardlinks to unchanged files.
#
# Once a week the newest daily snapshot becomes the newest weekly snapshot,
# and once a month the newest weekly snapshot becomes the newest monthly
# snapshot, and we will hold three months of backups. Because of the hardlinks
# we should be able to keep increments without wasting much more space than
# a simple full backup would already take.

# Start by setting variables: current month, dead month, current week,
# dead week, current day, and dead day. %G is the ISO week-numbering year
# and only belongs with %V; calendar months and days use %Y, otherwise the
# folder names come out wrong around New Year.
MONTH=server.monthly.`date +%Y-%m`
WEEK=server.weekly.`date +%G-%V`
DAY=server.daily.`date +%Y-%m-%d`
YESTERDAY=server.daily.`date -d -1day +%Y-%m-%d`
DEADMONTH=server.monthly.`date -d -3month +%Y-%m`
DEADWEEK=server.weekly.`date -d -4week +%G-%V`
DEADDAY=server.daily.`date -d -7day +%Y-%m-%d`

# Rotate the daily backup files. Start by tossing the latest dead file,
# and then create the latest backup with rsync, using hard links.
# This happens every day
if [ -d /home/backup/$DEADDAY ]; then
   rm -rf /home/backup/$DEADDAY
fi
rsync -plrtvz --delete --rsh='ssh -c blowfish' --ignore-errors --stats \
    --progress --link-dest=/home/backup/$YESTERDAY \
    [EMAIL PROTECTED]:/home/user/ /home/backup/$DAY/

# Check if it is Saturday. If so, rotate the weekly backups. Start by tossing
# the latest dead file, and then copy the latest daily snapshot to the
# weekly snapshot file
if [ `date +%u` = 6 ]; then
   if [ -d /home/backup/$DEADWEEK ]; then
      rm -rf /home/backup/$DEADWEEK
   fi
   cp -al /home/backup/$DAY /home/backup/$WEEK
fi

# Check if it is the first of the month. If so, rotate the monthly backups.
# Start by tossing the latest dead file, and then copy the latest daily
# snapshot to the monthly snapshot file.
# date +%d is zero-padded, so the first of the month is "01", not "1".
if [ `date +%d` = 01 ]; then
   if [ -d /home/backup/$DEADMONTH ]; then
      rm -rf /home/backup/$DEADMONTH
   fi
   cp -al /home/backup/$DAY /home/backup/$MONTH
fi


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED] 
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



incremental backups howto?

2004-12-24 Thread Joao Clemente
Hi people. In the latest thread, about "Synchronize two servers",
incremental backups were discussed. Well, can you quick-start me on this
topic?

I have a notion of what an incremental backup is: it would be keeping
the delta from the last state. But how is this done in practice? Is it
done with tar and some flag? Can we compress the backups (tar.[gz|bz2]),
or do they need to be uncompressed to create the delta? Can we create a
2nd_delta from a backup + 1st_delta? Or is the 2nd_delta created directly
from the backup, removing the need for the 1st_delta?
Maybe it's not even done with tar... I keep thinking of diffs and
patches, but maybe it's not the same here.

If one needs the (uncompressed) initial tar available from a backup to
find the delta, this means we need a 2xN disk, where N is the data
we have on disk, so you could only do incremental backups on a disk
that is less than 50% full. Right or wrong?

What do you say? Pointers to the right commands/howtos? Thanks
Joao Clemente



Re: incremental backups howto?

2004-12-24 Thread Alvin Oga

hi ya joao

On Sat, 25 Dec 2004, Joao Clemente wrote:

 I have the notion of what an incremental backup is... it would be 
 keeping the delta from the last state, but how is this done in 
 practice?

in real life ... very few people check that their backup does in fact work
by trying to restore ALL their data onto a virgin disk
- in doing that simple rebuild, you will find all kinds
of problems with one's current backup scheme, or at least
with the things you care about
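
A small-scale version of that restore test can be sketched as below. It is only an illustration under assumptions: the function name verify_backup is made up here, and the archive is assumed to have been created with paths relative to the source directory (tar czf ARCHIVE -C SOURCE_DIR .). It unpacks the archive into a scratch directory and compares it with the live tree using diff -r, which checks file contents but not ownership or permissions.

```sh
#!/bin/sh
# verify_backup ARCHIVE SOURCE_DIR   (hypothetical helper name)
# Restores ARCHIVE into a throwaway directory and compares the result
# with SOURCE_DIR. Returns 0 only if the restored tree matches.
verify_backup() {
    archive=$1
    source_dir=$2
    scratch=$(mktemp -d) || return 1

    # Unpack, then let diff -r walk both trees file by file.
    if tar xzf "$archive" -C "$scratch" &&
       diff -r "$source_dir" "$scratch" >/dev/null; then
        status=0
    else
        status=1
    fi

    rm -rf "$scratch"
    return $status
}

# Example use (paths are placeholders):
#   tar czf /mnt/BACKUP/home.tgz -C /home .
#   verify_backup /mnt/BACKUP/home.tgz /home && echo "restore matches"
```

On a busy filesystem the live tree keeps changing, so a mismatch is not always a broken backup; the real test is still a full restore onto a spare disk.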

 Is it done with tar with some flag? Can we compress the 
 backups (tar.[gz|bz2])

i always use tar zcvf backup.date.tgz  list of files

- some folks do NOT like compressed backups

- warm backup servers, like carl's initial post, cannot use tgz files

- but for sanity's sake, to know what changed on what day,
it's always a good idea to keep a log of what changed when
( separates the men from the boyz )

 or it needs to be uncompressed to create the 
 delta? Can we do a 2nd_delta from a backup + 1rst_delta? Or the 
 2nd_delta is created directly from backup therefore overriding the 
 need for the 1rst_delta?

separating the men from the boyz
- you always backup from the last full backup
and you UNCONDITIONALLY GUARANTEE that the full backup
did in fact work

- i use 3 methods to recreate a full backup at any given point
in time  ( daily backups, weekly incrementals, monthly incrementals)

- incrementals are used to recreate full backups up until today
from a known-good full backup from 3 months ago ..
when you just found out today that the last 2 months of full
backups were never done for goofy reasons but the
incrementals were working .. etc
( or the incrementals died but the full backups are working )

- or if both full and incremental backups failed,
since somebody pulled the power plug or the disks crashed
or ??? ... gazillion ways for backups to fail since nobody
watches them constantly

 Maybe its not even done with tar... I keep thinking on diff's and 
 patch's but maybe its not the same here.

rdiff-backup is good ..
- it'd be super fast to only backup the changed inodes
( sorta bleeding edge stuff in linux land )

- remembering that backups have to be 100% guaranteed, so that
you can recover corp data from any date in the past

 If one needs the (uncompressed) initial tar available from a backup to 
 find the delta, this means we need a 2xN disk, where N is the info 
 we have in disk.. so you could only do incremental backups in a disk 
 with less than 50% occupation.. rigth or wrong?

depends on data types
- some data... you can compress 10:1

- other data, you cannot compress any more
( *.bz or *.mpeg or *.lib would be hard to compress further )

- most people do NOT have 100% disk utilizations

- usually people have older smaller capacity disks
than what is currently available today for a reasonable
backup budget

- i can usually store 6months - 1 year of full backups
for all servers 
A backups to B
B backups to C
C backups to D

A, B, C backups to (warm,live) MasterBackup

 What do you say? Pointer to right commands/howto's? Thanks

some/lots of backup examples:
Linux-backup.net

- easiest backup ( run it from cron )
-- always manually mount the backups ...

- that protects against an accidental rm -rf / : your backup will
be left unaffected since it's not automounted

#
# last 8 days of changes, do it daily
# ( ! -type d lists files only, so tar does not pull in whole directories )
#
mount /mnt/BACKUP ; tar zcvf /mnt/BACKUP/datecode.daily.tgz \
  `find /etc /home -mtime -8 ! -type d -print` ; umount /mnt/BACKUP

#
# last 90 days of changes, do it weekly
#
mount /mnt/BACKUP ; tar zcvf /mnt/BACKUP/datecode.weekly.tgz \
  `find /etc /home -mtime -95 ! -type d -print` ; umount /mnt/BACKUP

- use a script to exclude and filter out crap you
do not want in the backups


- pick your backup media
- backup to tape
- backup to disks
- backup to cdrom/dvd
- backup to laptop/palmtop or vice versa

- understand how your backups will fail
- disk crash
- bad nic card
- bad hardware ( bad memory, bad disk, bad computer room policy)
- janitor/you pulling and wiggling the plug

- choose what you want to backup
- /etc ?? /home ??
- windoze backup to linux or linux mount and backup windoze shares
...

- know how to restore a full system from backups only
  onto a brand new disk or raid array

- do not emulate or pretend that backups work
"if i did this and that" ... do it for real
and diff that disk against the original it's supposed
to represent, so it really is a warm backup of the master

- use a backup method that you 

Re: incremental backups howto?

2004-12-24 Thread Adam Aube
Joao Clemente wrote:

 In the latest thread about Synchronize two servers it was
 talked about incremental backups. Well, can you quick-start me
 in this topic?

An incremental backup is done by backing up all files that have changed
since the last full or incremental backup. How this file list is tracked
depends on the backup program used (some might use a filesystem flag,
others might use modification timestamps).

The downside of incremental backups is that, to do a full restore, you need
the last full backup and ALL the incremental backups since the last full.

A better alternative is a differential backup, which is all files that have
changed since the last full backup. This is much easier to restore, because
all you need is the last full backup and the last differential backup.
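
As a rough illustration of the difference, here is a sketch using GNU tar's --listed-incremental snapshot files. This is an assumption about tooling (it needs GNU tar), and all file names are made up for the demo. The incremental chain keeps updating one snapshot file, while the differential backup always starts from a fresh copy of the snapshot taken at the full backup.

```sh
#!/bin/sh
# Demo in a throwaway directory; assumes GNU tar.
work=$(mktemp -d)
mkdir "$work/data" "$work/backups"
echo one > "$work/data/a.txt"
echo two > "$work/data/b.txt"

# Full (level 0) backup. tar records what it saw in full.snar.
tar -czf "$work/backups/full.tgz" \
    --listed-incremental="$work/backups/full.snar" -C "$work" data

sleep 1
echo changed > "$work/data/a.txt"

# Incremental chain: work on a copy of the snapshot file and let every
# run update it, so each archive holds changes since the previous run.
cp "$work/backups/full.snar" "$work/backups/incr.snar"
tar -czf "$work/backups/incr1.tgz" \
    --listed-incremental="$work/backups/incr.snar" -C "$work" data

sleep 1
echo new > "$work/data/c.txt"
tar -czf "$work/backups/incr2.tgz" \
    --listed-incremental="$work/backups/incr.snar" -C "$work" data

# Differential: always start from a fresh copy of the level 0 snapshot,
# so the archive holds everything changed since the full backup.
cp "$work/backups/full.snar" "$work/backups/diff.snar"
tar -czf "$work/backups/diff.tgz" \
    --listed-incremental="$work/backups/diff.snar" -C "$work" data

echo "incr2 holds: $(tar -tzf "$work/backups/incr2.tgz" | tr '\n' ' ')"
echo "diff holds:  $(tar -tzf "$work/backups/diff.tgz" | tr '\n' ' ')"
```

A full restore from the chain needs full.tgz, incr1.tgz and incr2.tgz in order; with the differential, full.tgz plus diff.tgz is enough.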

 What do you say?

Another interesting approach is that taken by tools such as dirvish or
rsnapshot. Both of these tools use rsync to capture snapshots of a
filesystem (either local or remote) to disk. Within the backup archives,
files that have not changed between snapshots are hard linked.

This gives the completeness and ease of restoration of full backups without
requiring nearly as much space to store data. To restore, just copy back
the desired snapshot.
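
The hard-link idea can be shown without either tool. The sketch below is a simplified stand-in, not how dirvish or rsnapshot are actually invoked: it uses GNU cp -al to build the second snapshot as hard links, then replaces only the file that changed. All paths are throwaway demo names.

```sh
#!/bin/sh
# Demo in a throwaway directory; assumes GNU cp (for the -l option).
base=$(mktemp -d)
mkdir "$base/live" "$base/snap"
echo "stable contents" > "$base/live/unchanged.txt"
echo "version 1" > "$base/live/edited.txt"

# Day 1: a plain full copy of the live tree.
cp -a "$base/live" "$base/snap/day1"

# Day 2 starts as hard links to day 1: new directory entries, no new file data.
cp -al "$base/snap/day1" "$base/snap/day2"

# A file changes on the live side. The snapshot must get a NEW file,
# not a write through the shared link, or day 1 would change too.
echo "version 2" > "$base/live/edited.txt"
rm "$base/snap/day2/edited.txt"
cp -a "$base/live/edited.txt" "$base/snap/day2/edited.txt"

# Both snapshots look complete, but unchanged.txt is stored once.
ls -li "$base/snap/day1" "$base/snap/day2"
```

rsync avoids the write-through problem on its own because, unless told otherwise, it writes a changed file to a temporary name and renames it into place.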

This is all for general filesystem backup. For databases, check the
documentation to see what the recommended backup method is.

Adam





Re: incremental backups howto?

2004-12-24 Thread Ron Johnson
On Fri, 2004-12-24 at 23:40 -0500, Adam Aube wrote:
 Joao Clemente wrote:
 
  In the latest thread about Synchronize two servers it was
  talked about incremental backups. Well, can you quick-start me
  in this topic?
 
 An incremental backup is done by backing up all files that have changed
 since the last full or incremental backup. How this file list is tracked
 depends on the backup program used (some might use a filesystem flag,
 others might use modification timestamps).
 
 The downside of incremental backups is that, to do a full restore, you need
 the last full backup and ALL the incremental backups since the last full.
 
 A better alternative is a differential backup, which is all files that have
 changed since the last full backup. This is much easier to restore, because
 all you need is the last full backup and the last differential backup.

Note that "differential" is not a universally used term.  For
example, what VMS BACKUP /INCREMENTAL does is what you call a
differential backup.

-- 
-
Ron Johnson, Jr.
Jefferson, LA USA
PGP Key ID 8834C06B I prefer encrypted mail.

Basically, I got on the plane with a bomb. Basically, I tried to
ignite it. Basically, yeah, I intended to damage the plane.
RICHARD REID, tried to blow up American Airlines Flight 63


