On Thu, Jan 6, 2011 at 9:35 AM, Carl Cook <cac...@quantum-sci.com> wrote:
>
> I am setting up a backup server for the garage, to back up my HTPC in case of 
> theft or fire.  The HTPC has a 4TB RAID10 array (mdadm, JFS), and will be 
> connected to the backup server using GB ethernet.  The backup server will 
> have a 4TB BTRFS RAID0 array.  Debian Testing running on both.
>
> I want to keep a duplicate copy of the HTPC data, on the backup server, and I 
> think a regular full file copy is not optimal and may take days to do.  So 
> I'm looking for a way to sync the arrays at some interval.  Ideally the sync 
> would scan the HTPC with a CRC check to look for differences, copy over the 
> differences, then email me on success.
>
> Is there a BTRFS tool that would do this?

No, but there's a great tool called rsync that does exactly what you want.  :)

This is (basically) the same setup we use at work to backup all our
remote Linux/FreeBSD systems to a central backups server (although our
server runs FreeBSD+ZFS).

Just run rsync on the backup server, tell it to connect via ssh to the
remote server, and rsync / (root filesystem) into /backups/htpc/ (or
whatever directory you want).  Use an exclude file to exclude the
directories you don't want backed up (like /proc, /sys, /dev).
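
For example, a minimal exclude file (the name and location here are
just placeholders) might look like this:

  # /backups/htpc.excludes -- one pattern per line
  /proc/*
  /sys/*
  /dev/*
  /tmp/*

Using /proc/* rather than /proc keeps the empty mount points in the
backup, which makes a full restore a little easier.  Pass the file to
rsync with --exclude-from=/backups/htpc.excludes.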

If you are comfortable compiling software, then you should look into
adding the HPN patches to OpenSSH and enabling the None cipher.  That
can give you a 30-40% increase in network throughput.
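
With the HPN patches in place on both ends, the None cipher can be
enabled just for the rsync transport, e.g. (a sketch; the option names
are the ones from the HPN documentation and may vary by version):

  rsync -e "ssh -o NoneEnabled=yes -o NoneSwitch=yes" ...

Note that the None cipher only turns off encryption of the bulk data
transfer; authentication is still encrypted as usual.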

After the rsync completes, snapshot the filesystem on the backup
server, using the current date for the name.
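
Something like this, assuming /backups/htpc is a btrfs subvolume
(paths and naming are just an example):

  btrfs subvolume snapshot /backups/htpc /backups/htpc@$(date +%Y-%m-%d)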

Then repeat the rsync process the next day, into the exact same
directory.  Only files that have changed will be transferred.  Then
snapshot the filesystem using the current date.

And repeat ad nauseam.  :)

Some useful rsync options to read up on (combined in the example after
this list):
  --hard-links
  --numeric-ids
  --delete-during
  --delete-excluded
  --archive
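
Putting those together, a first pass at the command might look
something like this (hostname, paths, and the exclude file are
placeholders):

  rsync --archive --hard-links --numeric-ids \
        --delete-during --delete-excluded \
        --exclude-from=/backups/htpc.excludes \
        -e ssh root@htpc:/ /backups/htpc/

--numeric-ids keeps the uid/gid numbers from the HTPC instead of
remapping them by name, which matters if the two machines' passwd and
group files differ.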

The first time you run the rsync command, it will take a while, as it
transfers every file on the HTPC to the backup server.  However, you
can stop and restart this process as many times as you like; rsync
will just pick up where it left off.

> Also with this system, I'm concerned that if there is corruption on the HTPC, 
> it could be propagated to the backup server.  Is there some way to address 
> this?  Longer intervals to sync, so I have a chance to discover?

Using snapshots on the backup server allows you to go back in time to
recover files that have been accidentally deleted or corrupted.
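
Restoring is then just a copy out of whichever snapshot you want, for
example (snapshot name and file path are illustrative):

  btrfs subvolume list /backups
  cp -a /backups/htpc@2011-01-05/home/user/somefile /backups/restore/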

Be sure to use rsync 3.x, as that will start transferring data a *lot*
sooner, shortening the overall time needed for the sync.  rsync 2.x
scans the entire remote filesystem first, builds a list of files, then
compares that list to the files on the backup server.  rsync 3.x scans
a couple directories, then starts transferring data while scanning
ahead.

Once you have a working command-line for rsync, adding it to a script
and then using cron to schedule it completes the setup.
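
A bare-bones sketch (paths, schedule, and the mail address are
placeholders; the snapshot naming matches the earlier example), which
also covers the "email me on success" part of your question:

  #!/bin/sh
  # /usr/local/sbin/backup-htpc.sh -- rsync the HTPC, snapshot, mail on success
  set -e
  rsync --archive --hard-links --numeric-ids \
        --delete-during --delete-excluded \
        --exclude-from=/backups/htpc.excludes \
        -e ssh root@htpc:/ /backups/htpc/
  btrfs subvolume snapshot /backups/htpc /backups/htpc@$(date +%Y-%m-%d)
  echo "HTPC backup finished $(date)" | mail -s "HTPC backup OK" you@example.com

And the crontab entry to run it nightly at 03:00:

  0 3 * * *  /usr/local/sbin/backup-htpc.sh

Because of the set -e, the mail only goes out if the rsync and the
snapshot both succeed.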

Works beautifully.  :)  Saved our bacon several times over the past 2 years.
-- 
Freddie Cash
fjwc...@gmail.com