Re: Backup Restore (Was: Re: Getting started with Tarsnap)

2014-02-14 Thread Nick Sivo
> I can't recommend Tarsnap to others as a viable primary backup tool.

Primary backups should always be on-premises, no? Nothing will be
faster than locally attached storage.

-Nick

On Fri, Feb 14, 2014 at 2:02 PM, Scott Wheeler wrote:
> On Feb 13, 2014, at 6:19 PM, Daniel Staal wrote:
>
>> Tarsnap is *online* backups - it has to download the data to do a restore. 
>> The problem could be your connection, the server, something upstream, etc. 
>> It's possible there's a problem Colin can/should fix, but that can't be 
>> determined from your posting.  We need to know exactly what you are seeing.
>
> This is definitely a Tarsnap issue.  Restores are extremely slow.  I usually 
> see around 1.5 Mbit/s on a 100 Mbit connection, which jibes with what Vijay 
> reported (~1.3 Mbit/s).  I brought this up with Colin a couple years ago and 
> he said that it's an issue of "a lack of pipelining in Tarsnap's archive 
> reading".  You can however run multiple extractions in parallel.
>
> This remains my biggest pain point with Tarsnap -- it would still take us an 
> unseemly amount of time to restore our customer data set (~35 GB -- which 
> would take approximately 53 hours to restore without splitting up the restores) 
> in a disaster scenario.  As such we currently have a snapshot that gets 
> overwritten daily and Tarsnap for the sequential daily backups, but this 
> issue remains the reason that I can't recommend Tarsnap to others as a viable 
> primary backup tool.
>
> -Scott
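
The restore-time figure and the parallel-extraction workaround above can be
sketched in shell. This is only a rough illustration under stated
assumptions: sizes are decimal gigabytes, throughput is assumed to scale
linearly with the number of parallel extractions (not confirmed in the
thread), and the TARSNAP variable is a hypothetical hook for dry runs that
defaults to the real tarsnap binary. Archive and path names are placeholders.

```shell
#!/bin/sh
# Estimate restore time in hours.
#   $1 = data size in decimal GB
#   $2 = observed throughput per extraction in Mbit/s
#   $3 = number of parallel extractions (optional, default 1)
# Assumes throughput scales linearly with the number of extractions.
restore_hours() {
    awk -v gb="$1" -v mbit="$2" -v streams="${3:-1}" 'BEGIN {
        seconds = gb * 8 * 1000 / (mbit * streams)
        printf "%.1f\n", seconds / 3600
    }'
}

# Extract several paths from one archive at the same time, one tarsnap
# process per path, then wait for all of them to finish.
# TARSNAP is a hypothetical override for dry runs (e.g. TARSNAP=echo).
parallel_restore() {
    archive=$1
    shift
    for item in "$@"; do
        "${TARSNAP:-tarsnap}" -x -f "$archive" "$item" &
    done
    wait
}
```

With the numbers quoted above, `restore_hours 35 1.5` gives about 52 hours,
in the same ballpark as the ~53 hours Scott mentions, and
`restore_hours 35 1.5 4` gives about 13 hours if four extractions really do
run at full speed side by side.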


Re: Automated tarsnap backups.

2014-02-14 Thread Nick Sivo
I perform 4 automated backups per day. I used tarsnap-keymgmt to make
a key with only rw access. My thinking:

* Access to the key (root on the server) implies access to the data on
the server, so read is already granted.
* Write since it's needed to make backups.

In this case the only thing worse than getting root on the box (and
reading the tarsnap key) would be deleting the data *and* all backups,
which this specifically prevents.
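
A minimal sketch of creating such a key, assuming a full-permission master
key already exists and is kept somewhere other than the server. The file
paths and the TARSNAP_KEYMGMT override (a hypothetical hook for dry runs)
are illustrative assumptions, not part of my actual setup:

```shell
#!/bin/sh
# Derive a key that can read and write archives but cannot delete them.
#   $1 = existing master key file (hypothetical path supplied by the caller)
#   $2 = output key file to place on the server
# -r and -w grant read and write; leaving out -d withholds delete access.
make_backup_key() {
    master=$1
    out=$2
    "${TARSNAP_KEYMGMT:-tarsnap-keymgmt}" --outkeyfile "$out" -r -w "$master"
}
```

For example, `make_backup_key /root/master.key /root/rw.key`, after which
only the second file stays on the box and is passed to tarsnap with
--keyfile.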

-Nick

On Fri, Feb 14, 2014 at 10:43 AM, Joshua Kolash wrote:
> Curious Question for people who use tarsnap for automated backups.
>
> I assume most people just have the keyfile as unencrypted, as it doesn't
> require any prompting.
>
> Does anyone keep the keyfile encrypted and have automated backups?
>
> I'm imagining the following server setup.
>
> Have a BackupBox with the encrypted keyfile and the backup contents.
>
> Have a PasswordBox with the password to the keyfile and have the PasswordBox
> simply ssh into the BackupBox and enter the password into tarsnap on a
> regular basis. The PasswordBox can then be sealed off except for
> re-initializing the password and ssh schedule. In effect it is like having a
> single purpose ssh-agent that lasts forever for narrowly defined tasks.
>
> Does anyone do anything like this? Or is this needless complexity for little
> if any security gain? You still need to trust BackupBox to not be evil.
>
> As I want automated backups I think the only point to encrypting the keyfile
> would be for the printed paper backup.


Re: Pre-flight check....

2014-02-14 Thread Nick Sivo
I've found tarsnap to be most effective for multiple point-in-time
backups of uncompressed, mostly unchanging data. Any file type is
supported, but alternate file streams or other unique filesystem
specific features may not be supported. If you don't know if you're
using these, then you're not :)

Uncompressed because tarsnap does compression just fine, and
compression of all your data usually achieves a better ratio than
compressing individual files.

Mostly unchanging because tarsnap is smart enough to re-use blocks
it's seen before - I keep 4 backups per day of around 40GB, but each
archive is only an additional 200-400MB (before compression) because
most of the data is unchanged:

root@znews:~ # tarsnap --keyfile ... --print-stats

                                       Total size  Compressed size
All archives                               1.3 TB           310 GB
  (unique data)                            395 GB            31 GB

So despite being able to restore any of the 80 backups in the 1.3TB
set I'm only paying to store 31GB. Win.
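
The arithmetic behind that, as a small sketch. It assumes decimal units
(1.3 TB taken as 1300 GB) and uses the rounded figures from the stats
output above, so the result is approximate:

```shell
#!/bin/sh
# Compare the logical size of all archives with what is actually stored.
#   $1 = total size of all archives, in GB
#   $2 = compressed size of the unique data, in GB
dedup_summary() {
    awk -v total="$1" -v stored="$2" 'BEGIN {
        printf "%.1fx smaller, %.1f%% stored\n", total / stored, 100 * stored / total
    }'
}
```

`dedup_summary 1300 31` reports roughly a 42x reduction, i.e. about 2.4%
of the logical size is actually stored and billed.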

-Nick

On Fri, Feb 14, 2014 at 9:22 AM, John Gamble wrote:
> Hi,
>
> I'm about to try out Tarsnap for the first time this weekend, but before
> doing so, wanted to check on a couple of things.
>
> 1).  Is Tarsnap suitable for _all_ file types, or are there any types of
> files that it has problems with?  I'm intending to back up quite a different
> variety of file types - text, photographs, music, and so on, and thought it
> might be better to ask this question first.
>
> 2).  Does the archiving process ( i.e. use of the 'tar' command) alter the
> original files in any way?  Again, I'm pretty sure it doesn't, but thought
> it better to be safe than sorry.  I was concerned that if, for some reason,
> I couldn't get the archiving to work, it might damage the original files.
>
> Thanks for any advice on these matters.
>
> Regards,
>
> John Gamble
>
>
>
> -- The Wellcome Trust Sanger Institute is operated by Genome Research
> Limited, a charity registered in England with number 1021457 and a company
> registered in England with number 2742969, whose registered office is 215
> Euston Road, London, NW1 2BE.


Re: Using the --snaptime option

2014-01-23 Thread Nick Sivo
I asked Colin about this earlier, and have included his email below.
Since ZFS tracks when the snapshot was taken, I use the following in a
zsh script to create the file:

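  # Marker file; its mtime is set below to the snapshot creation time.
  snaptimetemp=`mktemp /tmp/snaptime.`
  latest=
  # -p makes zfs print the creation time as seconds since the epoch.
  timestamp=`zfs get -Hp -o value creation $latest`
  log "Snapshot creation timestamp: $timestamp (`date -jr $timestamp`)"
  # FreeBSD date: -r takes epoch seconds, -j only formats (no clock change).
  touch -t `date -jr $timestamp '+%Y%m%d%H%M.%S'` "$snaptimetemp"
  tarsnap -c --keyfile "$2" -f "$latest" --snaptime "$snaptimetemp" \
    --print-stats --totals "$backuproot"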

You're really just interested in the timestamp and touch parts, but I
included the rest for context.

===
Colin's Reply:
===

Hi Nick,

On 04/29/13 13:57, Nick Sivo wrote:
> I'm working on setting up nightly backups of our live server.  We run
> FreeBSD and I've configured a nightly UFS snapshot to be taken, and
> I'd like to back up some directories from that snapshot.  Can I just
> mount it and point tarsnap to the right place?  Do I need the
> --snaptime option?  If so, can I just set it to the timestamp of the
> snapshot itself?

You need the --snaptime option, and it should point at something with a
modification time prior to when the snapshot creation started.  The
easiest way to do this is probably to touch a file on the filesystem
immediately before taking the snapshot, then to use that file as the
snaptime marker.
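
A minimal sketch of that approach for a ZFS dataset, under stated
assumptions: the marker is created with mktemp (so its modification time
is set before the snapshot is taken), the marker file name is made up for
illustration, and the ZFS and TARSNAP variables are hypothetical hooks for
dry runs that default to the real commands. Key file and other options are
left out for brevity.

```shell
#!/bin/sh
# Create a marker file, take the snapshot, then archive the snapshot
# using the marker as the --snaptime reference.
#   $1 = dataset (e.g. tank/data)
#   $2 = snapshot name, also used as the archive name
#   $3 = directory where the snapshot contents can be read
backup_snapshot() {
    dataset=$1
    snapname=$2
    snapdir=$3
    # mktemp creates the file now, so its mtime precedes the snapshot.
    marker=$(mktemp "${TMPDIR:-/tmp}/tarsnap-marker.XXXXXX") || return 1
    "${ZFS:-zfs}" snapshot "$dataset@$snapname" &&
        "${TARSNAP:-tarsnap}" -c -f "$snapname" --snaptime "$marker" "$snapdir"
    rc=$?
    rm -f "$marker"
    return "$rc"
}
```

For a dataset mounted at /tank/data this could be called as
`backup_snapshot tank/data nightly /tank/data/.zfs/snapshot/nightly`.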

> To the best of my knowledge tarsnap shouldn't need to know that it's
> reading from a snapshot, but I figured I'd ask since there's an
> option.

This is necessary because of Tarsnap's "this file hasn't changed since I
last read it so I won't waste time re-reading it" logic.  Normally if a
file has the same path, size, inode #, and modification time, Tarsnap
assumes that it hasn't changed and doesn't need to be re-read; but the
file times have limited precision, so the following situation is possible:

t=12345.0001: file is modified
t=12345.0100: tarsnap reads file
t=12345.0500: file is modified again
t=54321.0000: tarsnap is run again and encounters file again

in which case the second tarsnap run will see a timestamp of "12345" even
though it's a "later" 12345.

To prevent this race condition, tarsnap flags files with modification
times equal to when tarsnap is running; however, that doesn't work in the
following case:

t=12345.0001: file is modified
t=12345.0100: filesystem snapshot #1 is created
t=12345.0500: file is modified again
t=13000.0000: tarsnap is run against filesystem snapshot #1
t=50000.0000: filesystem snapshot #2 is created
t=54321.0000: tarsnap is run against filesystem snapshot #2

since both snapshots will show the file with an mtime of 12345.  Tarsnap
uses the snaptime marker to identify cases like this.

In short: It's a sneaky race condition which you'd have to be very unlucky
to hit, but using the --snaptime option allows tarsnap to identify when it
might be running into the race and play it safe instead.
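
The timelines above can be played through with a few lines of shell. This
is only an illustration of the reasoning: the one-second precision and the
"not strictly older than the marker" rule are assumptions made for the
sketch, not a description of Tarsnap's actual implementation.

```shell
#!/bin/sh
# Drop the fractional part, as a timestamp stored with one-second
# precision would.
stored_mtime() {
    printf '%s\n' "${1%%.*}"
}

# True when two modification instants look identical once stored.
same_stored_mtime() {
    [ "$(stored_mtime "$1")" = "$(stored_mtime "$2")" ]
}

# Conservative rule (assumed for illustration): a file whose stored mtime
# is not strictly older than the marker's may have changed again within
# the same second after the snapshot, so it should be re-read.
#   $1 = file modification time, $2 = marker modification time
must_reread() {
    file_mtime=$(stored_mtime "$1")
    marker_mtime=$(stored_mtime "$2")
    [ "$file_mtime" -ge "$marker_mtime" ]
}
```

Here `same_stored_mtime 12345.0001 12345.0500` succeeds, which is exactly
why the two modifications cannot be told apart, and
`must_reread 12345.0001 12345.0000` succeeds because the file was modified
in the same second the marker was touched.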

On Thu, Jan 23, 2014 at 3:43 AM, Albert Peschar  wrote:
> Hi all,
>
> I've been using tarsnap for some months to do all kinds of backups.
> Now, I'd like to use tarsnap to archive ZFS filesystem snapshots. What
> is unclear to me is whether and how I should be using the --snaptime
> option.
>
> From the manual:
>
>  --snaptime file
>  (c mode only) This option MUST be specified when
>  creating a backup from a filesystem snapshot, and
>  file must have a modification time prior to when
>  the filesystem snapshot was created.  (This is necessary
>  to prevent races between file modification
>  and snapshot creation which could result in tarsnap
>  failing to recognize that a file has been modified.)
>
> Should I just specify any file that hasn't been modified during the
> snapshot?
>
> And would someone be willing to explain why this is at all necessary?
>
> Thanks,
>
> Albert