Re[11]: [rdiff-backup-users] Verify times increasing

2009-12-08 Thread listserv . traffic
I know this discussion ended, in practical terms, a while ago, but I
started some thoughts back then and didn't have time to finish them.
I finished them today, and thought they might be helpful in a conceptual
way.

One way to cut down on verify times would be to limit the number of deltas in 
the system.

(That's why you can verify a tape backup in only 2x the non-verify
times - you don't have to compute anything for each file.)

But you'll need more space to do that, which will make things more expensive.

But I think you're up against the "cheap, fast, well-done - pick any two" rule.

I'd guess it will still be cheaper to break the system down by
quarters or months than going to tape. (You're trading compute time
for space [or data/delta size]. If the rdiff compute time becomes
excessive, you can trade back in space: reduce the number of
rdiffs in the repository. This should allow you to run a recursive
verify on the whole thing in a reasonable time.)

Use a repository for, say, a quarter and put all the diffs into the same 
repository.

Use a recursive script to verify the whole repository - as often as needed. 

Obviously the longer the time-frame of the run, the longer a comprehensive full 
verify will take at the end of the time-period.
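The recursive verify script mentioned above might look something like this (a sketch only; it assumes rdiff-backup's documented --list-increments, --parsable-output, and --verify-at-time options, with --parsable-output printing one "timestamp type" pair per line in epoch seconds, and the repository path is a placeholder):

```shell
#!/bin/sh
# Sketch: verify every increment in an rdiff-backup repository in turn.
# $REPO is a placeholder; adjust to your repository path.
REPO=${1:-/backups/quarterly-repo}

verify_all() {
    # Reads "timestamp type" lines on stdin and verifies each increment.
    while read -r stamp _; do
        if rdiff-backup --verify-at-time "$stamp" "$REPO" >/dev/null 2>&1; then
            echo "OK   $stamp"
        else
            echo "FAIL $stamp"
        fi
    done
}

# One line per increment, oldest first, timestamps as epoch seconds.
rdiff-backup --parsable-output --list-increments "$REPO" 2>/dev/null | verify_all
```

Run weekly (or as often as the repository size allows); any FAIL line points at an increment whose delta chain no longer reconstructs cleanly.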

Once you reach the end of the period, you simply backup that repository to a 
couple of off-site media and you're good. You keep that archive repository for 
the length of your backup window.

Your window of time limit for a single repository will simply be how long you 
can tolerate for a comprehensive --verify.

If you wanted, say, a year's worth of file history, and you could have a month 
in each repository before it grows unverifiable in the time you have to do so, 
here would be the requirements as I see them.

You'd need 12/24 disks for each month's repository to go offsite (12 or 24 
depending on whether you want redundant copies of each repository). We'll call 
these os1-os12 [offsite 1-12] and osr1-osr12 [offsite redundant 1-12].

You'd need 3 disks for the current set and its remote backup. We'll call 
these ol1 [online 1], rem1 [remote 1], and remr1 [remote redundant 1].

So, you backup to ol1 each night (...or whatever your backup period is - it 
could be every hour if you want, or just once a week...)

Between backups, you'll sync the repository growing on ol1 to rem1. You'll also 
sync between rem1 and remr1 so you have three copies of the current repository.

On a weekend, you'll do a full recursive --verify of the ol1 repository.

At the end of the month, you'll sync remr1 to os1 and osr1. (Take the whole 
month repository and make a copy of it to your monthly off-site set of disks.)

You'll kill the current repository on ol1. (Erase it, and start over.)

Then you start the process over - building a repository with a month's worth of 
data in it.

Then sync to the next set of off-site [and redundant] disks.

When you get to the end of the year, obviously you'll start recycling the 
oldest os and osr disks.
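The nightly/weekly/monthly cycle above could be wired up with cron. A hypothetical crontab sketch follows (treating rem1/remr1 as hostnames, with paths, schedule, and the simplified rdiff-backup/rsync invocations all invented for illustration; "1M" is rdiff-backup's documented interval syntax for "one month ago"):

```shell
# Nightly backup from the source data to the online repository (ol1).
30 1 * * *  rdiff-backup /data /Volumes/ol1/repo
# Nightly sync of the growing repository to the remote copy (rem1)...
30 3 * * *  rsync -aH --delete /Volumes/ol1/repo/ rem1:/repo/
# ...and from rem1 to the redundant remote copy (remr1).
30 5 * * *  ssh rem1 'rsync -aH --delete /repo/ remr1:/repo/'
# Weekend comprehensive verify of the oldest increment on ol1,
# which exercises the full delta chain for every file in that set.
0 8 * * 6   rdiff-backup --verify-at-time 1M /Volumes/ol1/repo
```

The month-end steps (sync remr1 to os1/osr1, then wipe ol1) are better done by hand, since they involve physically swapping disks.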

If you have to do a restore from more than a month ago, you'll grab the 
appropriate set of os or osr disks and restore the file.

If the restore is in the last 30 days, you'll just do it out of the current 
repository.

This method is still a lot cheaper than going to tape for large volumes of data 
that only have moderate changes in them. [i.e. for data-sets that are handled 
well by rdiff-backup.]

If the data set isn't handled well, then you won't get much of a trade-off by 
doing more compute cycles for space, and rdiff-backup isn't for you.

---

Hope that's helpful.

-Greg




Re: Re[6]: [rdiff-backup-users] Verify times increasing

2009-11-25 Thread Daniel Miller

I'm not sure what you're doing with your --verify...

It *sounds* like you want a full CRC style check of the *current*
files after the backup is complete. (i.e. File X gets updated with a
delta, and you want to verify that file X is the same both on the
source and destination locations/drives.)



Yes, although it's more of an internal consistency check within the
rdiff-backup repository itself. I'm looking for a way to quickly
verify the integrity of my entire rdiff-backup repository.



In my scenario the repository is synced to an external USB drive that
gets rotated each day (i.e. each day I put yesterday's drive in
storage and bring a different drive out of storage to use for the
next backup). I use rsync to transfer my rdiff-backup repository
(which gets updated daily) to the USB drive. Then I run rdiff-backup
--verify-at-time to verify that the files on the USB drive are not
corrupt. But lately this has been taking too long.



Does that make sense?


Yes, and the USB connection may explain the longish verify times,
since it's somewhat slow, compared to a SATA drive connected directly
to the controller...


USB probably does have something to do with how long it takes. But on
the other hand, yafic can do a full verify in 1/4 of the time on the
same drive with the same data, etc. So maybe rdiff-backup could be
made faster?



But I see that you want to verify the local RDiff repository against the
off-line one.


I'm not sure what you mean by this statement... I want to do an  
internal consistency check on my rdiff-backup repository after it's  
been rsync'd to the USB disk. I need to be sure that the data on the  
USB disk is valid. I am doing the verify on the USB drive because that  
is the last place that the data will be copied before it goes into  
secure storage (for up to a month, but normally just a few days).  
Maybe an outline of my data flow will help you to understand what I'm  
trying to accomplish.


First the hardware:
- Xserve with raid array - this is being backed up with rdiff-backup
- Firewire 800 drive attached to Xserve - staging location for rdiff-backup 
repository, gets a new revision each night

- Mac Mini - remote backup server
- USB 2.0 drive attached to Mac Mini - gets a copy of the rdiff-backup  
repo from the Firewire 800 drive on the Xserve


Now the data flow:
- Xserve runs rdiff-backup from raid array to local firewire drive
- Xserve runs rdiff-backup --verify-at-time 0B on local firewire drive  
to verify integrity of most recent revision (this step may not be  
necessary)
- Mac Mini runs rsync to copy rdiff-backup repo from Xserve firewire  
800 drive to local USB drive
- Mac Mini would now like to verify the integrity of the rdiff-backup  
repository that it just rsync'd to the USB drive


During this last step I would rather not tie up any resources on the  
Xserve. Instead, I want to do a fully local (to Mac Mini) verification  
of the rdiff-backup repository. This verification should let me know  
if any link in the (hardware) chain is failing: is the firewire 800  
staging drive failing? is the USB drive failing?



Not sure how to do that - I'd guess you could do it with some other
tools - not storing the hashes - just a full compare each time. (How
big is the repository? [I think you said, but I don't recall.])


100 GB mirror + 80 GB of rdiff data. So almost 200 GB


---
But I'd guess your local repository isn't on the same disks as the
data, right?


Right.


If so, then it's probably not a huge deal if it takes 20 hours to
check the local repository against the remote. [Though I guess all
that disk channel activity might impact other disk through-put too...]


The drive will be moved to a secure location, so it needs to happen as  
quickly as possible. If we have a disaster (fire, etc.) a backup  
doesn't do us much good if the most recent snapshot is still online  
being verified (and hence consumed by the fire).



(Add a controller? Dunno...)

I use a similar system and I don't verify the local repository to the
remote, though perhaps I should. (I trust rsync to make sure they're
the same...since it's not just copying the files - it's doing hash
matches like RDiff...)


Even if rsync verifies that they're the same, this is only a false
sense of security, since the staging repo (the source that rsync copied
from) could be corrupt and you'll never know it. This corruption could
be sneaking into old revisions which you don't bother to verify
because it takes too long. There needs to be some way to verify that
ALL of the data is fully intact after it's been copied...
--verify-at-time almost gets there, but not quite. It could get you
there if you have lots of time to do a verify-at-time for each revision
in the repo, but I'm guessing that would be prohibitively expensive in
most cases.



BTW, is this on a windows platform? (Curious...) Ah, probably not
since yafic isn't... :)


Nope. All machines are running Mac OS. I have 

Re: Re[4]: [rdiff-backup-users] Verify times increasing

2009-11-25 Thread Daniel Miller


On Nov 24, 2009, at 7:01 PM, Alex Samad wrote:

On Tue, Nov 24, 2009 at 04:12:16PM -0500, Daniel Miller wrote:

On Nov 24, 2009, at 3:44 PM, listserv.traf...@sloop.net wrote:


[snip]


In my scenario the repository is synced to an external USB drive
that gets rotated each day (i.e. each day I put yesterday's drive in
storage and bring a different drive out of storage to use for the
next backup). I use rsync to transfer my rdiff-backup repository
(which gets updated daily) to the USB drive. Then I run rdiff-backup
--verify-at-time to verify that the files on the USB drive are not
corrupt. But lately this has been taking too long.


Wouldn't the fact that you are reading lots of information from the USB
drive be slowing you down? Why not run your --verify-at-time on the
local disk repo, and then when using rsync from local to USB use the -c
option to let rsync do a checksum on the files? But you are still going
to run into the slowness of USB drives.
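Spelled out, the -c suggestion above would look something like this (paths are placeholders; --checksum is the long form of -c):

```shell
# Force rsync to compare full file checksums instead of the default
# size+mtime quick check; slower, but catches silent corruption in transit.
rsync -aH --checksum --delete /Volumes/fw800/rdiff-repo/ /Volumes/usb/rdiff-repo/
```

Note that --checksum reads every file on both sides, so it trades the verify cost for transfer-time cost rather than eliminating it.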


See my response to Greg. I think I addressed all these issues in that.

~ Daniel


___
rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki


Re: Re[8]: [rdiff-backup-users] Verify times increasing

2009-11-25 Thread Daniel Miller


On Nov 25, 2009, at 12:25 PM, listserv.traf...@sloop.net wrote:

snip explanation of how rdiff-backup works


Sounds good.


So, a --verify isn't needed to verify the current files. The very
nature of RDB is that they're exact. (provided you trust the RDB
protocol...which we assume.)


OK, I can accept this (and this makes my backup time shorter, nice).


A --verify IS needed when you want to check an older version to be
sure that something hasn't borked your repository for an older delta
set. [But the current files are already verified, IMO]


When and why would I ever use this? If I need to restore an old backup  
it might be nice to know that I have access to good data, but I'll  
take whatever I can get at that point. --verify doesn't seem to be  
very useful to do a general repository health check (bummer).



So, your most important data, the current data, is verified.
[IMO] Progressively older delta sets are each less certain, as they
all get layered on top of each other in reverse order to get to
older sets. [But I consider each older set to be progressively less
important - at least in general.]


I half agree here. I certainly agree that the most important data is  
the most current data. However, I would like to keep (at least) one  
year's worth of backup history, and I need to know that my history is  
good.



So, I see your problem as the following.

1) Verify that the current backup completed properly.
(I do this via logs and exit codes. I don't double-check the
current backup by doing a --verify on the current backup set. I
implicitly trust that RDB does its job properly, that at the end
the hashes will match properly, and that the current remote files
equal the current local files. {i.e. the files that were the
source of the backup equal the backup files})


That's very trusting of you. I guess I'm a little more paranoid since  
my job depends on it :)



2) Verify that your older deltas are as intact as possible. That all
the meta-data, deltas and current files can be merged and rolled back
to whatever desired end-point you want.

(This is where I use --verify - it's not perfect because there's not
a way to check every delta-set for every single file in the
repository - at least not easily. [A recursive loop checking every
version would do that, but as you say, it's going to be very resource
expensive.])


Agreed. This is where I'd like to see a new feature in rdiff-backup.  
I'm willing to write code if I ever get time and no one else does first.



3) Verify that the data is exact from your FW800 drive to the USB
drive on the mac-mini.

(I wouldn't use a --verify for this. As long as the files are equal
from the FW drive to the USB drive, if you can --verify on the FW  
drive

[source] you should be able to --verify on the USB drive too. So I'd
either trust rsync to be sure they're equal - or do something like
you are doing - checking that the FW files are exactly equal to the
USB files.

I'd do a verify on the fastest drive on the most powerful system.
Plus you don't need to do this all the time, say once a week - over a
weekend probably works. [And perhaps a full recursive loop through
all the diffs would be possible. If you write a bash script to do
that, I'd love to have it!])


The bash script would be hugely inefficient. I'd much rather spend the  
time modifying rdiff-backup to support an internal consistency check.


The problem with doing it once a week is that it only ever hits one of  
the drives that is normally in secure storage. It would be a matter of  
weeks or possibly months to make sure that all drives have been  
verified (e.g. each time a particular drive is in use on a Friday).



To recap:
** Trust RDB does the backup properly and that source = destination
without additional checks.

** --verify the backup repository on the FW drive, and as much as
possible that all the older deltas and meta-data are intact and
functioning properly.

** check that the FW drive does copy exactly to the off-site USB
drive - but don't use --verify to accomplish this task. Just make
sure that the off-site repository is exactly equal to the on-site
FW drive.


I never do a direct compare between the two drives. I just use rsync  
to copy from the FW to the USB drive. Here's my concern: without some  
type of regularly executed integrity check of the data on the drive  
(FW or USB), how would I detect that a drive is failing before it is  
catastrophic and the bad data has propagated to all of the redundant  
USB drives? Will rdiff-backup and/or rsync tell me if the drive is  
failing when they do a backup/copy? (I don't think so.) The only way  
to know that the data is good in my setup is to run some type of  
consistency check on the USB drive each day after the rsync is  
complete. If that fails then I know I have a problem somewhere. BTW it  
looks like yafic won't work for me now either; there seems to be a bug  
that causes it to stop half-way through the check  :(
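A daily consistency check of the kind described here can be approximated outside rdiff-backup by recording a hash manifest right after each rsync and re-checking it later (a sketch only; the repo path and manifest name are invented, and macOS would need `shasum` in place of GNU `sha1sum`). The demo below runs against a throwaway directory so the commands are concrete:

```shell
#!/bin/sh
# Throwaway directory standing in for the rsync'd USB repository.
REPO=$(mktemp -d)
echo "increment data" > "$REPO/delta-file"

# 1) Right after the rsync: hash every file in the repository into a
#    manifest stored OUTSIDE the repo (so it doesn't hash itself).
(cd "$REPO" && find . -type f -exec sha1sum {} + > /tmp/repo.manifest)

# 2) Each day (or on the Mac Mini): re-check the files against the manifest.
#    A failing drive shows up as a checksum mismatch here.
(cd "$REPO" && sha1sum -c --quiet /tmp/repo.manifest) && echo "repository intact"
```

This only proves the files haven't changed since the manifest was written; it doesn't prove the rdiff-backup repository is internally consistent, which is the feature gap the thread is circling.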



Re[10]: [rdiff-backup-users] Verify times increasing

2009-11-25 Thread listserv . traffic
[Inline]

 On Nov 25, 2009, at 12:25 PM, listserv.traf...@sloop.net wrote:
 snip explanation of how rdiff-backup works

 Sounds good.

 So, a --verify isn't needed to verify the current files. The very
 nature of RDB is that they're exact. (provided you trust the RDB
 protocol...which we assume.)

 OK, I can accept this (and this makes my backup time shorter, nice).

 A --verify IS needed when you want to check an older version to be
 sure that something hasn't borked your repository for an older delta
 set. [But the current files are already verified, IMO]

 When and why would I ever use this? If I need to restore an old backup
 it might be nice to know that I have access to good data, but I'll  
 take whatever I can get at that point. --verify doesn't seem to be  
 very useful to do a general repository health check (bummer).

Well, the repository is the current files and then the meta-data
and rdiffs to get to previous versions of the files.

It does check the repository. When a backup is done, it stores a SHA1
hash of the source file.

So a --verify that completes successfully does the following:
Takes a current file, applies all the relevant rdiffs as the
meta-data says it should be applied. Once done, it calcs a SHA1 hash
for the restored file and compares it to the stored SHA1 hash of
that file when it was backed up on the relevant date.

If the two match, we know the system worked properly.

So, a --verify back to the oldest backup does do a fairly
comprehensive check - just not an exhaustive one. It does verify that
the meta-data/rdiffs for a lot of the system work, and aren't
corrupt.

Again, it's not deterministic, which I'd like - but it's not half bad
either.

A total, deterministic check would certainly be nice, but I think it's do-able
the way it is now.
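The check described above can be mimicked by hand to make it concrete. This is a conceptual sketch only, using plain diff/patch in place of rdiff-backup's librsync deltas: rebuild the old version from the current file plus a reverse delta, then compare its SHA1 against the hash recorded at backup time.

```shell
#!/bin/sh
# Conceptual stand-in for a successful --verify (not rdiff-backup internals).
cd "$(mktemp -d)"

printf 'old version\n' > old.txt
stored_hash=$(sha1sum old.txt | cut -d' ' -f1)   # hash recorded at backup time
printf 'current version\n' > current.txt         # the file as it is today

diff current.txt old.txt > back.delta || true    # reverse delta: current -> old
patch -s -o restored.txt current.txt back.delta  # rebuild the old version

# If the rebuilt file's hash matches the stored one, the delta chain works.
[ "$(sha1sum restored.txt | cut -d' ' -f1)" = "$stored_hash" ] && echo "verify OK"
```

The point of the sketch is the shape of the guarantee: a match proves the delta and the meta-data for this file reconstruct correctly, but says nothing about files whose deltas weren't exercised.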

 So, your most important data the current data is verified.
 [IMO] Progressively older delta sets are each less certain, as they
 all get layered on top of each other in reverse order to get to
 older sets. [But in general, I consider each older set to be
 progressively less important - at least in general.]

 I half agree here. I certainly agree that the most important data is  
 the most current data. However, I would like to keep (at least) one  
 years worth of backup history, and I need to know that my history is  
 good.

 So, I see your problem as the following.

 1) Verify that the current backup completed properly.
 (I do this via logs and exit codes. I don't double check the
 current backup by doing a --verify on the current backup set. I
 implicitly trust that RDB does its job properly and that at the end
 the hashes will match properly and that the current remote files do
 equal the current local files. {i.e. the files that were the
 source of the backup equal the backup files)

 That's very trusting of you. I guess I'm a little more paranoid since
 my job depends on it :)

Well, RDB creates SHA1 hashes of both files and then compares them, and
if they're different it does all the work to be sure they're the same.

Doing another SHA1 hash compare at the end seems redundant.

Either you trust that the RDB protocol does what it says it does, or
you don't. If you don't, then don't use the tool. [I'm being a bit
bombastic, but I think you get the point...]

And doing a --verify won't get you there, since it's just verifying
the file (reconstructed or not) against the SHA1 hash generated by RDB
at the backup date/time. [If you don't trust RDB, then you shouldn't trust
its stored SHA1 hash or its verify either, K? :)]

 2) Verify that your older deltas are as intact as possible. That all
 the meta-data, deltas and current files can be merged and rolled-back
 to whatever desired end-point you want.

 (This is where I use --verify - it's not perfect because there's not
 a way to check every delta-set for every single file in the
 repository - at least not easily. [A recursive loop checking every
 version would do that, but as you say, it's going to be very resource
 expensive.])

 Agreed. This is where I'd like to see a new feature in rdiff-backup.  
 I'm willing to write code if I ever get time and no one else does first.

Agreed - a deterministic, full-repository check would be excellent!

 3) Verify that the data is exact from your FW800 drive to the USB
 drive on the mac-mini.

 (I wouldn't use a --verify for this. As long as the files are equal
 from the FW drive to the USB drive, if you can --verify on the FW  
 drive
 [source] you should be able to --verify on the USB drive too. So I'd
 either trust rsync to be sure they're equal - or do something like
 you are doing - checking that the FW files are exactly equal to the
 USB files.

 I'd do a verify on the fastest drive on the most powerful system.
 Plus you don't need to do this all the time, say once a week - over a
 weekend probably works. [And perhaps a full recursive loop through
 all the diffs would be possible. If you write a bash script to do
 that, I'd love to have it!])

 The 

Re: Re[8]: [rdiff-backup-users] Verify times increasing

2009-11-25 Thread Alex Samad
On Wed, Nov 25, 2009 at 02:12:37PM -0500, Daniel Miller wrote:
 

[snip]

 
 I never do a direct compare between the two drives. I just use rsync
 to copy from the FW to the USB drive. Here's my concern: without
 some type of regularly executed integrity check of the data on the
 drive (FW or USB), how would I detect that a drive is failing before
 it is catastrophic and the bad data has propagated to all of the
 redundant USB drives? Will rdiff-backup and/or rsync tell me if the
 drive is failing when they do a backup/copy? (I don't think so.) The
 only way to know that the data is good in my setup is to run some type
 of consistency check on the USB drive each day after the rsync is
 complete. If that fails then I know I have a problem somewhere. BTW
 it looks like yafic won't work for me now either; there seems to be
 a bug that causes it to stop half-way through the check  :(
 
 So back to the drawing board (or google) to find a different utility
 to do the integrity check.

This has been a rather informative thread.

Can I suggest a change to what Greg was suggesting? The fastest place
for you to do your check is the Firewire drive (with rdiff-backup). Once
you are happy with that run, use a file checker (on Linux I would use
md5sum or cksfv) to create a checksum for each of the files.

Transfer the checksums over from the Xserve to the Mac Mini and check
them against the files on the USB drive.

The presumption is that the Xserve + Firewire 800 will let you verify
a lot faster than the Mini + USB - hopefully within your allotted time
period. The other way to do this would be rsync -c (let rsync compare
files via checksum).
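Concretely, that scheme might look like this (paths and the `mini` hostname are placeholders; on macOS, `md5` or `shasum` would stand in for GNU `md5sum`):

```shell
# On the Xserve (fast side): hash every file in the Firewire repository.
cd /Volumes/fw800/rdiff-repo && find . -type f -exec md5sum {} + > /tmp/repo.md5

# Ship the checksum list to the Mac Mini...
scp /tmp/repo.md5 mini:/tmp/repo.md5

# ...and on the Mini, check the USB copy against it.
cd /Volumes/usb/rdiff-repo && md5sum -c --quiet /tmp/repo.md5
```

This moves the expensive hashing to the fast machine; the slow USB side only has to read each file once to confirm it matches.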


Or maybe get fw800 drives for the mac mini.

Or (what I do - this depends on your internet connection), I
rdiff-backup to another machine on site and then rsync the rdiff-backup
directory offsite to two other geographical locations. I also use
fusecompress to sit underneath the rdiff-backup destination and find I
get pretty good compression - I actually rsync the compressed data, which
saves me a lot of time.


 
 Thanks a lot for your input and generously patient explanations,
 Greg. I do value your input.
 
 ~ Daniel
 
 
 
 

-- 
In the long run, every program becomes rococo, and then rubble.
-- Alan Perlis



Re: [rdiff-backup-users] Verify times increasing

2009-11-24 Thread Dominic Raferd

listserv.traf...@sloop.net wrote:
 I'm not aware, so if I'm wrong perhaps someone could correct me, but
 I'd like a command to, in essence, do a comprehensive
 --verify-all-files-in-the-archive. [I'm pretty sure such a thing
 doesn't exist, at least I never saw it in the docs.]

 This would apply all deltas to *all* files (back to the oldest copy)
 and compare the stored hashes at the time of backup to the rebuilt
 file. [Note all the files, not just those in a particular target
 date/delta.]

 This wouldn't verify that every file would be correct in every delta
 version, but it would, I think, get as close as one might come to
 that.

I think it is an excellent proposal (as in, how come rdiff-backup 
doesn't already offer that?). But development of rdiff-backup seems to 
have gone quiet recently; I guess Andrew has been busy?


Dominic




Re: [rdiff-backup-users] Verify times increasing

2009-11-24 Thread Alex Samad
On Tue, Nov 24, 2009 at 09:00:48AM +, Dominic Raferd wrote:
 listserv.traf...@sloop.net wrote:
  I'm not aware, so if I'm wrong perhaps someone could correct me, but
  I'd like a command to, in essence, do a comprehensive
  --verify-all-files-in-the-archive. [I'm pretty sure such a thing
  doesn't exist, at least I never saw it in the docs.]
 
  This would apply all deltas to *all* files (back to the oldest copy)
  and compare the stored hashes at the time of backup to the rebuilt
  file. [Note all the files, not just those in a particular target
  date/delta.]
 
  This wouldn't verify that every file would be correct in every delta
  version, but it would, I think, get as close as one might come to
  that.
 
 I think it is an excellent proposal (as in, how come rdiff-backup

When you talk about hashes, are you talking about something similar to an
md5sum of the whole file, or just blocks of the file?


I think it would be good to have an md5sum (or something similar) attached
to each backed-up file for each backup.

 doesn't already offer that?) But development of rdiff-backup seems
 to have gone quiet recently, I guess Andrew has been busy?
 
 Dominic
 
 
 




Re: [rdiff-backup-users] Verify times increasing

2009-11-24 Thread Gavin
Daniel Miller wrote:
 Why is the time needed to verify a three-month-old backup not leveling
 off? And is there a way to bring down my verification times but still
 be sure that my backup archives are not becoming corrupt due to
 decaying storage media, etc? Is there some other method of
 verification that I could use, perhaps not even related to rdiff-backup?

 ~ Daniel 
I do seem to remember a discussion on the list about using base tools to
verify data. First, you could check that everything that is compressed
passes that tool's checks; that should be pretty quick.
Second, search the archive; there might be something.
Thirdly, keep a second remote backup (I'm sure you already do);
rdiff-backup is probably more reliable than hard drives.

Sorry I don't have a script ready to go and it may take some wading
through the rdiff-backup docs :-)

Another thought is http://code.google.com/p/archfs/
if you can at least browse the files there is a good chance that the
backup is mostly intact.

Also consider using rdiff-backup via a tool like backupninja and split
the backup into more manageable sized parts or run multiple scripts to
backup smaller parts.

Some ideas
Gavin




Re: [rdiff-backup-users] Verify times increasing

2009-11-24 Thread Daniel Miller


Gavin wrote:

Daniel Miller wrote:
Why is the time needed to verify a three-month-old backup not leveling
off? And is there a way to bring down my verification times but still
be sure that my backup archives are not becoming corrupt due to
decaying storage media, etc? Is there some other method of
verification that I could use, perhaps not even related to
rdiff-backup?

~ Daniel

I do seem to remember a discussion on the list about using base tools to
verify data. First you could check that everything that is compressed
passes that tool's checks; that should be pretty quick.


I found a utility called yafic (http://www.saddi.com/software/yafic/)  
which allows me to verify the entire backup drive. It's pretty fast,  
and for my situation it provides a better solution than rdiff-backup  
because it allows me to verify arbitrary files of my choosing rather  
than just those that rdiff-backup is managing. Note that it would  
still be more convenient to have a faster and more robust verify  
mechanism built into rdiff-backup.



snip

Another thought is http://code.google.com/p/archfs/
if you can at least browse the files there is a good chance that the
backup is mostly intact.

Also consider using rdiff-backup via a tool like backupninja and split
the backup into more manageable sized parts or run multiple scripts to
backup smaller parts.


Thanks for your suggestions Gavin. I'll continue to look into ways to  
simplify my backup, which just got another layer of complexity and  
potential for failure with yafic... :)


~ Daniel





Re[4]: [rdiff-backup-users] Verify times increasing

2009-11-24 Thread listserv . traffic
I mis-posted, and should have replied to the list, instead of just
Daniel...so here it is...

---
I'm not sure what you're doing with your --verify... [I'm confused, I
think...]

It *sounds* like you want a full CRC style check of the *current*
files after the backup is complete. (i.e. File X gets updated today with a
delta, and you want to verify that file X is the same both on the
source and destination locations/drives after the backup is complete.)

If that's the case, I think you already get it. (It's built-in to
RDiff-backup.)

Before I type a lot of drivel, let me know if that's what you
want/intend.

--
If I understand you right, this is a very different animal than a --verify...



-Greg

 Greg wrote:
 I know Matt corrected this post, but I wanted to address this:

 ---
 If you do a --verify-at-time xyz where xyz is your oldest backup, it
 should verify all files in that backup - so every delta should be
 applied. This should verify that all deltas (backups) are good and
 functioning.

 [In short, it verifies that for each file for which a successful
 verify is returned, the most current file, all applicable deltas,
 and meta-data are good and functioning properly.]

 This is what I thought. It is good to have it confirmed. Thanks.

 ---
 However, if files were added after the initial backup, I'd guess that
 a verify won't check the deltas for those files - since they don't
 exist in the set at time xyz.

 So, while a verify of your oldest backup is good, it's not
 comprehensive for all files that have deltas and meta-data.

 This seems to be a weakness in the rdiff-backup verify mechanism. I  
 think there is general consensus here on the list that this could be  
 improved since what many people are looking for is a way to verify  
 that their backup archive (including all past revisions) is free from
 corruption.

 ---
 I'm not aware, so if I'm wrong perhaps someone could correct me, but
 I'd like a command to, in essence, do a comprehensive
 --verify-all-files-in-the-archive. [I'm pretty sure such a thing
 doesn't exist, at least I never saw it in the docs.]

 This would apply all deltas to *all* files (back to the oldest copy)  
 and compare the stored
 hashes at the time of backup to the rebuilt file. [Note all the
 files, not just those in a particular target date/delta.]

 This wouldn't verify that every file would be correct in every delta
 version, but it would, I think, get as close as one might come to
 that.

 I agree, something like this would be great. Although with the speed
 issues I'm having, it may not be practical (i.e. time-feasible) to
 reconstruct every file this way before comparing it to a signature
 hash. I would propose that rdiff-backup store some additional
 meta-data consisting of signature hashes of the delta files as
 they exist on disk after rdiff-backup finishes a backup
 (similar to what yafic does - http://www.saddi.com/software/yafic/).
 This should make the verification process much faster (yafic takes
 less than two hours to verify an rdiff-backup repo that takes over
 eight hours to --verify-at-time on my setup). Note that it would not
 replace the --verify-at-time functionality, which would still be
 necessary to verify the integrity of files as they existed before the
 backup. But it would provide a fast way to verify the integrity of an
 rdiff-backup repository.

 Then again, doing an intermediate check of the hash and file at each
 delta point wouldn't take too much longer [or so I think without a
 lot of time invested in pondering it] - so if this option/feature  
 doesn't
 exist and one were to code it, it might not be much more code or
 difficulty...

 Ease of implementation may be an argument that favors your proposal.  
 My proposal adds a completely new layer of integrity checking on top  
 of the existing rdiff-backup functionality.

 ~ Daniel






-- 
Best regards,
 listserv  mailto:listserv.traf...@sloop.net





Re[4]: [rdiff-backup-users] Verify times increasing

2009-11-24 Thread listserv . traffic
[Now I'm bottom posting... :)]

 ---
 I'm not aware, so if I'm wrong perhaps someone could correct me, but
 I'd like a command to, in essence, do a comprehensive
 --verify-all-files-in-the-archive. [I'm pretty sure such a thing
 doesn't exist, at least I never saw it in the docs.]

 This would apply all deltas to *all* files (back to the oldest copy)  
 and compare the stored
 hashes at the time of backup to the rebuilt file. [Note all the
 files, not just those in a particular target date/delta.]

 This wouldn't verify that every file would be correct in every delta
 version, but it would, I think, get as close as one might come to
 that.

 I agree, something like this would be great. Although with the speed
 issues I'm having, it may not be practical (i.e. time-feasible) to
 reconstruct every file this way before comparing it to a signature
 hash. I would propose that rdiff-backup store some additional
 meta-data consisting of signature hashes of the delta files as
 they exist on disk after rdiff-backup finishes a backup
 (similar to what yafic does - http://www.saddi.com/software/yafic/).
 This should make the verification process much faster (yafic takes
 less than two hours to verify an rdiff-backup repo that takes over
 eight hours to --verify-at-time on my setup). Note that it would not
 replace the --verify-at-time functionality, which would still be
 necessary to verify the integrity of files as they existed before the
 backup. But it would provide a fast way to verify the integrity of an
 rdiff-backup repository.

Let me address this. Simply checking a hash of the delta isn't
nearly enough. If the meta-data on how to apply that delta is gone or
corrupt, you're screwed too.

So, if you're going to calc and store a hash, you should store a hash
of both the meta-data and the delta.

Small nit, but thought I should mention it.
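Greg's point - that a checksum has to cover the meta-data as well as the delta - can be sketched as a combined digest. This is purely illustrative (rdiff-backup stores no such hash; the function name is hypothetical):

```python
import hashlib

def combined_digest(delta_bytes, metadata_bytes):
    """Hash a delta together with the meta-data that describes how to
    apply it. Each part is length-prefixed so that different
    (delta, meta) pairs cannot collide by shifting bytes between them."""
    h = hashlib.sha256()
    for part in (delta_bytes, metadata_bytes):
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()
```

With this, corruption in either the delta or its meta-data changes the stored digest, so a later verify catches both failure modes.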

[I should note that I have never examined the code, so I'm speaking
from a theoretical point of view - but I've asked about these things
pretty carefully, so I'm fairly sure I'm clear on how things are
handled... I'm happy to be corrected if I'm wrong.]

-Greg





Re: Re[4]: [rdiff-backup-users] Verify times increasing

2009-11-24 Thread Daniel Miller

On Nov 24, 2009, at 3:44 PM, listserv.traf...@sloop.net wrote:


I'm not sure what you're doing with your --verify...

It *sounds* like you want a full CRC style check of the *current*
files after the backup is complete. (i.e. File X gets updated with a
delta, and you want to verify that file X is the same both on the
source and destination locations/drives.)


Yes, although it's more of an internal consistency check within the
rdiff-backup repository itself. I'm looking for a way to quickly
verify the integrity of my entire rdiff-backup repository.


In my scenario the repository is synced to an external USB drive that
gets rotated each day (i.e. each day I put yesterday's drive in
storage and bring a different drive out of storage to use for the next
backup). I use rsync to transfer my rdiff-backup repository (which
gets updated daily) to the USB drive. Then I run rdiff-backup
--verify-at-time to verify that the files on the USB drive are not
corrupt. But lately this has been taking too long.


Does that make sense?

Yesterday I introduced a utility called yafic into my backup scheme.
Yafic can do a full-repository verification. This works, and it's much
faster than rdiff-backup's --verify-at-time, but it's complicated to
set up, and I have to ignore all the changes that happen each day when
rdiff-backup updates the repository. It would be nicer to have this
kind of verification built into rdiff-backup so I wouldn't have to
filter out all the new delta and metadata files; rdiff-backup knows
which files were added/changed/deleted and would not report those
changes like yafic does. With my proposed enhancement, rdiff-backup
would only report warnings or errors if any part of the repository
became corrupt.
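The proposed enhancement amounts to keeping a checksum manifest of the repository's on-disk files. A minimal Python sketch of the idea (hypothetical helper names; rdiff-backup provides no such feature itself):

```python
import hashlib
import os

def build_manifest(repo_root):
    """Walk the repository and record a SHA-256 hash for every file.

    Returns a dict mapping relative path -> hex digest. In the proposal
    this would cover the delta and meta-data files as they exist on
    disk right after a backup finishes.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, repo_root)
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            manifest[rel] = h.hexdigest()
    return manifest

def verify_manifest(repo_root, manifest):
    """Re-hash every recorded file and return the paths that changed
    or disappeared since the manifest was built."""
    current = build_manifest(repo_root)
    return sorted(p for p, digest in manifest.items()
                  if current.get(p) != digest)
```

Building the manifest right after each backup and re-checking it later only costs one linear read of the archive, which is why this kind of check is so much faster than reconstructing old file versions.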



If that's the case, I think you already get it. (It's built-in to
RDiff-backup.)


This is good to know. Does this happen during the backup, or only  
during --verify... ? I assume you're talking about something  
equivalent to --verify-at-time 0B ? Of course, this would only verify  
the current mirror.


BTW, is this documented? I'm going to feel stupid if it is, because I  
did not see it when I read the docs (multiple times) for rdiff-backup.


~ Daniel





Re[6]: [rdiff-backup-users] Verify times increasing

2009-11-24 Thread listserv . traffic
See inline...

 I'm not sure what you're doing with your --verify...

 It *sounds* like you want a full CRC style check of the *current*
 files after the backup is complete. (i.e. File X gets updated with a
 delta, and you want to verify that file X is the same both on the
 source and destination locations/drives.)

 Yes, although it's more of an internal consistency check within the
 rdiff-backup repository itself. I'm looking for a way to quickly
 verify the integrity of my entire rdiff-backup repository.

 In my scenario the repository is synced to an external USB drive that
 gets rotated each day (i.e. each day I put yesterday's drive in
 storage and bring a different drive out of storage to use for the next
 backup). I use rsync to transfer my rdiff-backup repository (which
 gets updated daily) to the USB drive. Then I run rdiff-backup
 --verify-at-time to verify that the files on the USB drive are not
 corrupt. But lately this has been taking too long.

 Does that make sense?

Yes, and the USB connection may explain the longish verify times,
since it's somewhat slow, compared to a SATA drive connected directly
to the controller...

But I see that you want to verify the local RDiff repository against
the off-line one.

Not sure how to do that - I'd guess you could do it with some other
tools - not storing the hashes, just a full compare each time. (How
big is the repository? [I think you said, but I don't recall.])

---
But I'd guess your local repository isn't on the same disks as the
data, right?

If so, then it's probably not a huge deal if it takes 20 hours to
check the local repository against the remote. [Though I guess all
that disk channel activity might impact other disk throughput too...]

(Add a controller? Dunno...)

I use a similar system, and I don't verify the local repository against
the remote, though perhaps I should. (I trust rsync to make sure
they're the same... since it's not just copying the files - it's doing
hash matches like RDiff...)

 Yesterday I introduced a utility called yafic into my backup scheme.
 Yafic can do a full-repository verification. This works, and it's much
 faster than rdiff-backup's --verify-at-time, but it's complicated to
 set up, and I have to ignore all the changes that happen each day when
 rdiff-backup updates the repository. It would be nicer to have this
 kind of verification built into rdiff-backup so I wouldn't have to
 filter out all the new delta and metadata files; rdiff-backup knows
 which files were added/changed/deleted and would not report those
 changes like yafic does. With my proposed enhancement, rdiff-backup
 would only report warnings or errors if any part of the repository
 became corrupt.

 If that's the case, I think you already get it. (It's built-in to
 RDiff-backup.)

 This is good to know. Does this happen during the backup, or only  
 during --verify... ? I assume you're talking about something  
 equivalent to --verify-at-time 0B ? Of course, this would only verify
 the current mirror.

Well, it computes a hash for the whole file at the local and remote
ends, and if they don't match, it then sends the deltas to make them
match.

What you end up with in the end is a current file on the remote whose
hash matches the local file - by definition. I don't know if it checks
the local hash vs. the remote file again [i.e. a recalc of the remote
file hash] at the end to make sure they match, but that would be a
redundant step - though probably a good one.

I don't think it's documented per-se, but it's inherent in the protocol.

[But that's not really what you are aiming for in your check
above...so it sorta doesn't apply...]

BTW, is this on a Windows platform? (Curious...) Ah, probably not,
since yafic isn't... :)

-Greg

 BTW, is this documented? I'm going to feel stupid if it is, because I
 did not see it when I read the docs (multiple times) for rdiff-backup.

 ~ Daniel



-- 
Best regards,
 listserv  mailto:listserv.traf...@sloop.net





Re: Re[4]: [rdiff-backup-users] Verify times increasing

2009-11-24 Thread Alex Samad
On Tue, Nov 24, 2009 at 04:12:16PM -0500, Daniel Miller wrote:
 On Nov 24, 2009, at 3:44 PM, listserv.traf...@sloop.net wrote:

[snip]

 In my scenario the repository is synced to an external USB drive
 that gets rotated each day (i.e. each day I put yesterday's drive in
 storage and bring a different drive out of storage to use for the
 next backup). I use rsync to transfer my rdiff-backup repository
 (which gets updated daily) to the USB drive. Then I run rdiff-backup
 --verify-at-time to verify that the files on the USB drive are not
 corrupt. But lately this has been taking too long.

Wouldn't the fact that you are reading lots of information from the USB
drive be slowing you down? Why not run your --verify-at-time on the
local disk repo, and then when rsyncing from local to USB use the -c
option to let rsync checksum the files. But you are still going to run
into the slowness of USB drives.
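What rsync's -c option buys here is a whole-file checksum comparison instead of the default size/mtime check. A rough Python equivalent of that comparison between the local repo and the USB copy (a sketch only; rsync's real protocol also does block-level matching and transfers only the differences):

```python
import hashlib
import os

def file_digest(path):
    """Whole-file digest; MD5 as in recent rsync versions. Cryptographic
    strength is irrelevant here - we only detect accidental damage."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def differing_files(src_root, dst_root):
    """Return relative paths whose content differs (or is missing)
    between the source tree and the destination tree. Extra files that
    exist only on the destination are ignored in this sketch."""
    diffs = []
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.exists(dst) or file_digest(src) != file_digest(dst):
                diffs.append(rel)
    return sorted(diffs)
```

As Alex notes, this still reads every byte from the USB drive, so the bus speed remains the bottleneck.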



 
 Does that make sense?
 
[snip]

 
 BTW, is this documented? I'm going to feel stupid if it is, because
 I did not see it when I read the docs (multiple times) for
 rdiff-backup.
 
 ~ Daniel
 
 
 
 

-- 
You know, it's hard work to try to love her as best as I can, knowing full 
well that the decision I made caused her loved one to be in harm's way.

- George W. Bush
09/30/2004
first presidential debate, Coral Gables, Fla.



Re: [rdiff-backup-users] Verify times increasing

2009-11-23 Thread Daniel Miller
Ideology: I do the large verify every day on the remote system to make
sure my backup history is not becoming corrupt (e.g. due to disk
failure, etc.). Ideally I would like to verify the past year, but that
will obviously take way too long to be possible with my setup.

Observations:
Despite reducing the amount of historical data that gets verified from
one year to three months,


I think you are misunderstanding how --verify works.  If you say:

rdiff-backup --verify-at-time 1Y

it does not verify the last 1 year's worth of backups.  It verifies a
single backup a year ago (I believe the closest backup before that
exact time); hence the name verify-at-time.


Yes, I do understand that. But to verify a one-year-old backup it must  
apply each set of differential data over that entire year to  
reconstruct the files as they existed one year ago. This effectively  
verifies that every backup between now and one year ago is also valid  
since any corruption in an increment younger than a year would show up  
in the reconstructed data for the one-year-old backup. The exception  
would be a file that got created within the past year. Such a file  
could be corrupt and I would not know it until it got to be a year  
old. This seems to be a weakness in the rdiff-backup verification  
method, but I do not know of a practical way to get around it.



As the manual says, "Check all the data in the repository *at* the
given time" (emphasis added). That explains why you are not seeing the
trend you expect. It also means you're getting less verification than
you thought.


No and no.

I'll address your second assertion first: that I'm getting less
verification than I thought. I maintain that it is effectively
verifying the integrity of every backup increment between now and the
point in time that I verify, since it uses each of those increments to
construct the point in time that I'm verifying. Please explain how my
understanding is wrong.


Now back to your first assertion about why I am not seeing the trend I  
expect. Since I'm verifying a younger backup (only three months old  
rather than a year) it has less diff data to apply to construct that  
point in time and therefore it should take less time. But I have not  
observed a decrease. Instead the time to verify a three-month-old  
backup seems to be increasing at a constant rate.


Here's something I did not mention before: over the weekend (when  
there is enough time) I still do the --verify-at-time 1Y.  
Interestingly this verification takes about the same amount of time  
(sometimes less) than the --verify-at-time 3M which is done during the  
week. That doesn't make any sense at all to me.


Why is the time needed to verify a three-month-old backup not leveling  
off? And is there a way to bring down my verification times but still  
be sure that my backup archives are not becoming corrupt due to  
decaying storage media, etc? Is there some other method of  
verification that I could use, perhaps not even related to rdiff-backup?


~ Daniel





Re: [rdiff-backup-users] Verify times increasing

2009-11-23 Thread Alex Samad
On Mon, Nov 23, 2009 at 09:06:42AM -0500, Daniel Miller wrote:
 
 Why is the time needed to verify a three-month-old backup not
 leveling off? And is there a way to bring down my verification times
 but still be sure that my backup archives are not becoming corrupt
 due to decaying storage media, etc? Is there some other method of
 verification that I could use, perhaps not even related to
 rdiff-backup?

Hi

I was wondering if you could do a -l --list-increment-sizes on the
repo; not sure exactly what it is going to show, but it might shed
some light...

alex


 
 ~ Daniel
 
 
 



Re: [rdiff-backup-users] Verify times increasing

2009-11-23 Thread Matthew Flaschen
Daniel Miller wrote:
 I think you are misunderstanding how --verify works.  If you say:

 rdiff-backup --verify-at-time 1Y

 it does not verify the last 1 years worth of backups.  It verifies a
 single backup a year ago (I believe the closest backup before that exact
 time); hence the name verify-at-time.
 
 Yes, I do understand that. But to verify a one-year-old backup it must
 apply each set of differential data over that entire year to reconstruct
 the files as they existed one year ago.

That is incorrect.  "Every 10 incremental diffs, rdiff-backup stores
another snapshot of the file. [...] During the restore, rdiff-backup
finds the oldest snapshot at least as recent as the desired backup time
(it could be the current mirror, or one of these snapshots)."
(http://www.mail-archive.com/rdiff-backup-users@nongnu.org/msg03884.html)
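The arithmetic behind this estimate can be modeled in a few lines. (Note Matthew's own follow-up in this thread correcting the quoted claim - only meta-data files actually get these periodic snapshots - so the sketch below just illustrates the claim as stated, with hypothetical names:)

```python
def diffs_to_apply(target_index, snapshot_interval=10):
    """Number of incremental diffs processed to rebuild the backup at
    position target_index (0 = current mirror, counting back in time),
    assuming a full snapshot exists every `snapshot_interval` increments.
    The restore starts from the nearest snapshot at least as recent as
    the target and applies only the diffs between them."""
    nearest_snapshot = (target_index // snapshot_interval) * snapshot_interval
    return target_index - nearest_snapshot
```

Averaged over one interval this gives (0 + 1 + ... + 9) / 10 = 4.5 diffs, which matches the roughly-five-rdiffs figure Matthew cites below.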

 I'll address your second assertion first: I'm getting less verification
 than I thought. I maintain that it is effectively verifying the
 integrity of every backup increment between now and the point in time
 that I verify since it uses each of those increments to construct the
 point in time that I'm verifying. Please explain how my understanding is
 wrong.

This follows from what I noted above.  To do a verify at 1Y ago, it will
 (on average) only process 5 rdiffs.  It may process 10 or 0, but that's
the average.  You are only verifying the backups immediately after the
1Y ago mark.

Matthew Flaschen




Re: [rdiff-backup-users] Verify times increasing

2009-11-23 Thread Matthew Flaschen
Matthew Flaschen wrote:
 That is incorrect.  Every 10 incremental diffs, rdiff-backup stores
 another snapshot of the file. [...] During the restore, rdiff-backup
 finds the oldest snapshot at least as recent as the desired backup time
 (it could be the current mirror, or one of these snapshots).
 (http://www.mail-archive.com/rdiff-backup-users@nongnu.org/msg03884.html).

Someone reminded me off list that Andrew's statement in that email is
incorrect.  Only meta-data files get a snapshot every 10 diffs.

I apologize for adding to the confusion.

Matt Flaschen




Re[2]: [rdiff-backup-users] Verify times increasing

2009-11-23 Thread listserv . traffic
I know Matt corrected this post, but I wanted to address this:

---
If you do a --verify-at-time xyz where xyz is your oldest backup, it
should verify all files in that backup - so every delta should be
applied. This should verify that all deltas (backups) are good and
functioning.

[In short, for each file that returns a successful verify, it confirms
that the most current file, all applicable deltas, and the meta-data
are good and functioning properly.]

---
However, if files were added after the initial backup, I'd guess that
a verify won't check the deltas for those files - since they don't
exist in the set at time xyz.

So, while a verify of your oldest backup is good, it's not
comprehensive for all files that have deltas and meta-data.

---
I'm not aware, so if I'm wrong perhaps someone could correct me, but
I'd like a command to, in essence, do a comprehensive
--verify-all-files-in-the-archive. [I'm pretty sure such a thing
doesn't exist, at least I never saw it in the docs.]

This would apply all deltas to *all* files (back to the oldest copy)
and compare the stored hashes at the time of backup to the rebuilt
file. [Note all the files, not just those in a particular target
date/delta.]

This wouldn't verify that every file would be correct in every delta
version, but it would, I think, get as close as one might come to
that.

Then again, doing an intermediate check of the hash and file at each
delta point wouldn't take too much longer [or so I think without a
lot of time invested in pondering it] - so if this option/feature doesn't
exist and one were to code it, it might not be much more code or
difficulty...

Thoughts?

-Greg



 Daniel Miller wrote:
 I think you are misunderstanding how --verify works.  If you say:

 rdiff-backup --verify-at-time 1Y

 it does not verify the last 1 years worth of backups.  It verifies a
 single backup a year ago (I believe the closest backup before that exact
 time); hence the name verify-at-time.
 
 Yes, I do understand that. But to verify a one-year-old backup it must
 apply each set of differential data over that entire year to reconstruct
 the files as they existed one year ago.

 That is incorrect.  Every 10 incremental diffs, rdiff-backup stores
 another snapshot of the file. [...] During the restore, rdiff-backup
 finds the oldest snapshot at least as recent as the desired backup time
 (it could be the current mirror, or one of these snapshots).
 (http://www.mail-archive.com/rdiff-backup-users@nongnu.org/msg03884.html).

 I'll address your second assertion first: I'm getting less verification
 than I thought. I maintain that it is effectively verifying the
 integrity of every backup increment between now and the point in time
 that I verify since it uses each of those increments to construct the
 point in time that I'm verifying. Please explain how my understanding is
 wrong.

 This follows from what I noted above.  To do a verify at 1Y ago, it will
  (on average) only process 5 rdiffs.  It may process 10 or 0, but that's
 the average.  You are only verifying the backups immediately after the
 1Y ago mark.

 Matthew Flaschen








Re: [rdiff-backup-users] Verify times increasing

2009-11-20 Thread Matthew Flaschen
Daniel Miller wrote:
 Ideology: I do the large verify every day on the remote system to make
 sure my backup history is not becoming corrupt (e.g. due to disk
 failure, etc.). Ideally I would like to verify the past year, but that
 will obviously take way too long to be possible with my setup.
 
 Observations:
 Despite reducing the amount of historical data that gets verified from
 one year to three months,

I think you are misunderstanding how --verify works.  If you say:

rdiff-backup --verify-at-time 1Y

it does not verify the last 1 year's worth of backups.  It verifies a
single backup a year ago (I believe the closest backup before that
exact time); hence the name verify-at-time.  As the manual says, "Check
all the data in the repository *at* the given time" (emphasis added).
That explains why you are not seeing the trend you expect.  It also
means you're getting less verification than you thought.

Matthew Flaschen

