On Jun 28, 2015, at 7:28 PM, Aaron Poffenberger <a...@hypernote.com> wrote:
>> IMO, you're over thinking it.
>> 
>> Step 1) GET THE DATA OFF THE FAILING DRIVES.  Doing *anything* before
>> that's done means you *want* to lose data.
>> 
>> Step 2) okay, *now* that the data is safe, compare files between trees
>> and delete duplicates
>> 
>> Note that trying to dedup as it's copied will probably *increase* the
>> number of times the data has to be read and thus increase the chance
>> of lost data.
>> 
>> 
>> Philip Guenther
>> 
> 
> Agreed. Save your data first then merge.
> 
> rsync (pkgs) will help you with both steps:

+1 on the "save it first" option.  But I disagree with rsync.  Ideally, you 
want one read per block, and that's it.

I've used dd_rescue (a modified dd that a) doesn't die on read failures, and b) 
uses a dual-blocksize option to try and recover as much data as possible) in 
the past to make image copies of drives.  I had one drive that would read for 
some period of time, heat up, then error.  I'd take the drive outside, let it 
cool down, read some of it, then rinse and repeat til I got the entire drive.

I tend to prefer image captures of failing drives, and keep the seeking and 
reading to a minimum.  You can always mount the image and pull files out of the 
filesystem.
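
To make the imaging idea concrete, here is a rough Python sketch of a
dual-blocksize copy. It is not how dd_rescue is implemented, only an
illustration of the approach described above: read large blocks, and when one
fails, retry that region in small blocks and zero-fill whatever still can't be
read. The function name, the default block sizes, and the zero-fill choice are
all assumptions made for illustration. Note that a failed region does get read
a second time at the small block size.

```python
import os


def image_copy(src_path, dst_path, big=65536, small=512):
    """Copy src to dst front to back, zero-filling unreadable small blocks.

    Returns the list of offsets that could not be read.
    """
    bad_offsets = []
    fd = os.open(src_path, os.O_RDONLY)
    try:
        with open(dst_path, "wb") as dst:
            offset = 0
            step = big
            while True:
                try:
                    chunk = os.pread(fd, step, offset)
                except OSError:
                    if step == big:
                        # Large read failed: retry the same offset in small blocks.
                        step = small
                        continue
                    # Small block is unreadable: zero-fill it and move on.
                    chunk = b"\0" * small
                    bad_offsets.append(offset)
                else:
                    if not chunk:
                        break  # end of device or file
                dst.write(chunk)
                offset += len(chunk)
                # Back on a large-block boundary: switch to large reads again.
                if step == small and offset % big == 0:
                    step = big
    finally:
        os.close(fd)
    return bad_offsets
```

The returned offsets tell you which parts of the image are holes, which is
worth keeping next to the image file. A real tool also handles retries, logging
and resuming after the drive is unplugged; this sketch does none of that.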

I've also used R-Studio for recovering files from filesystem images.  It works 
pretty well (though I have no idea if they support UFS).

I've also done things like:

* Make an image
* Huh, drive seems to still be working, use tar or whatever.
* Stare at drive and finally throw it out.

Drives aren't worth trying to "salvage", in my opinion.

As for having N copies of files: You're just going to have to bite the bullet 
on that.  You have the following problems:

* Duplicate filenames, different data (think a file named "foo", one of which is 
an image, one is text)
* Duplicate filenames, delta data (versions of files, primarily)
* Renamed files.

I've gotten fairly good over the years at doing these n-way merges (using tools 
like meld, the GNU diff -r option, etc).
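
As a starting point for sorting two recovered trees into the cases listed
above, here is a minimal Python sketch that compares them by relative path and
content hash. The function names and the report categories are made up for
illustration. It can't tell "different data" from "delta data" (both show up as
"conflict"); that still needs diff or a human. Run it against copies, never
against the failing drive.

```python
import hashlib
import os


def hash_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(full, "rb") as f:
                # Read in 1 MiB pieces so large files don't fill memory.
                for block in iter(lambda: f.read(1 << 20), b""):
                    h.update(block)
            digests[os.path.relpath(full, root)] = h.hexdigest()
    return digests


def classify(tree_a, tree_b):
    """Sort the files of tree_a into categories relative to tree_b."""
    a, b = hash_tree(tree_a), hash_tree(tree_b)
    paths_by_hash_b = {}
    for path, digest in b.items():
        paths_by_hash_b.setdefault(digest, []).append(path)

    report = {"identical": [], "conflict": [], "renamed": [], "only_a": []}
    for path, digest in sorted(a.items()):
        if path in b:
            # Same name in both trees: same bytes, or a conflict to inspect.
            kind = "identical" if b[path] == digest else "conflict"
            report[kind].append(path)
        elif digest in paths_by_hash_b:
            # Same bytes under a different name in tree_b.
            report["renamed"].append((path, sorted(paths_by_hash_b[digest])))
        else:
            report["only_a"].append(path)
    return report
```

Only the "identical" entries are safe to drop without looking; everything in
"conflict" should go through diff or a merge tool first.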

My only real advice beyond "back it up first" is: DO NOT use the backup as your 
working copy.  You *will* cry when you realize you just nuked the wrong file -- 
and the HD was dying, and you can't get it back.
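
One way to enforce that, sketched in Python under the assumption that the
backup is an ordinary directory tree: copy it to a separate working directory,
then strip the write bits from the backup. The function name and directory
arguments are hypothetical, and read-only permissions won't stop root, so treat
this as a seat belt rather than a guarantee.

```python
import os
import shutil
import stat


def make_working_copy(backup_dir, work_dir):
    """Copy backup_dir to work_dir, then mark the backup read-only."""
    # Copy first, while the backup still has its original permissions.
    # copytree raises an error if work_dir already exists.
    shutil.copytree(backup_dir, work_dir)

    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    for dirpath, _dirs, files in os.walk(backup_dir):
        for path in [os.path.join(dirpath, n) for n in files] + [dirpath]:
            if os.path.islink(path):
                continue  # chmod would follow the link; leave links alone
            mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, mode & ~write_bits)
```

Do all the merging and deleting in the working directory, and leave the backup
alone until you're sure you're done.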

Good luck.
