Re: cp from 4 different home folders without overwriting files with different content

2015-06-29 Thread Chris Bennett
Thanks for all the advice so far.
I never thought of working hard to keep the drives cool rather than hot.
And using dd to avoid seeking is also a very good idea.

tar is a disaster, however, since it cannot handle paths that are too
long, even though the names are fine on the file system.

I used dump instead.

I was smart, I made copies on two different drives. Gee, no new drive
can ever fail! Click of death, click of death. :(

Just in case anyone wants to know how I got into this multiple copies
problem, I also had other stuff on other partitions, that was also a
problem. I now do versioning on my programs, but I now see I should use
git or another system to keep track of work in each country.
Version 1.0.1 done in USA and version 1.0.1 in Guatemala are not the
same! Duh.

I have computers in the USA and Guatemala. I bus or cab to Mexico, fly
to San Antonio and bus to Austin. I go both ways every month.
I drag a few drives in my carry-on, but usually at least one stays put
in both countries. So that's how this big mess got created.

More ideas are welcome. I am learning a lot from these emails.

Thanks,

Chris Bennett



cp from 4 different home folders without overwriting files with different content

2015-06-28 Thread Chris Bennett
I had 4 different hard drives that were failing.
I bought a 2TB usb drive to back up all the home folders.

I now would like to cp all of the folders and files to another empty
partition.

But I don't want to overwrite any files with same name but different
content.

For example:

/homeX/index.html to /homePerfect
/homeY/index.html to /homePerfect

both have same name but different contents.

I googled but couldn't find any solutions.
Ideally I would like a list of failed file copies.

Any ideas or scripts or ports?
Browsing through 4 home folders is a nightmare.

Chris Bennett



Re: cp from 4 different home folders without overwriting files with different content

2015-06-28 Thread Steven McDonald
On Sun, 28 Jun 2015 17:39:18 -0500
Chris Bennett chrisbenn...@bennettconstruction.us wrote:

> But I don't want to overwrite any files with same name but different
> content.

You could try GNU cp (gcp in the coreutils package) with the -n option:

   -n, --no-clobber
  do not overwrite an existing file (overrides a previous -i
  option)



Re: cp from 4 different home folders without overwriting files with different content

2015-06-28 Thread nerv
If you can't find a switch for cp you may have an easier time using
rsync, but I'm not too familiar with it so I couldn't tell you
what switches to use (it may be able to do natively what you're asking,
however).
Writing a script for it using cp should be quite easy: for each of the
partitions, have the script recursively go into all folders and copy the
files after checking whether the name already exists in the target
partition. If it does, compare checksums:
same checksum: do nothing and go to the next file;
different checksum: copy and append a number to its name (or append
a name for the source partition).
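
As a rough illustration of the script described above, here is a minimal
sketch in POSIX sh. The function name merge_tree and its tag argument are
made up for this example. It uses cmp for a byte-for-byte comparison
instead of checksums, which gives the same answer without needing a
hashing tool. It assumes file names contain no newlines and that each
source tree gets its own tag.

```shell
#!/bin/sh
# merge_tree SRC DEST TAG  (hypothetical helper, not an existing tool)
# Copy every regular file under SRC into DEST without overwriting:
#   - name not present in DEST: copy it
#   - name present, same content: skip it
#   - name present, different content: copy as NAME.TAG and report it
merge_tree() {
    src=$1
    dest=$2
    tag=$3
    ( cd "$src" && find . -type f -print ) | while IFS= read -r rel; do
        rel=${rel#./}
        target=$dest/$rel
        mkdir -p "$(dirname "$target")"
        if [ ! -e "$target" ]; then
            cp -p "$src/$rel" "$target"
        elif cmp -s "$src/$rel" "$target"; then
            :   # identical content, nothing to do
        else
            cp -p "$src/$rel" "$target.$tag"
            printf 'CONFLICT %s kept as %s\n' "$src/$rel" "$target.$tag"
        fi
    done
}
```

Called once per source tree, for example `merge_tree /homeX /homePerfect homeX`
followed by `merge_tree /homeY /homePerfect homeY`, the CONFLICT lines on
stdout give the list of same-name, different-content files.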

Hope this helps

--

Goto Daichi (nerv) n...@fastmail.fm



Re: cp from 4 different home folders without overwriting files with different content

2015-06-28 Thread Chris Bennett
On Mon, Jun 29, 2015 at 12:56:04AM +0200, nerv wrote:
> If you can't find a switch for cp you may have an easier time using
> rsync, but I'm not too familiar with it so I couldn't tell you
> what switches to use (It may be able to natively do what you're asking
> however).
> Writing a script for it using cp should be quite easy, for each of the
> partition have the script recursively go into all folders and copy the
> files after verifying if the name already exists in the target
> partition. If it does, compare checksums,
> same checksum : do nothing and go to the next file,
> different checksum : copy and append a number to its name (or append to
> it a name for the source partition).

I looked at rsync and cp and gnu cp.
noclobber just won't do what I want.

Using checksums seems like a good part of the answer, but name changing
would be very complicated. I have everything read-only except for
regular /home, /var, / and /tmp. I do some of my programming in /home
folder and I also have many html files. I already wrote software to
change file contents to new values, but that adds even more
complications for both of those areas.

And I want to do this for 4 home folders!!???

Chris Bennett



Re: cp from 4 different home folders without overwriting files with different content

2015-06-28 Thread Philip Guenther
On Sun, Jun 28, 2015 at 4:27 PM, Chris Bennett
chrisbenn...@bennettconstruction.us wrote:
> I looked at rsync and cp and gnu cp.
> noclobber just won't do what I want.
>
> Using checksums seems like a good part of the answer, but name changing
> would be very complicated. I have everything read-only except for
> regular /home, /var, / and /tmp. I do some of my programming in /home
> folder and I also have many html files. I already wrote software to
> change file contents to new values, but that adds even more
> complications for both of those areas.
>
> And I want to do this for 4 home folders!!???

IMO, you're over thinking it.

Step 1) GET THE DATA OFF THE FAILING DRIVES.  Doing *anything* before
that's done means you *want* to lose data.

Step 2) okay, *now* that the data is safe, compare files between trees
and delete duplicates

Note that trying to dedup as it's copied will probably *increase* the
number of times the data has to be read and thus increase the chance
of lost data.
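
Once everything is safely copied, step 2 can start by listing which files
are exact duplicates between two of the rescued trees. The following is
only a sketch: the function name list_identical is invented for this
example, and it assumes file names without newlines. It prints the paths
and deletes nothing.

```shell
#!/bin/sh
# list_identical A B  (hypothetical helper)
# Print the paths, relative to A, of files that exist in both trees
# with byte-for-byte identical content.
list_identical() {
    a=$1
    b=$2
    ( cd "$a" && find . -type f -print ) | while IFS= read -r rel; do
        if [ -f "$b/$rel" ] && cmp -s "$a/$rel" "$b/$rel"; then
            printf '%s\n' "${rel#./}"
        fi
    done
}
```

Reviewing the printed list before removing anything keeps the delete step
a deliberate, separate action.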


Philip Guenther



Re: cp from 4 different home folders without overwriting files with different content

2015-06-28 Thread Aaron Poffenberger
 On Jun 28, 2015, at 18:35, Philip Guenther guent...@gmail.com wrote:
 
> IMO, you're over thinking it.
>
> Step 1) GET THE DATA OFF THE FAILING DRIVES.  Doing *anything* before
> that's done means you *want* to lose data.
>
> Step 2) okay, *now* that the data is safe, compare files between trees
> and delete duplicates
>
> Note that trying to dedup as it's copied will probably *increase* the
> number of times the data has to be read and thus increase the chance
> of lost data.
>
>
> Philip Guenther
 

Agreed. Save your data first then merge.

rsync (pkgs) will help you with both steps:

For initial save:
# -a preserves dates, time and permissions
# -H preserves hard links - can be memory intensive
# -v if you want to see each file by name
# --progress to see name + ETA
rsync -aH /mnt/failing_w/homes/ /mnt/2tb/w/
rsync -aH /mnt/failing_x/homes/ /mnt/2tb/x/
rsync -aH /mnt/failing_y/homes/ /mnt/2tb/y/
rsync -aH /mnt/failing_z/homes/ /mnt/2tb/z/

Merging:
rsync-ing with the --backup and --suffix options will back up
existing files in the same directory before copying the changed
versions over them.

Following is an example. I recommend reading the rsync man
page to understand the options first.

# disk w archive
rsync -aH /mnt/2tb/w/ /mnt/2tb/merged/

# disk x archive
# -b == --backup
#
# -c == --checksum
#
# set a backup suffix that means something to you and change
# it for each drive
rsync -aHcb --suffix=_x_sync.bak /mnt/2tb/x/ /mnt/2tb/merged/

Repeat, changing the disk and the backup suffix each time.
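
The repetition can be put in a small loop. This is a sketch only: the
function name merge_archives and the RSYNC override variable are invented
for this example, and the /mnt/2tb layout is the one assumed in the
commands above. In rsync the backup suffix is set with --suffix. Note that
--backup renames the file already sitting in merged/, so a file ending in
_x_sync.bak is the version that was displaced when disk x was merged; the
version from disk x keeps the plain name.

```shell
#!/bin/sh
# Set RSYNC=echo to print the arguments instead of running rsync.
RSYNC=${RSYNC:-rsync}

# merge_archives BASE DISK...  (hypothetical helper)
# Merge BASE/DISK/ into BASE/merged/ for each DISK in turn, keeping any
# replaced file under a suffix that names the disk being merged.
merge_archives() {
    base=$1
    shift
    for disk in "$@"; do
        "$RSYNC" -aHcb --suffix="_${disk}_sync.bak" \
            "$base/$disk/" "$base/merged/"
    done
}
```

With the first disk already copied into merged/, the remaining ones would
be handled with `merge_archives /mnt/2tb x y z`.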

Another option is to just use --dry-run to see the differences.

rsync -aH /mnt/2tb/w/ /mnt/2tb/merged/

rsync -aHcv /mnt/2tb/x/ /mnt/2tb/merged/ --dry-run

Using --dry-run alone shows what has changed or been added.
Add --delete to see what doesn't exist as well.

rsync -aHcv /mnt/2tb/x/ /mnt/2tb/merged/ --dry-run --delete

--Aaron



Re: cp from 4 different home folders without overwriting files with different content

2015-06-28 Thread Benny Lofgren
On 2015-06-29 00:39, Chris Bennett wrote:
> I had 4 different hard drives that were failing.
> I bought a 2TB usb drive to back up all the home folders.
> I now would like to cp all of the folders and files to another empty
> partition.
> But I don't want to overwrite any files with same name but different
> content.
> For example:
>
> /homeX/index.html to /homePerfect
> /homeY/index.html to /homePerfect
>
> both have same name but different contents.
>
> I googled but couldn't find any solutions.
> Ideally I would like a list of failed file copies.
>
> Any ideas or scripts or ports?
> Browsing through 4 home folders is a nightmare.

Definitely listen to Philip Guenther's advice. Get as much data off the
drives as you can first! That is your number one priority.


Make a directory on your new drive for each of the old and just throw
the contents in there:


# mount -o ro /dev/old0 /mnt   # <-- ALWAYS mount failing disks read-only!

# mkdir /2TB/disk1   # One destination dir per disk!

If you have rsync installed, use that:

# time rsync -aHv /mnt/ /2TB/disk1 2>/tmp/rsyncerr.log

I often time longer runs. That way if I leave the computer, when I
come back it gives me a quick indication of whether the command ran
right or not. If a job that's expected to take several hours ends in a
few seconds, then something's clearly gone wrong.

Also, capture stderr to a log file. You want to see which files, if any,
fail to copy due to disk errors.

You probably also want to glance periodically at /var/log/messages for
signs of disk read errors. Maybe tail -f it in another window.


If you don't have rsync, use cpio (cp -R doesn't preserve hard links):

# cd /mnt
# find . -print | cpio -pdmv /2TB/disk1 2>/tmp/cpioerr.log
# if [ $? -ne 0 ]
> then
>   echo 'At least one error occurred, check /tmp/cpioerr.log!'
> fi
# cd; umount /mnt   # <-- cd away from the disk or umount fails

If you do use cpio/tar/pax, pay attention to max file sizes. If you have
files larger than 4/8 GB (depending on archive format), you're shit out
of luck and need to install something like rsync.

And, last but not least, don't forget to copy ALL partitions on each
drive, if there is more than one. :-)


Rinse, lather, repeat until all disks are processed.


Then you can also use rsync to help you merge the contents to a common
directory, but as I write this I noticed that someone else just
commented with a nice instruction about that. :-)


More tips:

* Move the failing disks as little and as carefully as humanly possible.

* If they are in a USB enclosure, I recommend removing that and plugging
them directly into a SATA port in your computer, if possible. Three
reasons: safer error reporting (I think), faster transfer and different
controller/power supply, in case either of those is what is really failing.

* Always lay disks flat on their belly. Standing up on their long edge
is okay too (just don't tip them over!), but never ever run a spinning
hard drive upside down.

* Make sure they have good cooling. If you run failing disks hot they
will most likely fail faster and harder. In extreme cases I've even had
success reading unreadable disks by running them in a freezer. The
tiny amount of expansion/contraction of the drive heads and platters due
to heat/cold just maaay be enough to make a difference. But probably
not, so do this as a last-ditch effort.

* Related to the previous item, if you plug in a failing disk that is
cold, let it run for a few minutes before attempting to read from it. By
letting the drive reach its normal working temperature you give its read
heads the best chance possible to align themselves to what's already
written on the platters.

* If everything else fails, and you really need to rescue something from
a bad drive, try this software:

  https://www.grc.com/

I know I know, the web site looks like a joke, a relic left over from
the 90's (and it probably is), but the software, SpinRite, really can
wring every last possible bit out of a bad sector. It's $89, but well
worth the money to save the family photos...

* If the file system structure is damaged and you need to run fsck
before reading from the drive, DON'T. Instead, make an image copy (using
dd for example) to another drive of at least the same size, and then run
it from there. If the copy produces read errors, you might want to zero
out those blocks on the destination before trying an fsck.

* When you are absolutely certain you can't get more off of your drives,
DON'T KEEP THEM AROUND. Throw them out! In a year's time, you will have
forgotten that they are full of errors, and try to reuse them again... :-)
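
For the tip above about imaging a damaged drive with dd before running
fsck, a minimal sketch follows. The function name image_disk is invented
for this example, and the device and image paths in the usage note are
placeholders. conv=noerror,sync tells dd to carry on after a read error
and pad the unreadable block with zeros, which also covers the "zero out
those blocks" step. The trade-off is that a larger bs loses more data
around each bad spot, and the final block is padded, so the image can end
up slightly larger than the source.

```shell
#!/bin/sh
# image_disk SRC IMG  (hypothetical helper)
# Copy SRC to IMG block by block, continuing past read errors.
image_disk() {
    # noerror: do not stop on a read error
    # sync:    pad short or failed reads with zeros so offsets stay aligned
    dd if="$1" of="$2" bs=64k conv=noerror,sync
}
```

Usage would look like `image_disk /dev/rsd1c /2TB/disk1.img`, with the raw
device name adjusted to whatever the failing disk shows up as.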


Good luck.


Regards,
/Benny


-- 
internetlabbet.se / work:   +46 8 551 124 80  / Words must
Benny Lofgren/  mobile: +46 70 718 11 90 /   be weighed,
/   fax:+46 8 551 124 89/not counted.
   /email:  benny -at- internetlabbet.se



Re: cp from 4 different home folders without overwriting files with different content

2015-06-28 Thread Sean Kamath
On Jun 28, 2015, at 7:28 PM, Aaron Poffenberger a...@hypernote.com wrote:
>> IMO, you're over thinking it.
>>
>> Step 1) GET THE DATA OFF THE FAILING DRIVES.  Doing *anything* before
>> that's done means you *want* to lose data.
>>
>> Step 2) okay, *now* that the data is safe, compare files between trees
>> and delete duplicates
>>
>> Note that trying to dedup as it's copied will probably *increase* the
>> number of times the data has to be read and thus increase the chance
>> of lost data.
>>
>>
>> Philip Guenther
>
> Agreed. Save your data first then merge.
>
> rsync (pkgs) will help you with both steps:

+1 on the save it first option.  But I disagree with rsync.  Ideally, you 
want one read per block, and that's it.

I've used dd_rescue (a modified dd that a) doesn't die on read failures, and b) 
uses a dual-blocksize option to try and recover as much data as possible) in 
the past to make image copies of drives.  I had one drive that would read for 
some period of time, heat up, then error.  I'd take the drive outside, let it 
cool down, read some of it, then rinse and repeat til I got the entire drive.

I tend to prefer image captures of failing drives, and keep the seeking and 
reading to a minimum.  You can always mount the image and pull files out of the 
filesystem.

I've also used r-studio for recovering files from filesystem images.  Works 
pretty good (though I have no idea if they support ufs).

I've also done things like:

* Make an image
* Huh, drive seems to still be working, use tar or whatever.
* Stare at drive and finally throw it out.

Drives aren't worth trying to salvage, in my opinion.

As for having N copies of files: You're just going to have to bite the bullet 
on that.  You have the following problems:

* Duplicate filenames, different data (think a file named foo, one of which is 
an image, one is text)
* Duplicate filenames, delta data (versions of files, primarily)
* Renamed files.

I've gotten fairly good over the years at doing these n-way merges (using tools 
like meld, the gnu diff -r option, etc).
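
As a starting point for that kind of merge, a recursive diff gives a quick
inventory of what differs between two rescued trees. This is a sketch; the
function name compare_trees is invented for this example. The -q flag is
not in POSIX, but GNU diff and, as far as I know, the BSD diffs accept it.

```shell
#!/bin/sh
# compare_trees A B  (hypothetical helper)
# Print one line per file that differs or exists in only one tree.
# diff exits 1 when differences are found, which is expected here,
# so only an exit status above 1 (real trouble) is treated as failure.
compare_trees() {
    status=0
    diff -rq "$1" "$2" || status=$?
    [ "$status" -le 1 ]
}
```

The "differ" lines are the same-name, different-data cases from the list
above; the "Only in" lines are candidates for renamed or unique files.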

My only real advice beyond "back it up first" is: DO NOT use the backup as your 
working copy.  You *will* cry when you realize you just nuked the wrong file -- 
and the HD was dying, and you can't get it back.

Good luck.