On 25/06/2016 3:50 AM, Austin S. Hemmelgarn wrote:
> On 2016-06-24 13:43, Steven Haigh wrote:
>> On 25/06/16 03:40, Austin S. Hemmelgarn wrote:
>>> On 2016-06-24 13:05, Steven Haigh wrote:
>>>> On 25/06/16 02:59, ronnie sahlberg wrote:
>>>> What I have in mind here is that a file seems to get CREATED when I
>>>> copy the file that crashes the system into the target directory. I'm
>>>> thinking that if I 'cp -an source/ target/' it will make this
>>>> somewhat easier (it won't overwrite the zero-byte file).
>>> You may want to try rsync (rsync -vahogSHAXOP should get just about
>>> everything possible out of the filesystem except for some security
>>> attributes, stuff like SELinux context, and will give you nice
>>> information about progress as well).  It will keep running in the face
>>> of individual read errors, and will only try each file once.  It also
>>> has the advantage of showing you the transfer rate and exactly where in
>>> the directory structure you are, and it handles partial copies sanely
>>> too (restarting an interrupted rsync transfer is more reliable than
>>> restarting an interrupted cp).
>>
>> I may try that - I came up with this:
>> #!/bin/bash
>>
>> # Mount the damaged array read-only and degraded so nothing is written to it.
>> mount -o ro,nossd,degraded /dev/xvdc /mnt/fileshare/
>>
>> find /mnt/fileshare/data/Photos/ -type f -print0 |
>>     while IFS= read -r -d $'\0' line; do
>>         echo "Processing $line"
>>         DIR=$(dirname "$line")
>>         mkdir -p "/mnt/recover/$DIR"
>>         if [ ! -e "/mnt/recover/$line" ]; then
>>                 echo "Copying $line to /mnt/recover/$line"
>>                 # Create the target and sync it first, so if the copy
>>                 # crashes the box this file gets skipped on the next run.
>>                 touch "/mnt/recover/$line"
>>                 sync
>>                 cp -f "$line" "/mnt/recover/$line"
>>                 sync
>>         fi
>>     done
>>
>> umount /mnt/fileshare
>>
>> I'm slowly picking through the data - and it has crashed a few times...
>> It seems that some checksum failures don't crash the entire system -
>> which is good to know - though I'm not sure whether that means it is
>> correcting the data from parity, or something else.
>>
>> I'll see how much data I can extract with this and go from there - as it
>> may be good enough to call it a success.
>>
> Ah, if you're having issues with crashes when you hit errors, you may
> want to avoid rsync then; it will try to re-read any files that don't
> match in size and mtime, so it would likely just keep crashing on the
> same file over and over again.
> 
> Also, looking at the script you've got, it will probably run faster
> too, because it shouldn't need to call stat() on everything the way
> rsync does (for the size and mtime comparison).
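
For the archives, the rsync invocation from Austin's earlier suggestion
would be roughly this with my mount points filled in - untested on this
array, and as he says it would keep re-reading whichever file crashes
the box:

rsync -vahogSHAXOP /mnt/fileshare/data/ /mnt/recover/data/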

Well, as a data point, the data is slowly coming off the RAID6 array.
Some stuff is just dead and crashes the entire host whenever you try to
access it. At the moment, my average uptime is about 2-3 minutes...

I've added my recovery rsync script to /etc/rc.local - and I'm just
destroying and restarting the VM every time it crashes.
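
The rc.local hook is nothing fancy - roughly the line below, with the
script and log paths being purely illustrative:

# Kick off the recovery pass in the background on every boot, so the
# rest of boot isn't held up when the copy crashes the kernel again.
/root/recover-photos.sh >> /var/log/recover.log 2>&1 &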

I'm also rsync'ing the data from that system out to other areas of
storage so I can pull off as much data as possible (I don't have a spare
4.4TB to use).
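
That side is just a plain rsync push, something along these lines (the
destination host and path are placeholders):

rsync -avhP /mnt/recover/ otherbox:/srv/salvage/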

I lost a total of 5 photos out of 83GB worth - which is good. My music
collection doesn't seem to have been so lucky - which means lots of time
ripping CDs in the future :P

I haven't tried the applications / ISOs directory yet - but we'll see
how that goes when I get there...

The photos were the main thing I was concerned about; the rest is just
nice to have.

Interesting, though, that EVERY crash references:
        kernel BUG at fs/btrfs/extent_io.c:2401!


-- 
Steven Haigh

Email: net...@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897
